AI for Game Theory: Computational Game Playing, Nash Equilibrium, and Multi-Agent Strategy

## TL;DR
AI systems have reached strong human-level or superhuman play in several strategic settings: perfect-information games (chess, shogi, Go: AlphaZero, 2017-2018), imperfect-information games (poker: Libratus, 2017; Pluribus, 2019), and two harder frontiers: very large imperfect-information games (DeepNash for Stratego) and games that add multi-agent negotiation in natural language (Cicero for Diplomacy). Each milestone has advanced computational game theory and multi-agent strategy.

## Core Explanation
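Game theory appears in game-playing AI in three broad settings:

1. Perfect-information games: all players see the full state (chess, Go, shogi). The strongest approach combines game-tree search (MCTS) with neural value and policy networks, as in AlphaZero. These games are mastered in the sense of superhuman play, not solved in the game-theoretic sense.
2. Imperfect-information games: players hold private information, such as poker hands. The main tool is counterfactual regret minimization (CFR) and its extensions: DeepStack pairs CFR-style solving with neural value estimates, while Pluribus combines a blueprint strategy computed with Monte Carlo CFR with real-time depth-limited search. The solution concept is Nash equilibrium: a strategy profile in which no player can gain by unilaterally changing strategy. In two-player zero-sum games, the average strategy produced by CFR converges to a Nash equilibrium.
3. General-sum multi-agent games: players' interests are partly aligned and partly in conflict (Diplomacy). Strong play requires negotiation, communication, and equilibrium reasoning. Nash equilibrium alone is a less reliable guide here, so other solution concepts such as correlated equilibrium and coarse correlated equilibrium become relevant.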
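To make the link between regret minimization and Nash equilibrium concrete, here is a minimal sketch of regret matching, the per-decision update that CFR builds on, run in self-play on a small zero-sum matrix game. It is not full CFR (there is no game tree or counterfactual weighting), and the payoff matrix `PAYOFF` and the function names are hypothetical choices made for illustration: rock-paper-scissors in which rock beating scissors is worth 2 instead of 1.

```python
"""Regret matching in self-play on a small zero-sum matrix game (illustrative)."""

# Hypothetical payoff matrix for the row player. Rows/columns: rock, paper, scissors.
# Rock beating scissors pays 2 so that the equilibrium is not simply uniform.
PAYOFF = [
    [0.0, -1.0, 2.0],
    [1.0, 0.0, -1.0],
    [-2.0, 1.0, 0.0],
]


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total <= 0.0:
        return [1.0 / len(regrets)] * len(regrets)  # no positive regret: play uniformly
    return [p / total for p in positive]


def solve_zero_sum(payoff, iterations=50_000):
    """Return the average (row, column) strategies after regret-matching self-play."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_regret, col_regret = [0.0] * n_rows, [0.0] * n_cols
    row_sum, col_sum = [0.0] * n_rows, [0.0] * n_cols

    for _ in range(iterations):
        x = strategy_from_regrets(row_regret)
        y = strategy_from_regrets(col_regret)

        # Expected payoff of every pure action against the opponent's current mix.
        row_values = [sum(payoff[i][j] * y[j] for j in range(n_cols)) for i in range(n_rows)]
        col_values = [-sum(payoff[i][j] * x[i] for i in range(n_rows)) for j in range(n_cols)]
        row_ev = sum(x[i] * row_values[i] for i in range(n_rows))
        col_ev = sum(y[j] * col_values[j] for j in range(n_cols))

        # Regret: how much better each pure action would have done than the mix played.
        for i in range(n_rows):
            row_regret[i] += row_values[i] - row_ev
            row_sum[i] += x[i]
        for j in range(n_cols):
            col_regret[j] += col_values[j] - col_ev
            col_sum[j] += y[j]

    return [s / iterations for s in row_sum], [s / iterations for s in col_sum]


def exploitability(payoff, x, y):
    """Total gain available to best responses; zero exactly at a Nash equilibrium."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    best_row = max(sum(payoff[i][j] * y[j] for j in range(n_cols)) for i in range(n_rows))
    worst_for_row = min(sum(payoff[i][j] * x[i] for i in range(n_rows)) for j in range(n_cols))
    return best_row - worst_for_row


row_avg, col_avg = solve_zero_sum(PAYOFF)
print("row average strategy:", [round(p, 3) for p in row_avg])
print("exploitability:", round(exploitability(PAYOFF, row_avg, col_avg), 4))
```

For this particular matrix the equilibrium can be checked by hand: rock 1/4, paper 1/2, scissors 1/4. The averaged strategies should approach it as the iteration count grows. Note that it is the average strategy, not the strategy of the last iteration, that carries the convergence guarantee.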

## Detailed Analysis
DeepNash (Stratego): the board is 10x10 with 40 pieces per player. Each piece's identity (a rank from 1 to 10, a bomb, or the flag) is hidden from the opponent. The game tree has on the order of 10^535 nodes, far more than chess (about 10^123) or Go (about 10^360). DeepNash is a model-free actor-critic agent trained with Regularized Nash Dynamics (R-NaD): the agent plays against itself millions of times, and the learning dynamics are regularized toward a periodically updated reference policy, which steers self-play toward an approximate Nash equilibrium and discourages cycling or premature collapse to deterministic strategies. There is no search at play time: the agent is purely model-free and runs inference on a single CPU after training.

Cicero (Diplomacy): seven players compete on a map of Europe, moving armies and fleets between territories. Because no player can win alone, success depends on negotiation. Cicero's strategic reasoning module uses piKL, a KL-regularized planning algorithm, to choose moves with high expected value given the predicted actions of the other players while staying close to a policy learned from human play. The dialogue module generates natural-language messages (proposing alliances, coordinating moves, applying pressure) conditioned on the intents produced by that plan, so what Cicero says stays grounded in what it intends to do. Its developers describe the resulting dialogue as largely honest, not built around deception.

A 2026 arXiv survey on generalist game players discusses unifying game AI through foundation models that are pretrained on thousands of games and adapt to novel games via in-context learning.
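The KL-regularized idea behind piKL can be illustrated in a single decision. The sketch below is not Cicero's implementation (piKL applies this kind of regularization inside an iterative multi-player planning procedure). It only shows the standard closed-form solution of the one-step objective "maximize expected value minus `lam` times the KL divergence from an anchor policy", which is `pi(a)` proportional to `anchor(a) * exp(value(a) / lam)`. The anchor probabilities, action values, and names (`ANCHOR`, `VALUES`, `lam`) are hypothetical.

```python
"""KL-regularized action selection: expected value versus closeness to an anchor policy."""
import math


def kl_regularized_policy(anchor, values, lam):
    """Maximize E_pi[values] - lam * KL(pi || anchor).

    The optimum is pi(a) proportional to anchor(a) * exp(values(a) / lam).
    Assumes every anchor probability is strictly positive and lam > 0.
    """
    logits = [math.log(p) + v / lam for p, v in zip(anchor, values)]
    top = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]


def regularized_objective(policy, anchor, values, lam):
    """Expected value of the policy minus lam times its KL divergence from the anchor."""
    expected_value = sum(p * v for p, v in zip(policy, values))
    kl = sum(p * math.log(p / a) for p, a in zip(policy, anchor) if p > 0.0)
    return expected_value - lam * kl


# Hypothetical example: a human-like anchor prefers action 0, but action 2 has the best value.
ANCHOR = [0.6, 0.3, 0.1]
VALUES = [0.0, 0.5, 1.0]

for lam in (100.0, 0.5, 0.01):
    policy = kl_regularized_policy(ANCHOR, VALUES, lam)
    print(f"lambda={lam}: {[round(p, 3) for p in policy]}")
```

A large `lam` keeps the policy close to the anchor (human-like but possibly weak), while a small `lam` approaches a greedy best response (strong against the predicted opponents but possibly unlike anything a human would play). Tuning this trade-off is what lets a planner remain predictable to human partners while still optimizing.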