Please use this identifier to cite or link to this item: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19878
Title: Probing LLM Counterfactual Reasoning in Game Theory
Authors: Γεωργούσης, Δημήτριος
Στάμου, Γιώργος
Keywords: reasoning
Large Language Models
LLM
prompting
game theory
Nash equilibrium
strategy
counterfactual
adaptability
Issue Date: 30-Oct-2025
Abstract: Large Language Models (LLMs) have emerged as versatile agents capable of addressing a wide range of tasks, including strategic reasoning. This thesis investigates whether LLMs exhibit genuine strategic reasoning in game-theoretic environments. The main focus of this work is the study of repeated variants of simple games, leveraging their ease of parameterization and employing various prompting techniques. The experiments use simultaneous-move, symmetric games (Prisoner's Dilemma, Stag Hunt, and Rock-Paper-Scissors), which offer parameterization opportunities through adjusting both the naming schemes of the moves offered to players and the payoffs of those moves. LLMs are likely to be familiar only with the typical or usual setting of each game; counterfactual settings, created by modifying these parameters, therefore serve both as a test of LLM flexibility and sensitivity to changes in payoff structure, and as a way to contrast strategic thinking with reliance on prior knowledge of the games' default settings. A well-known method for directing LLM reasoning toward specific tasks is the use of advanced prompting techniques. In this work, a range of prompting strategies is used, including Zero-Shot, Chain-of-Thought, and Solo-Performance Prompting; experiments are also performed on their Self-Consistency counterparts. These techniques reflect an attempt to elicit more deliberate and context-aware responses from the models: they aim to minimize the influence of surface-level pattern matching and instead encourage reasoning that takes into account the specific parameters of each game instance. To evaluate the presence of strategic reasoning, LLMs are compared against non-AI players who follow preset strategies, and against themselves under different prompt styles.
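The counterfactual-setting idea described above can be sketched in a few lines of Python: both the move labels and the payoff values of a game are treated as parameters, so a familiar game can be turned into an unfamiliar variant. The specific labels and payoff numbers below are illustrative assumptions, not the parameters actually used in the thesis.

```python
# Illustrative sketch (assumed parameters, not the thesis's actual ones):
# a Prisoner's Dilemma whose move labels and payoffs can both be varied
# to create counterfactual settings.

def make_game(labels=("Cooperate", "Defect"), payoffs=None):
    """Return a payoff table mapping (row_move, col_move) -> (row_pay, col_pay)."""
    c, d = labels
    if payoffs is None:
        # Canonical Prisoner's Dilemma ordering: T > R > P > S.
        payoffs = {"R": 3, "S": 0, "T": 5, "P": 1}
    R, S, T, P = payoffs["R"], payoffs["S"], payoffs["T"], payoffs["P"]
    return {
        (c, c): (R, R),
        (c, d): (S, T),
        (d, c): (T, S),
        (d, d): (P, P),
    }

# Default setting, likely familiar to an LLM from its training data.
default_game = make_game()

# Counterfactual setting: neutral move names and rearranged payoffs, so
# the first move ("X") now strictly dominates the second ("Y").
counterfactual = make_game(labels=("X", "Y"),
                           payoffs={"R": 1, "S": 5, "T": 0, "P": 3})
```

A model relying on memorized knowledge of the default game would misplay the counterfactual variant, while a model reasoning from the stated payoffs would adapt.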
This comparative framework allows for an assessment of whether LLMs adapt their play in a manner consistent with rational strategic behavior, or if their responses merely reflect superficial cues from the prompt. Key indicators include responsiveness to opponent strategy, exploitation of opponent tendencies, and behavioral shifts across repeated rounds. In particular, repeated interactions offer a unique window into whether LLMs can exhibit conditional cooperation, retaliatory strategies, or learning-like behavior over time. By systematically varying both the game settings and the prompting techniques, this thesis aims to uncover the conditions under which LLMs demonstrate behavior indicative of genuine strategic reasoning. The findings contribute to the broader understanding of LLM capabilities, especially in dynamic decision-making contexts, and highlight both the promise and limitations of current models in replicating human-like strategic thought.
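The preset non-AI baselines mentioned above could take the form of simple rule-based strategies such as the ones sketched below; these are standard examples from the repeated-games literature and are assumptions here, not necessarily the exact strategies used in the thesis.

```python
# Sketch of rule-based baseline opponents of the kind an LLM player
# could be compared against (assumed examples, not the thesis's list).

def always_defect(history):
    """Ignore the opponent entirely and always defect."""
    return "Defect"

def tit_for_tat(history):
    """Cooperate first, then mirror the opponent's previous move."""
    if not history:
        return "Cooperate"
    return history[-1]  # opponent's last observed move

def play_round(strategy_a, strategy_b, history_a, history_b):
    """history_a holds B's past moves as seen by A, and vice versa."""
    move_a = strategy_a(history_a)
    move_b = strategy_b(history_b)
    history_a.append(move_b)
    history_b.append(move_a)
    return move_a, move_b
```

Comparing an LLM's round-by-round moves against such baselines makes indicators like retaliation or conditional cooperation directly measurable.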
URI: http://artemis.cslab.ece.ntua.gr:8080/jspui/handle/123456789/19878
Appears in Collections:Διπλωματικές Εργασίες - Theses

Files in This Item:
File | Description | Size | Format
D_Georgousis_Diploma_Thesis.pdf | | 3 MB | Adobe PDF


Items in Artemis are protected by copyright, with all rights reserved, unless otherwise indicated.