PulseAugur

LLMs compute Nash equilibrium but suppress it via final-layer overrides

Researchers have investigated why large language models (LLMs) deviate from Nash equilibrium play in strategic interactions. Probing open-source models such as Llama-3 and Qwen2.5, they found that opponent history is well encoded while the Nash action itself is only weakly represented. A prosocial override in the model's final layers appears to suppress the Nash action, producing cooperative behavior instead. Notably, chain-of-thought reasoning improves Nash play in larger models (above 70B parameters) but degrades it in smaller ones.
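The claim that opponent history is decodable while the Nash action is weakly represented is the kind of result typically established with layer-wise linear probes on hidden states. Below is a minimal sketch of that recipe, assuming residual-stream access via Hugging Face transformers; the toy prisoner's-dilemma prompts, the labels, and the small GPT-2 backbone are illustrative stand-ins, not the paper's actual setup.

```python
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # illustrative stand-in; the paper probes Llama-3 / Qwen2.5
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, output_hidden_states=True)
model.eval()

def last_token_states(prompt):
    """Hidden state of the final token at every layer (embedding layer included)."""
    ids = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids)
    return [h[0, -1].numpy() for h in out.hidden_states]

# Toy repeated-prisoner's-dilemma transcripts. Label 1 marks "opponent
# defected", for which "defect" is the stage-game Nash reply.
prompts, labels = [], []
for opp, y in (("cooperated", 0), ("defected", 1)):
    for r in range(1, 21):
        prompts.append(f"Round {r}: your opponent {opp} last round. You play:")
        labels.append(y)

states = [last_token_states(p) for p in prompts]  # example -> layer -> vector
y = np.array(labels)
for layer, feats in enumerate(zip(*states)):      # transpose to layer-major
    X = np.stack(feats)
    acc = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5).mean()
    print(f"layer {layer:2d}: probe accuracy {acc:.2f}")
```

On toy data like this the probe saturates quickly; the interesting comparison in the paper's framing is where along the depth axis probe accuracy for the Nash action lags accuracy for opponent history.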

Summary written by gemini-2.5-flash-lite from 1 source.

IMPACT Investigates LLM decision-making in strategic games, potentially impacting agent design and alignment research.

RANK_REASON Academic paper detailing mechanistic findings and causal control experiments on LLM behavior in strategic games.

Read on arXiv cs.AI →

COVERAGE [1]

  1. arXiv cs.AI TIER_1 · Paraskevas V. Lekeas, Giorgos Stamatopoulos

    What Suppresses Nash Equilibrium Play in Large Language Models? Mechanistic Evidence and Causal Control

    arXiv:2604.27167v1 · Announce Type: cross · Abstract: LLM agents are known to deviate from Nash equilibria in strategic interactions, but nobody has looked inside the model to understand why, or asked whether the deviation can be reversed. We do both. Working with four open-source mo…
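The abstract's "causal control" angle suggests an intervention along these lines: estimate a candidate prosocial direction from class-mean activation differences at a late layer, then project it out of the residual stream during generation and check whether Nash play recovers. The layer choice, the difference-of-means estimate, and the hook mechanics below are assumptions for illustration, not the authors' published procedure.

```python
import torch

def prosocial_direction(acts_coop, acts_nash):
    """Difference-of-means direction between cooperative and Nash-consistent
    activations at one layer; both inputs are (n_examples, hidden) tensors."""
    d = acts_coop.mean(dim=0) - acts_nash.mean(dim=0)
    return d / d.norm()

def ablation_hook(direction):
    """Forward hook that projects `direction` out of a block's output."""
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        coef = hidden @ direction             # (batch, seq)
        hidden = hidden - coef.unsqueeze(-1) * direction
        return (hidden,) + output[1:] if isinstance(output, tuple) else hidden
    return hook

# Usage, assuming `model` from the probing sketch and activation tensors
# gathered at a late block (the summary localizes the override to final layers):
# d = prosocial_direction(acts_coop, acts_nash)
# handle = model.transformer.h[-2].register_forward_hook(ablation_hook(d))
# ...generate game moves and compare Nash-action rates with/without the hook...
# handle.remove()
```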