AIPsy-Affect: A Keyword-Free Clinical Stimulus Battery for Mechanistic Interpretability of Emotion in…

By PulseAugur Editorial · Summary from 1 source

Researchers have introduced AIPsy-Affect, a 480-item stimulus battery for mechanistic interpretability research on emotion in language models. The battery removes the confound of emotion-specific keywords by evoking emotions through narrative situations, so that a model's response to a stimulus cannot be attributed to simple keyword detection. The dataset includes keyword-free vignettes, matched neutral controls, and variants for testing intensity and discriminant validity, with the aim of giving interpretability research a stronger methodological guarantee. AIPsy-Affect expands a previous, smaller battery and is available under an MIT license.
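To make the keyword confound concrete, here is a minimal sketch of the kind of leak check a keyword-free battery implies. It is not the authors' tooling: the lexicon, the function name `leaked_keywords`, and the three example texts are all hypothetical and were written for this illustration, since the summary does not give the battery's actual exclusion list or items.

```python
import re

# Hypothetical emotion lexicon, for illustration only. The real battery's
# keyword exclusion list is not given in this summary.
EMOTION_KEYWORDS = {
    "fear": {"afraid", "scared", "fear", "terrified"},
    "sadness": {"sad", "grief", "sorrow", "crying"},
}

def leaked_keywords(text, lexicon=EMOTION_KEYWORDS):
    """Return {emotion: [keywords found]} for every emotion whose keywords appear in text."""
    tokens = set(re.findall(r"[a-z']+", text.lower()))
    hits = {}
    for emotion, words in lexicon.items():
        found = sorted(tokens & words)
        if found:
            hits[emotion] = found
    return hits

# Invented examples (not items from AIPsy-Affect):
# a situational vignette, a matched neutral control, and an explicit-keyword text.
vignette = "The phone rang at 3 a.m.; the hospital asked her to come in right away."
control = "The phone rang at 3 p.m.; the dentist asked her to confirm her appointment."
explicit = "She was terrified and sad when the hospital called."

print(leaked_keywords(vignette))
print(leaked_keywords(control))
print(leaked_keywords(explicit))
```

Under these assumptions, the vignette and its control both pass the check, while the explicit text is flagged. A probe trained on stimuli like the explicit text could succeed by detecting the keywords alone, which is the confound the battery is designed to rule out.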


IMPACT Enables more rigorous evaluation of emotion understanding in LLMs, potentially leading to more robust affective AI systems.

RANK_REASON Release of a new, open-source dataset for AI interpretability research.



COVERAGE [1]

arXiv cs.CL TIER_1 · Michael Keeman · 2026-04-28 04:00

AIPsy-Affect: A Keyword-Free Clinical Stimulus Battery for Mechanistic Interpretability of Emotion in Language Models

arXiv:2604.23719v1 Announce Type: new Abstract: Mechanistic interpretability research on emotion in large language models -- linear probing, activation patching, sparse autoencoder (SAE) feature analysis, causal ablation, steering vector extraction -- depends on stimuli that cont…
