GKnow: Measuring the Entanglement of Gender Bias and Factual Gender
Researchers have developed GKnow, a new benchmark designed to measure both factual gender knowledge and gender bias in language models. This benchmark aims to disentangle stereotypical outputs from factually gendered ones, which are often conflated in current analyses. Experiments using GKnow revealed that factual gender knowledge and gender bias are deeply intertwined at both the circuit and neuron levels within models, suggesting that simple ablation techniques may be ineffective for debiasing and can even mask a loss of factual gender knowledge.
AI IMPACT: Introduces a new evaluation tool to better understand and potentially mitigate gender bias in AI models.
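
The point about ablation masking a loss of factual knowledge can be illustrated with a minimal sketch that scores factual items and stereotype items separately. Everything here is an assumption made for illustration: the `Item` fields, the `score` function, the prompts, and the toy predictions are hypothetical and are not GKnow's actual data format, metrics, or results.

```python
from dataclasses import dataclass


@dataclass
class Item:
    prompt: str
    kind: str       # "factual" or "stereotype" (hypothetical labels)
    reference: str  # factual: the correct pronoun; stereotype: the stereotyped pronoun


def _rate(pairs):
    """Fraction of (item, prediction) pairs where the prediction matches the reference."""
    if not pairs:
        return 0.0
    return sum(pred == item.reference for item, pred in pairs) / len(pairs)


def score(items, predictions):
    """Return (factual_accuracy, stereotype_rate) for predictions aligned with items."""
    pairs = list(zip(items, predictions))
    factual = [(i, p) for i, p in pairs if i.kind == "factual"]
    stereotype = [(i, p) for i, p in pairs if i.kind == "stereotype"]
    return _rate(factual), _rate(stereotype)


# Toy items, invented for this sketch.
items = [
    Item("The queen said that", "factual", "she"),
    Item("My uncle said that", "factual", "he"),
    Item("The nurse said that", "stereotype", "she"),
    Item("The engineer said that", "stereotype", "he"),
]

# Made-up model outputs before and after a hypothetical ablation.
before = ["she", "he", "she", "he"]
after = ["they", "they", "they", "they"]

print("before:", score(items, before))
print("after: ", score(items, after))
```

In this toy setup, the stereotype rate alone would suggest that the ablated model has been debiased, while the separate factual score shows that it has also stopped producing the factually correct pronouns. Reporting the two numbers side by side is what keeps that loss visible.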