A new audit called EQUITRIAGE evaluated five large language models for gender bias in emergency department triage and found that all of them exhibited bias above a 5% threshold. DeepSeek-V3.1 and Gemini-3-Flash showed significant directional undertriage of female patients, with flip rates ranging from 9.9% to 43.8%. Demographic blinding reduced Gemini's bias, but DeepSeek retained residual bias, suggesting age as a contributing factor. The study concludes that different models exhibit bias through distinct underlying mechanisms and emphasizes the need for per-model auditing before clinical deployment.
Summary written by gemini-2.5-flash-lite from 2 sources.
IMPACT Highlights the critical need for rigorous bias auditing in LLMs before deployment in sensitive applications like healthcare.
RANK_REASON The cluster contains an academic paper detailing a fairness audit of LLMs.