A new paper argues against immediately automating academic peer review with current large language models. It highlights two major issues: AI reviewers agree with each other excessively, limiting the diversity of perspectives, and their scores can be manipulated through stylistic rewrites of a paper rather than improvements in scientific merit. The authors propose building a dedicated science of peer review automation instead of deploying general-purpose LLMs without thorough evaluation.
Summary written by gemini-2.5-flash-lite from 1 source.
IMPACT Current LLMs are not suitable for automating peer review: their judgments lack diversity and are susceptible to manipulation, so specialized research is needed first.
RANK_REASON Academic paper evaluating the use of LLMs in peer review.