Frontier AI models score C+ on expert trust, study finds

By PulseAugur Editorial · Summary by gemini-2.5-flash-lite from 1 source

A new study by expert network Pearl found that top frontier AI models such as GPT-5.5 and Claude Opus 4.7 are only marginally better than older versions, and none earned a grade higher than C+. Despite claims of superior intelligence, the models struggled with professional judgment, failing to prioritize or escalate when necessary. Some domains, such as business, saw higher scores, but law and health showed dangerously low expert alignment, suggesting human oversight remains critical.

IMPACT New research suggests current frontier AI models still lack the professional judgment and trustworthiness required for critical applications, indicating a need for continued human oversight.

RANK_REASON The cluster analyzes and critiques the performance of existing AI models based on a third-party study, rather than announcing a new release or significant industry event.

COVERAGE [1]

Forbes — Innovation TIER_1 · John Koetsier, Senior Contributor · 2026-05-20 19:10

Top Frontier AI Models Top Out At C+ ... Barely Better Than Old Models

AI is pretty smart, but not as smart as actual experts, according to a new study ...
