A new benchmark called DALPHIN has been developed to evaluate AI copilots in digital pathology. The benchmark includes over 1,200 images and a performance comparison with 31 human pathologists. General-purpose models like GPT-5 and Gemini 2.5 Pro, along with a specialized copilot, PathChat+, were tested on various diagnostic tasks.
Summary written by gemini-2.5-flash-lite from 2 sources.
IMPACT: Establishes a new standard for evaluating AI's diagnostic capabilities in a specialized medical field, potentially guiding future development and adoption.
RANK_REASON: The cluster describes a new academic paper introducing a benchmark dataset and evaluation methodology for AI in digital pathology.