A new evaluation framework, "Cited but Not Verified," has been developed to assess the source attribution capabilities of large language models (LLMs) used in research agents. This framework parses inline citations from LLM-generated reports and evaluates them across three dimensions: link accessibility, content relevance, and factual accuracy. Benchmarking 14 LLMs revealed that while frontier models maintain high link validity and relevance, their factual accuracy in citations is significantly lower, especially as retrieval depth increases. Separately, a new file format called ObjectGraph (.og) has been proposed to address the inefficiencies of current document handling by LLM agents, reconceiving documents as traversable knowledge graphs rather than linear text.
Summary written by gemini-2.5-flash-lite from 4 sources.
IMPACT New evaluation frameworks and file formats are emerging to improve the reliability and efficiency of LLM agents in research and information synthesis.
RANK_REASON The cluster contains two academic papers detailing a new evaluation framework and a new file format for LLM research agents.