GPQA
PulseAugur coverage of GPQA — every cluster mentioning GPQA across labs, papers, and developer communities, ranked by signal.
-
DataMaster framework automates ML data engineering for improved model performance
Researchers have developed DataMaster, a novel framework designed to automate the data engineering process for machine learning. This system aims to improve ML model performance by optimizing data selection, composition…
-
FocuSFT improves LLM long-context understanding via bilevel optimization
Researchers have developed FocuSFT, a novel bilevel optimization framework designed to improve how large language models handle long contexts. This method addresses the issue of "attention dilution," where models tend t…
-
New Metacognitive Probe assesses LLM confidence and self-awareness
Researchers have developed a new diagnostic tool called the Metacognitive Probe to assess how well large language models (LLMs) understand their own confidence levels. This five-task probe decomposes an LLM's confidence…