PulseAugur

GPQA Diamond

PulseAugur coverage of GPQA Diamond: every cluster mentioning GPQA Diamond across labs, papers, and developer communities, ranked by signal.

Total · 30d: 4 (4 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 4 (4 over 90d)
TIER MIX · 90D
RECENT · 8 TOTAL
  1. RESEARCH · CL_21935

    Apple's RVPO framework enhances LLM alignment by penalizing reward variance

    Researchers have introduced Reward-Variance Policy Optimization (RVPO), a novel framework designed to improve the alignment of large language models with multiple objectives. Unlike existing methods that average rewards…

  2. COMMENTARY · CL_20705

    AI models: Choose benchmarks over hype for true performance

    A recent analysis argues that tech companies often select AI models based on hype rather than performance on relevant benchmarks. The article emphasizes that benchmarks like SWE-bench for coding, Terminal-Bench for …

  3. TOOL · CL_20624

    New fine-tuning method boosts LLM knowledge injection without paraphrasing

    Researchers have developed a new fine-tuning method called Diffusion-Inspired Masked Fine-Tuning (DMT) for autoregressive large language models (LLMs). This technique aims to improve the injection of factual knowledge i…

  4. RESEARCH · CL_14447

    New method enhances LLM reasoning diversity without sacrificing stability

    Researchers have introduced Expert-Sample, a novel training-free method designed to enhance the performance of fine-grained Mixture-of-Experts (MoE) models. The technique addresses the trade-off between diversity and s…

  5. RESEARCH · CL_14144

    State Stream Transformer V2 enhances LLM reasoning with parallel training and latent state streaming

    Researchers have developed the State Stream Transformer (SST) V2, an architectural innovation designed to enhance latent space reasoning in language models. Unlike standard transformers that reset context at each step, …

  6. RESEARCH · CL_03564

    FINAL-Bench/Darwin-36B-Opus · Hugging Face

    The Darwin-36B-Opus model, a 36-billion-parameter mixture-of-experts language model, has been released. It was created using the Darwin V7 evolutionary breeding engine, combining aspects of Qwen/Qwen3.6-35B-A3B and a Cl…

  7. RESEARCH · CL_02960

    Process Supervision via Verbal Critique Improves Reasoning in Large Language Models

    Researchers have developed a new framework called Verbal Process Supervision (VPS) that enhances the reasoning capabilities of large language models without requiring gradient updates. This method utilizes structured na…

  8. FRONTIER RELEASE · CL_02231

    OpenAI's GPT-5.2 advances science and math, with evaluations showing low catastrophic risk

    OpenAI has released GPT-5.2, a new model demonstrating significant advancements in mathematical and scientific reasoning. The model achieved high scores on benchmarks like GPQA Diamond and FrontierMath, indicating impro…