PulseAugur

GPT-5

PulseAugur coverage of GPT-5 — every cluster mentioning GPT-5 across labs, papers, and developer communities, ranked by signal.

Total · 30d: 386 (386 over 90d)
Releases · 30d: 8 (8 over 90d)
Papers · 30d: 158 (158 over 90d)
TIMELINE
  1. 2025-08-07 · product_launch · OpenAI launched GPT-5, its latest AI model, offering enhanced capabilities for businesses.
SENTIMENT · 30D

5 days with sentiment data

RECENT · PAGE 3/4 · 80 TOTAL
  1. RESEARCH · CL_11531 ·

    Physical Foundation Models: Fixed hardware implementations of large-scale neural networks

    Researchers have proposed a new concept called Physical Foundation Models (PFMs), which involve implementing large neural networks directly into the physical design of hardware. This approach aims to achieve significant…

  2. RESEARCH · CL_11510 ·

    Frontier VLMs fail medical VQA tests due to poor grounding and confusion

    A new paper evaluates five leading vision-language models (VLMs) on their trustworthiness for medical visual question answering (VQA). The study found significant limitations in the models' ability to accurately localiz…

  3. RESEARCH · CL_09952 ·

    OpenAI details 'goblin' outputs and fixes in GPT-5 behavior

    OpenAI has detailed the origin of "goblin" outputs, a phenomenon where AI models exhibit personality-driven quirks. These behaviors stem from the models' training data, specifically from a small subset of text that was …

  4. COMMENTARY · CL_09898 ·

    AI and LLM terminology is poorly defined and frequently misused, essay argues

    The author argues that current AI terminology is poorly defined and frequently misused, leading to confusion. The widespread adoption of terms like 'AI' and 'LLM' has outpaced their precise technical definitions, partly…

  5. RESEARCH · CL_09517 ·

    Google's ERA tool accelerates scientific discovery in public health and cosmology

    Google Research scientists are leveraging a new AI tool called Empirical Research Assistance (ERA) to accelerate scientific discovery across various fields. ERA has been used to generate expert-level empirical software,…

  6. RESEARCH · CL_09950 ·

    OpenAI details how 'goblin' outputs spread in GPT-5 and how they are fixed

    OpenAI has detailed the origins of "goblin" outputs, a phenomenon where AI models exhibit personality-driven quirks. These behaviors stem from the models' training data and can spread through interactions, leading to un…

  7. FRONTIER RELEASE · CL_08801 ·

    DeepSeek R2 ships 32B model, rivals GPT-5 on reasoning at lower cost

    DeepSeek has released its R2 model, a 32 billion parameter dense transformer. This new model achieves 92.7% accuracy on the AIME 2025 benchmark and can operate on a single RTX 4090 graphics card. The R2 model is also si…

  8. RESEARCH · CL_09820 ·

    New framework benchmarks enterprise AI document processing pipelines

    Researchers have developed EnterpriseDocBench, a new framework for evaluating the end-to-end performance of enterprise AI document processing pipelines. The framework assesses parsing fidelity, indexing efficiency, retr…

  9. RESEARCH · CL_06636 ·

    MTRouter cuts LLM costs by 58% on ScienceWorld, 43% on HLE

    Researchers have developed MTRouter, a novel system designed to optimize the cost of multi-turn interactions with large language models. By jointly embedding interaction history and candidate models, MTRouter learns to …

  10. RESEARCH · CL_07024 ·

    New CLIN-LLM framework enhances clinical diagnosis and treatment generation with safety constraints

    Researchers have developed CLIN-LLM, a novel hybrid framework designed to improve clinical diagnosis and treatment generation while prioritizing safety. This system integrates multimodal patient data, uncertainty-calibr…

  11. RESEARCH · CL_06186 ·

    VLMs tackle visual illusions, spatial reasoning, and evaluation benchmarks

    Researchers are developing new methods to improve the robustness and reasoning capabilities of Vision-Language Models (VLMs). One approach, Structured Qualitative Inference (SQI), aims to mitigate visual illusions by en…

  12. RESEARCH · CL_06282 ·

    New PsyGAT model achieves SOTA in depression detection, outperforming GPT-5

    Researchers have developed PsyGAT, a novel graph-based framework for detecting depression from conversational data. This model addresses data scarcity and interpretability issues common in existing deep learning approac…

  13. RESEARCH · CL_14197 ·

    New research probes LLM reasoning and reveals novel jailbreaking vulnerabilities

    Researchers have developed a new method to jailbreak large language models by exploiting their safe completion mechanisms through deceptive multi-turn conversations. This technique, termed intention deception, gradually…

  14. TOOL · CL_14731 ·

    AI tools convert PDFs to podcasts and integrate multiple models

    A new tool has been developed that can convert PDF documents into audio podcasts in nine Indian languages, utilizing AI for text-to-speech generation. Separately, a platform has emerged that integrates multiple AI model…

  15. RESEARCH · CL_04970 ·

    LLMs struggle to detect culturally specific health misinformation on YouTube

    Two new research papers explore the limitations of Large Language Models (LLMs) in detecting culturally specific health misinformation, particularly concerning the promotion of cow urine as a remedy on YouTube in India.…

  16. RESEARCH · CL_05034 ·

    New research suggests LLM self-correction can degrade performance if not carefully managed

    A new research paper introduces a control-theoretic framework to analyze when iterative self-correction in large language models (LLMs) is beneficial or detrimental. The study proposes a diagnostic based on error correc…

  17. RESEARCH · CL_04946 ·

    New benchmarks and models push AI's ability to understand research papers and generate code

    Researchers have developed two new frameworks for chart-to-code generation, aiming to improve the accuracy and versatility of converting visual data into executable scripts. One approach, Chart2NCode, introduces a datas…

  18. RESEARCH · CL_03189 ·

    Yowch!: "Tsinghua University’s AGENTIF benchmark tested 707 instructions across 50 real-world agent scenarios. The best models followed fewer than 30% of instru

    New benchmarks reveal significant instruction-following deficits in leading AI models, with the AGENTIF benchmark showing top models adhering to fewer than 30% of instructions perfectly. This issue is exacerbated by the…

  19. RESEARCH · CL_02966 ·

    TaNOS framework boosts numerical reasoning in tables, outperforming GPT-5

    Researchers have developed TaNOS, a new framework designed to improve numerical reasoning in AI models when dealing with tabular data. This approach uses anonymized headers, operation sketches for structural cues, and s…

  20. RESEARCH · CL_14378 ·

    ARFBench benchmarks foundation models on software incident response TSQA

    Researchers have introduced ARFBench, a new benchmark designed to evaluate the time series question-answering capabilities of multimodal foundation models, particularly for software incident response. The benchmark comp…