Opus 4.5
PulseAugur coverage of Opus 4.5: every cluster mentioning Opus 4.5 across labs, papers, and developer communities, ranked by signal.
-
AI model evaluations need third-party auditors to ensure reliable progress tracking
Model evaluation methodologies are inconsistent across AI labs, leading to incomparable benchmark results and potentially flawed release decisions. Companies like OpenAI, Anthropic, and Google DeepMind have altered thei…
-
Xiaomi's MiMo-V2.5-Pro AI model challenges Claude Opus with superior efficiency
Xiaomi has released MiMo v2.5 Pro, an open-weight AI model available under an MIT license. The model reportedly surpasses Claude Opus 4.5 in Arena scores. Notably, MiMo v2…
-
Users debate Claude Opus vs. Sonnet: Opus excels at complex tasks, Sonnet offers value
Users are discussing the perceived differences between Anthropic's Claude Opus and Sonnet models, with some finding Opus significantly more capable for complex tasks like debugging legacy code. One user reported Opus 4.…
-
Anthropic updates Claude models, Haiku 4.5 passes safety tests
Anthropic has updated its Claude Code product to allow users to select specific models, including Opus 4.7, Sonnet 4.6, and various 4.5 versions, through commands or environment variables. Separately, an evaluation of A…
-
ElevenLabs, Cerebras raise billions; Gemini 3 integrates widely, coding agents converge in IDEs
Several AI companies have achieved significant funding milestones, with ElevenLabs securing $500 million in Series D funding at an $11 billion valuation and Cerebras raising $1 billion in Series H at a $23 billion valua…
-
Andrej Karpathy uses Anthropic's Claude Opus 4.5 to auto-grade Hacker News discussions
Andrej Karpathy has developed a tool that uses an LLM to analyze Hacker News discussions from a decade ago. By feeding article content and comment threads into a model like Opus 4.5, the system can evaluate t…
-
Holo1: New family of GUI automation VLMs powering GUI agent Surfer-H
Researchers have introduced A11y-Compressor, a framework designed to make GUI agent observations more efficient by transforming linearized accessibility trees into structured representations. This method reduces input t…