Claude Sonnet 4.5
PulseAugur coverage of Claude Sonnet 4.5 — every cluster mentioning Claude Sonnet 4.5 across labs, papers, and developer communities, ranked by signal.
- 2026-05-12 · product_launch · Claude Sonnet 4.5 is being retired from the claude.ai model selector.
-
Advanced AI Models GPT-4o, Claude 3.5 Show Systematic Thinking Errors
New analysis indicates that advanced AI models like GPT-4o and Claude 3.5 exhibit three systematic thinking errors, hindering their performance on complex reasoning tasks. These flaws highlight a fundamental gap in mach…
-
LLMs show internal emotion concepts; limit agent self-critique loops to two iterations
A recent paper from Anthropic explores how large language models, specifically Claude Sonnet 4.5, develop internal representations of emotion concepts. These representations allow the models to generalize and track oper…
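The headline's recommendation to cap agent self-critique at two iterations can be sketched as a bounded refine loop. This is an illustration only: `generate` and `critique` below are hypothetical stand-ins for model calls, not any API described in the paper.

```python
MAX_CRITIQUE_ITERATIONS = 2  # cap recommended in the headline above

def generate(prompt: str) -> str:
    # Hypothetical stand-in for an LLM call.
    return f"draft answer to: {prompt}"

def critique(answer: str) -> tuple[bool, str]:
    # Hypothetical stand-in: returns (is_acceptable, feedback).
    return (len(answer) > 20, "expand the answer")

def answer_with_self_critique(prompt: str) -> str:
    """Generate, then critique-and-revise at most MAX_CRITIQUE_ITERATIONS times."""
    answer = generate(prompt)
    for _ in range(MAX_CRITIQUE_ITERATIONS):
        ok, feedback = critique(answer)
        if ok:
            break
        answer = generate(f"{prompt}\nRevise per feedback: {feedback}")
    return answer
```

The hard cap keeps the loop from oscillating indefinitely when the critic never accepts the draft.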
-
Mistral releases Mistral Medium 3.5, a powerful new AI model
Mistral AI has released its new Mistral Medium 3.5 model, which is being praised for its performance. Early indications suggest its capabilities are on par with Anthropic's Sonnet 4.5 model. This release highlights adva…
-
Anthropic's Claude AI integrates with Adobe, expands creative tool connectors
Anthropic has released connectors for Claude, enabling direct integration with tools like Adobe and Blender, and has announced partnerships with Ableton, Canva, Autodesk, and others. Separately, Mistral has releas…
-
LLM theorem generation falls short on semantic correctness, new benchmark reveals
Researchers have developed a new framework called T to evaluate the semantic correctness of theorems generated by large language models in automated theorem proving. This approach, inspired by code generation testing, v…
-
AeSlides framework uses verifiable rewards to improve LLM slide generation aesthetics
Researchers have introduced AeSlides, a novel reinforcement learning framework designed to improve the aesthetic quality of slides generated by large language models. This system utilizes verifiable metrics to quantify …
-
Researchers probe VLM safety with embedding-guided typographic attacks
Researchers have developed a method to probe the safety vulnerabilities of vision-language models (VLMs) by using typographic prompt injections. Their study found that multimodal embedding distance strongly predicts att…
-
New research probes LLM reasoning and reveals novel jailbreaking vulnerabilities
Researchers have developed a new method to jailbreak large language models by exploiting their safe completion mechanisms through deceptive multi-turn conversations. This technique, termed intention deception, gradually…
-
Meta plans $25B bond offering as US economy shows mixed signals
DeepSeek has released its V4 model, featuring a 1.6 trillion parameter version and a 1 million token context window, optimized for Huawei's Ascend AI chips. This move marks a significant shift away from Nvidia hardware,…
-
Bankers find AI-generated reports unusable, while software engineers embrace coding agents in 2026
A recent benchmark involving 500 investment bankers found that AI-generated client reports are unusable for professional engagement in the banking sector. Models such as GPT-5.4 and Claude Opus 4.6 produced reports that…
-
AI models show Western bias, homogenizing values across cultures
A new study auditing large language models found that three leading systems—Claude Sonnet 4.5, GPT-5.4, and Gemini 2.5 Flash—consistently provided individualistic advice, even when presented with dilemmas from users in …
-
Anthropic updates Claude models, Haiku 4.5 passes safety tests
Anthropic has updated its Claude Code product to allow users to select specific models, including Opus 4.7, Sonnet 4.6, and various 4.5 versions, through commands or environment variables. Separately, an evaluation of A…
-
New metrics quantify LLM agent behavioral similarity and convergence
A new paper introduces two metrics, Response Pattern Similarity (RPS) and Action Graph Similarity (AGS), to quantify how similar the tool-use behaviors of different AI agents are. These metrics aim to distinguish betwee…
-
Anthropic ends model version pinning, users report Sonnet 4.6 style issues
Anthropic is phasing out specific model version pinning for its Claude Sonnet models, forcing users to adopt the latest version, Sonnet 4.6. This change has led to user frustration as client applications may break with …
-
LLMs show emotional representations and susceptibility to false beliefs
A new paper from Anthropic's interpretability team reveals that their Claude Sonnet 4.5 model develops internal representations that emulate human emotions, influencing its behavior and decision-making. These "functiona…
-
Most AI models fail simple 'car wash' reasoning test, Opper finds
A new benchmark called the "Car Wash Test" reveals that many leading AI models struggle with basic reasoning. When asked whether to walk or drive 50 meters to a car wash, 42 out of 53 tested models incorrectly suggested…