PulseAugur
LIVE 09:44:25
research · [11 sources] ·
0
research

Anthropic's Claude Opus 4.7 shows reduced sycophancy but faces subagent refusals

Anthropic has released findings on Claude's sycophancy, particularly in relationship guidance conversations, where Opus 4.7 showed a reduced rate compared to Opus 4.6. The company also detailed how users seek personal guidance from Claude, with a significant portion of these conversations falling into health, career, relationships, and finance domains. Separately, a bug in Claude's system prompt for CLI tools is causing subagents to refuse legitimate code editing tasks, leading to a high task rejection rate. AI

Summary written by gemini-2.5-flash-lite from 11 sources. How we write summaries →

IMPACT New research highlights areas for improvement in AI conversational behavior, while a CLI bug impacts developer productivity.

RANK_REASON The cluster contains research findings on model behavior and a bug report for a CLI tool.

Read on Mastodon — mastodon.social →

COVERAGE [11]

  1. X — Anthropic TIER_1 · AnthropicAI ·

    All data in this study was collected and analyzed using our privacy-preserving tool.

    All data in this study was collected and analyzed using our privacy-preserving tool. Read more: https://t.co/X82ttb7f4b

  2. X — Anthropic TIER_1 · AnthropicAI ·

    This work is part of a loop we're working to close between societal impacts and model training. One of our goals is to study how people use Claude, find where i

    This work is part of a loop we're working to close between societal impacts and model training. One of our goals is to study how people use Claude, find where it falls short of its principles, and use what we learned in training new models. Read more: https://t.co/6tjY58uBhk

  3. X — Anthropic TIER_1 · AnthropicAI ·

    When stress-tested on real conversations where Claude previously showed sycophancy, Opus 4.7 had half the sycophancy rate of Opus 4.6 on relationship guidance.

    When stress-tested on real conversations where Claude previously showed sycophancy, Opus 4.7 had half the sycophancy rate of Opus 4.6 on relationship guidance. Mythos Preview cut that in half again. This generalized across domains—though this training is one of several causes. …

  4. X — Anthropic TIER_1 · AnthropicAI ·

    Claude is most sycophantic under pushback, and relationship conversations are where people push back most.

    Claude is most sycophantic under pushback, and relationship conversations are where people push back most. We identified some of the specific triggers—criticism of Claude's analysis, floods of one-sided detail—and built synthetic training scenarios from them.

  5. X — Anthropic TIER_1 · AnthropicAI ·

    Claude mostly avoids sycophancy when giving guidance—it shows up in just 9% of conversations.

    Claude mostly avoids sycophancy when giving guidance—it shows up in just 9% of conversations. But the rate is particularly high in conversations on spirituality and relationship guidance. https://t.co/mgix5ejTZw

  6. X — Anthropic TIER_1 · AnthropicAI ·

    About 6% of all conversations are people asking Claude for personal guidance—whether to take a job, how to handle a conflict, if they should move.

    About 6% of all conversations are people asking Claude for personal guidance—whether to take a job, how to handle a conflict, if they should move. Over 75% of these conversations fell into four domains: health & wellness, career, relationships, and personal finance. https://…

  7. X — Anthropic TIER_1 · AnthropicAI ·

    We focused on relationship guidance because that's where the most sycophantic conversations occur. In this setting, Claude telling someone what they want to hea

    We focused on relationship guidance because that's where the most sycophantic conversations occur. In this setting, Claude telling someone what they want to hear can harden a divide or convince them a signal means more than it does.

  8. X — Anthropic TIER_1 · AnthropicAI ·

    How do people seek guidance from Claude?

    How do people seek guidance from Claude? We looked at 1M conversations to understand what questions people ask, how Claude responds, and where it slips into sycophancy. We used what we found to improve how we trained Opus 4.7 and Mythos Preview. https://t.co/6tjY58uBhk

  9. HN — claude-code stories TIER_1 · thomashobohm ·

    Regression: malware reminder on every read still causes subagent refusals

  10. Mastodon — fosstodon.org TIER_1 中文(ZH) · [email protected] ·

    [Bug] Regression: 'Malware alert' in Claude v2.1.111 causes sub-agents to refuse task execution ➤ How semantic flaws in system prompts kill AI productivity ✤ https://github.com/anthropics/claude-code/issues/49363 This report points out that the Claude CLI tool in v2.

    🌗 [Bug] 回歸錯誤:Claude v2.1.111 版本中「惡意軟體提醒」導致子代理拒絕執行任務 ➤ 系統提示詞的語義缺陷如何扼殺 AI 的生產力 ✤ https:// github.com/anthropics/claude-c ode/issues/49363 此報告指出 Claude CLI 工具在 v2.1.111 版本中出現回歸錯誤。儘管官方曾聲稱已在早期版本修復此問題,但系統仍會在執行「讀取 (Read)」或「Grep」操作時,強制注入一段關於「惡意軟體」的警示訊息。由於該警示的措辭存在語義歧義,導致 Claude 的子代理(Subag…

  11. Mastodon — mastodon.social TIER_1 · [email protected] ·

    Claude system prompt bug wastes user money and bricks managed agents https://github.com/anthropics/claude-code/issues/49363 # HackerNews # Tech # AI

    Claude system prompt bug wastes user money and bricks managed agents https://github.com/anthropics/claude-code/issues/49363 # HackerNews # Tech # AI