New benchmark reveals enterprise LLM agents leak sensitive data

By PulseAugur Editorial · Summary by gemini-2.5-flash-lite from 1 source

A new benchmark, CI-Work, assesses the contextual integrity of enterprise LLM agents, focusing on how they handle sensitive information. Evaluations of current leading models show significant privacy failures, with violation rates between 15.8% and 50.9%. The research also highlights a trade-off in which improved task utility often comes with increased privacy risk, suggesting that current scaling approaches are insufficient for secure enterprise deployment.

IMPACT Highlights critical privacy risks in enterprise LLM agents, necessitating new context-aware architectures for secure deployment.

RANK_REASON Academic paper introducing a new benchmark for LLM agents.

paper
safety

COVERAGE [1]

arXiv cs.CL TIER_1 · Dongmei Zhang · 2026-04-23 06:00

CI-Work: Benchmarking Contextual Integrity in Enterprise LLM Agents

Enterprise LLM agents can dramatically improve workplace productivity, but their core capability, retrieving and using internal context to act on a user's behalf, also creates new risks for sensitive information leakage. We introduce CI-Work, a Contextual Integrity (CI)-grounded …
