A new benchmark called CI-Work has been developed to assess the contextual integrity of enterprise LLM agents, focusing on their ability to handle sensitive information. Evaluations of current leading models show significant privacy failures, with violation rates between 15.8% and 50.9%. The research highlights a trade-off where improved task utility often leads to increased privacy risks, suggesting that current scaling approaches are insufficient for secure enterprise deployment.
AI impact: Highlights critical privacy risks in enterprise LLM agents, necessitating new context-aware architectures for secure deployment.
Ranking rationale: Academic paper introducing a new benchmark for LLM agents.
AI-generated summary · Google Gemini · From 1 source.