PulseAugur
research · [2 sources] ·

Using large language models for embodied planning introduces systematic safety risks

A new benchmark, DESPITE, systematically evaluates the safety risks of using large language models for embodied planning in robotics. The research indicates that even models with high planning accuracy can exhibit significant safety failures, and that safety awareness does not scale proportionally with model size. The findings identify improving safety awareness as a critical challenge for deploying LLM-based planners in real-world robotic systems.

IMPACT Highlights critical safety challenges for LLM-based robotic planners, emphasizing the need for improved danger avoidance over mere planning ability.

RANK_REASON The cluster contains two arXiv papers on safety risks in AI: one on LLMs used for embodied planning and one a broader survey of safety in embodied AI.

COVERAGE [2]

  1. arXiv cs.LG TIER_1 · Tao Zhang, Kaixian Qu, Zhibin Li, Jiajun Wu, Marco Hutter, Manling Li, Fan Shi ·

    Using large language models for embodied planning introduces systematic safety risks

    arXiv:2604.18463v2 Announce Type: replace-cross Abstract: Large language models are increasingly used as planners for robotic systems, yet how safely they plan remains an open question. To evaluate safe planning systematically, we introduce DESPITE, a benchmark of 12,279 tasks sp…

  2. arXiv cs.CV TIER_1 · Xiao Li, Xiang Zheng, Yifeng Gao, Xinyu Xia, Yixu Wang, Xin Wang, Ye Sun, Yunhan Zhao, Ming Wen, Jiayu Li, Xun Gong, Yi Liu, Yige Li, Yutao Wu, Cong Wang, Jun Sun, Yixin Cao, Zhineng Chen, Jingjing Chen, Tao Gui, Qi Zhang, Zuxuan Wu, Xipeng Qiu, Xuanjing ·

    Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses

    arXiv:2605.02900v1 Announce Type: cross Abstract: Embodied Artificial Intelligence (Embodied AI) integrates perception, cognition, planning, and interaction into agents that operate in open-world, safety-critical environments. As these systems gain autonomy and enter domains such…