Using Language Models as Closed-Loop High-Level Planners for Robotics Applications: A Brief Overview and…
By PulseAugur Editorial
from 14 sources
Researchers have developed a novel framework for measuring student engagement using vision-language models (VLMs) and large language models (LLMs). This approach adapts VLMs for action recognition with limited data and uses LLMs to classify sequences of actions, considering peer context. Separately, new research explores using LLMs as closed-loop planners for robots, investigating strategies to improve their reliability and reduce errors in embodied planning tasks. Another study introduces an LLM-driven framework for robots to autonomously learn and adapt to new tasks in open environments, reducing reliance on repeated LLM interactions.
AI
arXiv:2605.01846v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly used to generate multiple-choice questions (MCQs), where correct answers should ideally be uniformly distributed across options. However, we observe that LLMs exhibit systematic position biases during generation. Through extensive exp…
arXiv cs.AI
Ahmed Abdelkawy, Ahmed Elsayed, Asem Ali, Aly Farag, Thomas Tretter, Michael McIntyre
arXiv:2601.06394v4 Announce Type: replace-cross Abstract: Understanding student behavior in the classroom is essential to improve both pedagogical quality and student engagement. Existing methods for predicting student engagement typically require substantial annotated data to mo…
arXiv:2511.07410v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) and Vision Language Models (VLMs) have become popular tools for embodied high-level planning. However, their deployment in black-box settings often leads to unpredictable or costly errors. To h…
arXiv:2604.22199v1 Announce Type: cross Abstract: Autonomous robots operating in open environments need the ability to continuously handle tasks that are not covered by predefined local methods. However, existing approaches often rely on repeated large-language-model (LLM) interaction for uncovered tasks, and even successful exe…
arXiv:2604.28095v1 Announce Type: new Abstract: Accurate lesion segmentation is crucial for clinical diagnosis and treatment planning. However, lesions often resemble surrounding tissues and exhibit ill-defined boundaries, leading to unstable predictions in boundary/transition regions. Moreover, small-lesion cues can be dilute…
arXiv:2604.25310v1 Announce Type: new Abstract: This work addresses the critical problem of tracking fast-moving objects through strongly scattering media in a low-light environment. Different from existing approaches that use frame-based cameras with fixed exposure times, which trade off signal-to-noise ratio for temporal res…
arXiv:2604.23788v1 Announce Type: new Abstract: Appreciating multi-figure paintings requires understanding how characters relate through subtle cues like gaze alignment, gesture, and spatial arrangement. We present MIRAGE, an evidence-centric framework designed to scaffold the ex…
arXiv:2604.23688v1 Announce Type: new Abstract: Proactive defense methods protect portrait images from unauthorized editing or talking face generation (TFG) by introducing pixel-level protective perturbations, and have already attracted increasing attention for privacy protection…