WebArena
PulseAugur coverage of WebArena: every cluster mentioning WebArena across labs, papers, and developer communities, ranked by signal.
-
cotomi Act agent learns to automate tasks by watching user behavior
Researchers have developed cotomi Act, a browser agent designed to automate work by learning from user actions. The system achieves high task execution accuracy on the WebArena benchmark, surpassing a human baseline. It…
-
AutoSurfer enhances web agent training with systematic exploration and task synthesis
Researchers have developed AutoSurfer, a novel system designed to generate comprehensive training data for web agents. This system employs a systematic breadth-first exploration strategy to thoroughly map website functi…
-
OpAgent achieves 71.6% success rate in web navigation tasks
Researchers have developed OpAgent, a novel web navigation agent that utilizes online reinforcement learning to overcome the limitations of static datasets. The agent employs a hierarchical multi-task fine-tuning approa…
-
AgentHER framework boosts LLM agent training with failed trajectory relabeling
Researchers have developed AgentHER, a new framework designed to improve the training of LLM agents by repurposing failed trajectories. The system adapts Hindsight Experience Replay to natural language, identifying alte…
-
OpenAI launches Operator, an AI agent that browses the web to perform tasks
OpenAI has launched Operator, a new AI agent designed to perform web-based tasks by interacting with websites through its own browser. This agent, powered by a new model called Computer-Using Agent (CUA), can fill forms…