reinforcement learning

PulseAugur coverage of reinforcement learning — every cluster mentioning reinforcement learning across labs, papers, and developer communities, ranked by signal.

Total · 30d

116

116 over 90d

Releases · 30d

0 over 90d

Papers · 30d

112

112 over 90d

TIER MIX · 90D

significant 2
research 41
tool 71
commentary 1
meme 1

SENTIMENT · 30D

6 days with sentiment data

RECENT · PAGE 1/5 · 91 TOTAL

TOOL · CL_29601 · May 13 · 04:50

CognitiveBotics builds personalized AI content engine for autistic children

CognitiveBotics has developed a personalized content engine for children with autism, addressing the challenge of high individual variability in learning preferences. Their Modalities Engine renders learning objectives …

TOOL · CL_29441 · May 12 · 17:23

AI finds new record graphs with many geometric realizations

Researchers have developed a reinforcement-learning method to construct minimally rigid graphs with a high number of realizations. This approach uses Henneberg moves and optimizes realization-count invariants with a pol…

TOOL · CL_29442 · May 12 · 17:12

New flow map policies accelerate generative AI for robotics

Researchers have developed a new class of generative policies called flow map policies, designed to accelerate action generation in complex control problems. These policies learn to make large jumps within generative dy…

TOOL · CL_29374 · May 12 · 16:34

QAP-Router uses RL to optimize quantum qubit routing

Researchers have developed QAP-Router, a novel reinforcement learning approach for quantum compilation that frames qubit routing as a dynamic Quadratic Assignment Problem. This method models quantum gate interactions an…

TOOL · CL_29381 · May 12 · 16:16

RAW-Dream enables zero-shot VLA adaptation via task-agnostic world models

Researchers have introduced RAW-Dream, a novel approach to adapt Vision-Language-Action (VLA) models for new tasks using reinforcement learning within task-agnostic world models. This method disentangles world model lea…

TOOL · CL_28659 · May 12 · 15:10

Reinforcement learning rewards: Designing agent behavior and avoiding loopholes

This article delves into the critical role of reward functions in reinforcement learning, explaining how their design directly influences an agent's behavior. It highlights that improperly defined reward functions can l…

TOOL · CL_28331 · May 11 · 17:49

Reinforcement learning agent synthesizes Clifford quantum circuits efficiently

Researchers have developed a novel reinforcement learning approach for synthesizing Clifford quantum circuits. Their method utilizes a size-agnostic, equivariant neural network that learns to discover optimal sequences …

TOOL · CL_28282 · May 11 · 16:30

AI tools enhance campus well-being via chatbots and mental health detection

Researchers have developed AI tools to improve campus well-being by enhancing feedback collection and mental health detection. TigerGPT, a chatbot, uses LLMs for personalized surveys, achieving high usability and satisf…

RESEARCH · CL_26359 · May 11 · 10:12

GPT-5 Mini leads Agentick benchmark, but no agent paradigm dominates

The new Agentick benchmark, which assesses various AI agents across 37 tasks, shows GPT-5 Mini achieving the top score of 0.309. However, no single agent paradigm, including reinforcement learning, LLM, VLM, or hybrid a…

RESEARCH · CL_27508 · May 11 · 08:28

MTA-RL framework enhances urban driving with multi-modal AI

Researchers have developed MTA-RL, a novel framework that integrates multi-modal transformer-based 3D affordances with reinforcement learning for robust urban autonomous driving. This approach fuses RGB images and LiDAR…

TOOL · CL_27531 · May 11 · 06:14

New RL algorithm adaptively chunks actions for better learning

Researchers have introduced Adaptive Action Chunking (ACH), a new algorithm for reinforcement learning that dynamically adjusts the length of action sequences. Unlike previous methods that used fixed chunk lengths, ACH …

RESEARCH · CL_25979 · May 11 · 04:00

New FQE and FQI methods bypass Bellman completeness for stability

Researchers have developed new methods for Fitted Q-Evaluation (FQE) and soft Fitted Q-Iteration (soft FQI) that do not require Bellman completeness, a condition often unmet with function approximation. The proposed tec…

TOOL · CL_25358 · May 10 · 19:59

Robotics hobbyist demonstrates AI balance bot using reinforcement learning

A robotics enthusiast has developed an AI-powered balance bot, demonstrating the potential of reinforcement learning in control systems. The initial iteration required significant adjustments, highlighting the challenge…

TOOL · CL_25531 · May 8 · 17:07

Frontier LRMs match human game learning and brain activity

A new research paper explores how frontier Large Reasoning Models (LRMs) compare to human learning in complex game environments. The study used gameplay data and fMRI recordings to evaluate LRMs against various AI agent…

TOOL · CL_25553 · May 8 · 15:04

New DTSemNet method trains oblique decision trees without approximations

Researchers have developed DTSemNet, a new method for training oblique decision trees without approximations. This approach uses a semantically equivalent and invertible neural network representation, allowing for end-t…

TOOL · CL_25622 · May 8 · 12:05

New LC-MAPF model enhances multi-agent pathfinding with local communication

Researchers have developed a new machine learning model called LC-MAPF designed to improve coordination in large-scale multi-agent pathfinding scenarios. This model incorporates a learnable communication module that all…

TOOL · CL_25661 · May 8 · 06:34

New method slashes RL weight sync communication by 100x

Researchers have developed SparseRL-Sync, a novel method for synchronizing policy weights in large-scale reinforcement learning systems. This technique leverages the inherent sparsity of parameter changes during trainin…

RESEARCH · CL_21952 · May 8 · 04:00

New methods enhance on-policy distillation for LLMs

Researchers have developed new methods to improve the efficiency and stability of on-policy distillation (OPD) for large language models. One approach, vOPD, uses a control variate baseline derived from the reverse KL d…

TOOL · CL_22473 · May 8 · 04:00

New Long-Horizon Q-Learning method improves reinforcement learning accuracy

Researchers have introduced Long-Horizon Q-Learning (LQL), a novel method designed to improve the stability of value-based reinforcement learning. LQL addresses the issue of compounding estimation errors in traditional …

TOOL · CL_22097 · May 8 · 04:00

PlatoLTL enables RL agents to generalize across unseen symbols in LTL instructions

Researchers have introduced PlatoLTL, a new method designed to improve generalization in multi-task reinforcement learning. This approach enables RL agents to perform tasks not encountered during training, specifically …
