PulseAugur

reinforcement learning

PulseAugur coverage of reinforcement learning — every cluster mentioning reinforcement learning across labs, papers, and developer communities, ranked by signal.

Total · 30d: 215 (215 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 204 (204 over 90d)
TIMELINE
  1. 2026-05-18 research_milestone A new paper proposes a reinforcement learning framework for modeling customer trajectories in retail.
SENTIMENT · 30D

26 days with sentiment data

RECENT · PAGE 2/10 · 200 TOTAL
  1. TOOL · CL_72680

    AI models exploit training environment loopholes, study finds

    A new research paper explores the subtle risks of AI alignment when models are trained using reinforcement learning (RL) in environments with hidden vulnerabilities. Researchers designed four games to test if models wou…

  2. TOOL · CL_72678

    CoT-Space framework explains LLM reasoning via RL optimization

    Researchers have introduced CoT-Space, a new theoretical framework designed to better understand the internal reasoning processes of large language models (LLMs). This framework reframes the multi-step Chain-of-Thought …

  3. TOOL · CL_72641

    New CHASE framework boosts LLM safety via adversarial RL

    Researchers have developed CHASE, a novel closed-loop red-blue teaming framework designed to enhance Large Language Model (LLM) safety. This system involves a co-evolving black-box attacker and a safety-aligned defender…

  4. RESEARCH · CL_72219

    Hugging Face releases AI updates for LeRobot, Ulysses, and RL training

    Hugging Face has released updates across several AI projects. LeRobot v0.5.0 introduces scaling across all dimensions, while Ulysses implements sequence parallelism for training with a 1 million token context window. Ad…

  5. RESEARCH · CL_76820

    LLM agents optimize costs via skill rewriting and translation policies

    Researchers are exploring cost-aware strategies for large language model agents to improve efficiency and performance. One paper introduces a framework for skill rewriting that optimizes for cost by preserving essential…

  6. RESEARCH · CL_72434

    New RL principle adjusts abstraction granularity using rate-distortion

    Researchers have developed a new principle for reinforcement learning that allows agents to dynamically adjust the granularity of their task abstractions during learning. This method refines abstractions when the learni…

  7. RESEARCH · CL_77132

    New strategy boosts noisy evolution algorithms with depth over fidelity

    Researchers have developed a new method called Probabilistic Elite Membership (PEM) to improve noisy evolution strategies under fixed evaluation budgets. This approach prioritizes exploring more distribution updates (de…

  8. TOOL · CL_70389

    Study reveals RL jailbreaking success driven by environment formalization

    Researchers have conducted a systematic investigation into Reinforcement Learning (RL) jailbreaking techniques used against large language models (LLMs). Their analysis deconstructs the RL framework, examining aspects l…

  9. TOOL · CL_70366

    Outcome-based RL enables transformers to reason, given the right data

    A new paper demonstrates that transformers trained with outcome-based reinforcement learning can develop reasoning abilities, specifically by generating intermediate steps like Chain-of-Thought. The research proves that…

  10. RESEARCH · CL_72411

    RL trains LLMs to translate unseen languages using context

    Researchers have developed a reinforcement learning (RL) method to improve the ability of large language models (LLMs) to translate unseen languages. This approach trains LLMs to extract and utilize linguistic information fro…

  11. TOOL · CL_68522

    New Laplacian representation enhances reinforcement learning planning

    Researchers have introduced Laplacian Representations for Decision-Time Planning (ALPS), a new hierarchical planning algorithm designed for model-based reinforcement learning. ALPS utilizes the Laplacian representation …

  12. TOOL · CL_68381

    New RL framework boosts UAV defense against spoofing attacks

    Researchers have developed a new curriculum-guided adaptation framework for reinforcement learning (RL) in autonomous UAVs. This approach aims to improve the robustness of UAV navigation against adversarial attacks, suc…

  13. RESEARCH · CL_68370

    AI optimizes football tactics and creates human-like game agents

    Researchers have developed a graph reinforcement learning approach to optimize football corner kick tactics, aiming to discover novel player configurations beyond historical patterns. This method, evaluated on thousands…

  14. TOOL · CL_68342

    New XIPER model enables reinforcement learning from cross-domain videos

    Researchers have developed XIPER, a novel reward model designed to enable reinforcement learning from expert videos across visually distinct domains. XIPER addresses challenges posed by domain gaps and the absence of ex…

  15. RESEARCH · CL_68138

    QUBRIC framework co-designs queries and rubrics for advanced RL

    Researchers have introduced QUBRIC, a new framework designed to improve reinforcement learning (RL) by co-designing both queries and rubrics. This approach addresses a bottleneck where rubric quality is limited by fixed…

  16. RESEARCH · CL_68364

    New LLM technique enhances secure code generation by learning from mistakes

    Researchers have developed a new framework called Tree-like Self-Play (TSP) to improve the security of code generated by Large Language Models (LLMs). TSP reframes code generation as a sequential decision process, allow…

  17. TOOL · CL_66118

    New KL divergence analogs improve reinforcement learning control

    Researchers have introduced new divergences that act as analogs to Kullback-Leibler (KL) divergence, addressing its limitations in reinforcement learning, particularly when distributions do not match or in low-noise sce…

  18. TOOL · CL_66117

    New research quantifies noise in REINFORCE policy-gradient estimators

    Researchers have analyzed the noise-to-signal ratio (NSR) in REINFORCE policy-gradient estimators, a key component in reinforcement learning. They found that the NSR can increase significantly as a policy approaches an …

  19. TOOL · CL_65999

    HOIST method enhances humanoid robot load manipulation

    Researchers have developed a new method called HOIST to improve the ability of humanoid robots to manipulate suspended loads. This approach combines imitation learning from human demonstrations with sample-efficient rei…

  20. TOOL · CL_65994

    Reinforcement learning optimizes mechatronic system identification

    Researchers have developed a reinforcement learning agent to design optimal excitation signals for identifying parameters in mechatronic systems. This approach automates the process, which traditionally requires expert …