Markov decision process
PulseAugur coverage of Markov decision process — every cluster mentioning Markov decision process across labs, papers, and developer communities, ranked by signal.
1 day with sentiment data
-
New Q-value iteration analysis uses switching geometry
This paper introduces a new framework for analyzing Q-value iteration in Markov decision processes, focusing on a technique called rank-one deflation. The authors interpret the algorithm's behavior through the geometry …
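The paper's rank-one deflation analysis is not detailed in this summary, but plain Q-value iteration, the algorithm it studies, can be sketched on a toy two-state MDP (the MDP below is illustrative, not taken from the paper):

```python
import numpy as np

# Toy MDP: 2 states, 2 actions. P[a][s][s'] = transition probability,
# R[s][a] = expected reward. Values here are arbitrary for illustration.
P = np.array([
    [[0.9, 0.1], [0.2, 0.8]],   # transitions under action 0
    [[0.5, 0.5], [0.1, 0.9]],   # transitions under action 1
])
R = np.array([[1.0, 0.0],
              [0.5, 2.0]])
gamma = 0.9

def q_value_iteration(P, R, gamma, tol=1e-8):
    """Iterate the Bellman optimality operator on Q until convergence."""
    n_actions, n_states, _ = P.shape
    Q = np.zeros((n_states, n_actions))
    while True:
        V = Q.max(axis=1)                              # greedy state values
        Q_new = R + gamma * np.einsum('asn,n->sa', P, V)
        if np.abs(Q_new - Q).max() < tol:
            return Q_new
        Q = Q_new

Q_star = q_value_iteration(P, R, gamma)
policy = Q_star.argmax(axis=1)   # greedy policy w.r.t. the fixed point
```

Because the Bellman operator is a gamma-contraction, the loop converges geometrically to the unique fixed point Q*; the paper's geometric interpretation operates on the spectrum of this iteration.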
-
New protocol optimizes drug trial subsidies to boost social utility
Researchers have developed a new statistical protocol for sequential experimentation that aims to optimize social utility in high-stakes domains like drug development. This protocol involves a product developer conducti…
-
Q-MMR framework offers novel approach to off-policy evaluation
Researchers have introduced Q-MMR, a new theoretical framework for off-policy evaluation in Markov Decision Processes (MDPs). This method learns weights for data points to approximate expected returns under a target pol…
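Q-MMR's specific weight-learning scheme is not given in this summary, but the general idea of reweighting behavior-policy data to estimate a target policy's return can be sketched with classical importance sampling on a toy one-step problem (a simplification, not the Q-MMR method itself):

```python
import numpy as np

rng = np.random.default_rng(0)

# One-step toy problem: behavior policy b logs the data, target policy pi
# is the one we want to evaluate. Probabilities and rewards are made up.
b = np.array([0.5, 0.5])             # behavior action probabilities
pi = np.array([0.9, 0.1])            # target action probabilities
true_reward = np.array([1.0, 0.0])   # expected reward of each action

# Log data under b, then weight each point by the ratio pi(a) / b(a).
actions = rng.choice(2, size=100_000, p=b)
rewards = true_reward[actions] + rng.normal(0.0, 0.1, size=actions.size)
weights = pi[actions] / b[actions]

ois_estimate = np.mean(weights * rewards)                  # ordinary IS
wis_estimate = np.sum(weights * rewards) / weights.sum()   # weighted IS
```

Both estimates approximate the target value 0.9 without ever running pi; methods like Q-MMR replace the fixed likelihood ratios with learned per-point weights to reduce variance over multi-step MDP trajectories.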
-
Reinforcement learning enhances autonomous target tracking accuracy and robustness
Researchers have developed a deep reinforcement learning approach for autonomous bearings-only tracking of moving targets. The system formulates the observer maneuver problem as a belief Markov decision process, using a…
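The belief-MDP formulation means the agent acts on a posterior over target states rather than the state itself. A minimal discrete Bayes-filter belief update, the core primitive behind such formulations (the three-cell grid is a toy stand-in, not the paper's tracker), looks like this:

```python
import numpy as np

def belief_update(belief, likelihood, transition):
    """One step of a discrete Bayes filter: predict with the transition
    model, then correct with the observation likelihood, then normalize."""
    predicted = transition.T @ belief      # predict: sum_s P(s'|s) b(s)
    posterior = likelihood * predicted     # correct with p(obs | s')
    return posterior / posterior.sum()

# 3 possible target cells; the target tends to stay where it is.
T = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])
b0 = np.array([1/3, 1/3, 1/3])           # uninformed prior
lik = np.array([0.1, 0.2, 0.7])          # bearing-like observation favoring cell 2
b1 = belief_update(b0, lik, T)
```

In the bearings-only setting the observation likelihood is informative only about direction, which is why the observer's own maneuvers (the actions of the belief MDP) matter for making the belief collapse.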
-
Reinforcement learning uses symmetry and data augmentation for faster aircraft control
Researchers have developed a new method for offline reinforcement learning that leverages the symmetry of dynamical systems to improve sample efficiency. This approach uses symmetric data augmentation to enhance the sta…
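The summary does not specify which symmetry the authors exploit, but the general mechanism of symmetric data augmentation is simple: if the dynamics are invariant under a mirror map, every logged transition yields a second valid one for free. A hedged toy sketch (the sign-flip mirror map below is an assumption for illustration):

```python
import numpy as np

def augment_with_symmetry(transitions, sign):
    """Double an offline RL dataset using a reflection symmetry of the
    dynamics: if (s, a, r, s') is feasible, so is (Ms, -a, r, Ms')
    for the mirror map M (here, sign-flipping selected state coordinates)."""
    s, a, r, s2 = transitions
    mirrored = (s * sign, a * -1.0, r, s2 * sign)
    return tuple(np.concatenate([x, y]) for x, y in zip(transitions, mirrored))

# Toy lateral dynamics assumed symmetric under flipping both coordinates.
s  = np.array([[0.5, 1.0], [0.2, -0.3]])
a  = np.array([[0.1], [-0.4]])
r  = np.array([0.0, 1.0])
s2 = np.array([[0.6, 1.1], [0.1, -0.2]])
sign = np.array([-1.0, -1.0])   # hypothetical mirror map for this toy system

aug = augment_with_symmetry((s, a, r, s2), sign)   # dataset size doubles
```

Rewards are left unchanged because a true dynamical symmetry preserves them; for aircraft control, lateral (left/right) symmetry is the typical candidate for such a map.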
-
New metric-normalized posterior leakage (mPL) enhances privacy for joint AI consumption
Researchers have developed a new privacy metric called Metric-Normalized Posterior Leakage (mPL) to address limitations in existing differential privacy methods, particularly for machine learning systems used under join…
-
New research advances adversarial imitation learning theory and practice
Two new papers explore the theoretical underpinnings of adversarial imitation learning (AIL), a technique that uses neural networks to learn from expert demonstrations. The first paper introduces OPT-AIL, a framework de…
-
RAST-MoE-RL framework enhances ride-hailing efficiency with specialized AI experts
Researchers have developed a new framework called RAST-MoE-RL to improve efficiency in ride-hailing services. This framework utilizes a Mixture-of-Experts (MoE) approach within deep reinforcement learning to better hand…
-
Researchers find random data deletion improves adaptive RL policies
Researchers have discovered that randomly deleting a portion of training data can significantly improve the performance of adaptive reinforcement learning policies. This counterintuitive technique helps by implicitly do…
-
DRL framework optimizes NR-U/Wi-Fi coexistence for fairness and throughput
Researchers have developed a policy-driven deep reinforcement learning framework to manage resource allocation between NR-U and Wi-Fi networks operating in unlicensed spectrum. This framework uses a deep Q-network to le…
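The framework's deep Q-network learns the same temporal-difference target as tabular Q-learning, just with a neural function approximator. The underlying update can be sketched on a toy transmit/defer problem (illustrative only, not the paper's NR-U/Wi-Fi environment):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy coexistence-style problem: state = channel busy (0) / idle (1),
# action = defer (0) / transmit (1).
n_states, n_actions = 2, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.95, 0.1

def step(s, a, rng):
    """Transmitting on an idle channel succeeds; on a busy one it collides."""
    r = 1.0 if (a == 1 and s == 1) else (-1.0 if a == 1 else 0.0)
    s2 = int(rng.integers(2))        # channel occupancy flips at random
    return r, s2

s = 0
for _ in range(20_000):
    # epsilon-greedy action selection
    a = int(rng.integers(n_actions)) if rng.random() < eps else int(Q[s].argmax())
    r, s2 = step(s, a, rng)
    # standard Q-learning temporal-difference update
    Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
    s = s2
```

After training, the greedy policy transmits only when the channel is idle; a DQN replaces the Q table with a network so the same update scales to the high-dimensional spectrum-state observations the paper deals with.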
-
AutoREC platform uses RL agents to generate circuit models from EIS data
Researchers have developed AutoREC, an open-source Python package designed to automate the generation of equivalent circuit models (ECMs) from electrochemical impedance spectroscopy (EIS) data. This platform utilizes re…
-
Yann LeCun clarifies technical definition of 'world models' in AI
Yann LeCun shared a technical discussion regarding the term "world models" in AI. He clarified that in control theory and the context of Markov Decision Processes (MDPs), "world models" specifically refers to transition…
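In the MDP sense LeCun refers to, a "world model" is just the transition kernel P(s' | s, a) (typically together with a reward model). A minimal tabular version makes the definition concrete (the numbers are arbitrary):

```python
import numpy as np

# Tabular world model: P[s][a][s'] = probability of landing in s'
# after taking action a in state s.
P = np.array([
    [[0.7, 0.3], [0.4, 0.6]],   # transitions out of state 0
    [[0.1, 0.9], [0.5, 0.5]],   # transitions out of state 1
])

def predict_next_state_dist(P, s, a):
    """The world model answers: what distribution over next states
    does taking action a in state s induce?"""
    return P[s, a]

dist = predict_next_state_dist(P, 0, 1)
```

Everything a model-based planner does, from rollouts to value backups, reduces to queries against this object; the debate is over what counts as a "world model" when it is a learned latent-space predictor rather than an explicit kernel.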
-
AsyncShield: A Plug-and-Play Edge Adapter for Asynchronous Cloud-based VLA Navigation
Researchers have developed AsyncShield, a new framework designed to improve the navigation capabilities of Vision-Language-Action (VLA) models on mobile robots. This system addresses the latency and network jitter issue…
-
New algorithm identifies near-optimal policies in robust constrained Markov decision processes
Researchers have developed a novel algorithm to identify near-optimal policies in robust constrained Markov decision processes (RCMDPs). This new method addresses limitations in existing policy gradient approaches that …
-
Researchers develop MDP and POMDP for error mitigation in digital twins
Researchers have developed a new framework for mitigating error propagation in modular digital twins by treating it as a sequential decision-making problem. They formulated this using a Markov Decision Process (MDP) and…