PulseAugur

New FQE and FQI methods bypass Bellman completeness for stability

Researchers have developed new methods for Fitted Q-Evaluation (FQE) and soft Fitted Q-Iteration (soft FQI) that do not require Bellman completeness, a strong closure condition that often fails under function approximation. The proposed techniques, stationary-weighted FQE and stationary-reweighted soft FQI, address instability by reweighting each regression step to align with the target policy's stationary distribution. These approaches aim to improve stability and reduce value error in off-policy evaluation for reinforcement learning.

Summary written by gemini-2.5-flash-lite from 2 sources.
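The reweighted regression step the summary describes can be sketched as follows. This is a minimal illustration assuming linear function approximation and precomputed stationary density-ratio weights; the function name and interface are hypothetical, and the papers' actual weight estimators are more involved than shown here.

```python
import numpy as np

def stationary_weighted_fqe(phi, r, pi_next_phi, weights,
                            gamma=0.99, n_iters=100):
    """Sketch of stationary-weighted FQE with linear features (illustrative).

    phi:          (n, d) features of observed state-action pairs
    r:            (n,) observed rewards
    pi_next_phi:  (n, d) features of the next state paired with the
                  target policy's action, i.e. (s', pi(s'))
    weights:      (n,) assumed estimates of d^pi / d^mu, the ratio of the
                  target policy's stationary distribution to the behavior
                  (data-collection) distribution
    """
    n, d = phi.shape
    theta = np.zeros(d)
    for _ in range(n_iters):
        # Bellman target for policy evaluation: y_i = r_i + gamma * Q(s'_i, pi(s'_i))
        y = r + gamma * (pi_next_phi @ theta)
        # Reweighted regression step: weighted least squares, so the fit is
        # accurate under the target policy's stationary distribution rather
        # than the behavior distribution (small ridge term for stability)
        wphi = phi * weights[:, None]
        A = wphi.T @ phi + 1e-6 * np.eye(d)
        b = wphi.T @ y
        theta = np.linalg.solve(A, b)
    return theta
```

With one-hot features and gamma = 0 the fixed point reduces to a per-feature weighted average of rewards, which makes the effect of the weights easy to check by hand.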

IMPACT Enhances theoretical foundations for off-policy evaluation in reinforcement learning, potentially improving model training and decision-making in complex environments.

RANK_REASON Two arXiv papers introduce novel theoretical methods for reinforcement learning evaluation.

Read on arXiv stat.ML →

COVERAGE [2]

  1. arXiv stat.ML TIER_1 · Lars van der Laan, Nathan Kallus ·

    Fitted $Q$ Evaluation Without Bellman Completeness via Stationary Weighting

    arXiv:2512.23805v3 (replacement) · Abstract: Fitted $Q$-evaluation (FQE) is a standard regression-based tool for off-policy evaluation, but existing stability guarantees often rely on Bellman completeness, a strong closure condition that can fail under function approximati…

  2. arXiv stat.ML TIER_1 · Lars van der Laan, Nathan Kallus ·

    Stationary Reweighting Yields Local Convergence of Soft Fitted Q-Iteration

    arXiv:2512.23927v2 (replacement) · Abstract: Fitted $Q$-iteration (FQI) and soft FQI are widely used value-based methods for offline reinforcement learning, but their standard stability guarantees often depend on Bellman completeness, a strong closure condition that can fa…