Researchers have developed new methods for Fitted Q-Evaluation (FQE) and soft Fitted Q-Iteration (soft FQI) that do not require Bellman completeness, a condition often violated under function approximation. The proposed techniques, stationary-weighted FQE and stationary-reweighted soft FQI, address instability by reweighting each regression step to align with the target policy's stationary distribution. These approaches aim to improve stability and reduce value error in off-policy evaluation for reinforcement learning.
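To make the reweighting idea concrete, here is a minimal linear-function-approximation sketch of a stationary-weighted FQE step. This is an illustration of the general technique, not the authors' exact algorithm: it assumes precomputed per-sample weights approximating the ratio of the target policy's stationary distribution to the data distribution, features `phi` for each observed state-action pair, and expected next-step features `phi_next_pi` under the target policy (all names are hypothetical).

```python
import numpy as np

def stationary_weighted_fqe(phi, phi_next_pi, rewards, weights,
                            gamma=0.99, n_iters=100, ridge=1e-6):
    """Weighted-regression FQE with a linear Q-function Q(s,a) = phi(s,a) @ theta.

    phi:         (N, d) features of observed (s_i, a_i) pairs
    phi_next_pi: (N, d) expected features of (s'_i, a') with a' drawn from
                 the target policy pi
    rewards:     (N,) observed rewards
    weights:     (N,) approximate stationary-distribution density ratios
                 (assumed given; estimating them is a separate problem)
    """
    n, d = phi.shape
    theta = np.zeros(d)
    # Weighted least-squares normal matrix, shared across iterations.
    A = phi.T @ (weights[:, None] * phi) + ridge * np.eye(d)
    A_inv = np.linalg.inv(A)
    for _ in range(n_iters):
        # Bellman targets under the target policy.
        y = rewards + gamma * (phi_next_pi @ theta)
        # Regression step reweighted toward the stationary distribution.
        theta = A_inv @ (phi.T @ (weights * y))
    return theta
```

The reweighting matters only under function approximation: with uniform weights this reduces to ordinary FQE, whereas stationary-distribution weights shift the regression toward the states the target policy actually visits.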
Summary written by gemini-2.5-flash-lite from 2 sources.
IMPACT Enhances theoretical foundations for off-policy evaluation in reinforcement learning, potentially improving model training and decision-making in complex environments.
RANK_REASON Two arXiv papers introduce novel theoretical methods for reinforcement learning evaluation.