PulseAugur
research

Researchers explore differential privacy for bandit problems and multi-agent learning

Two new research papers explore differential privacy in bandit problems. The first introduces an algorithm for the extensive-form bandit problem that achieves local differential privacy with a regret bound of \(\tilde{O}(\sqrt{A\ln(S)T}/\epsilon)\). The second proposes a fully distributed algorithm for max-min fair multi-agent bandits that preserves reward privacy while achieving polynomial dependence on the number of agents and near-logarithmic dependence on the horizon.
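The papers' algorithms are specialized, but the core local-DP ingredient they share can be illustrated generically: each reward is perturbed with Laplace noise *before* the learner sees it, so the server never observes a raw reward. The sketch below is illustrative only (the `private_ucb` wrapper and arm setup are assumptions, not the papers' methods), assuming rewards with sensitivity 1:

```python
import math
import random

def privatize_reward(reward, epsilon, sensitivity=1.0):
    # Laplace mechanism: noise with scale sensitivity/epsilon makes each
    # reported reward epsilon-locally differentially private.
    scale = sensitivity / epsilon
    # A Laplace(0, scale) draw is the difference of two iid exponentials.
    noise = random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)
    return reward + noise

def private_ucb(arms, epsilon, horizon):
    # UCB1 where the learner only ever sees privatized rewards.
    # `arms` is a list of zero-argument reward functions (hypothetical setup).
    k = len(arms)
    counts = [0] * k
    sums = [0.0] * k
    for t in range(1, horizon + 1):
        if t <= k:
            a = t - 1  # pull each arm once to initialize
        else:
            a = max(range(k), key=lambda i: sums[i] / counts[i]
                    + math.sqrt(2.0 * math.log(t) / counts[i]))
        sums[a] += privatize_reward(arms[a](), epsilon)
        counts[a] += 1
    return counts
```

Note the privacy/regret trade-off visible in the bound above: a smaller \(\epsilon\) means larger noise scale, which inflates regret by the \(1/\epsilon\) factor.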

Summary written by gemini-2.5-flash-lite from 2 sources.

IMPACT These papers advance the theoretical understanding of privacy-preserving learning in bandit and multi-agent settings.

RANK_REASON Two arXiv papers present novel algorithms for privacy-preserving bandit problems.


COVERAGE [2]

  1. arXiv cs.LG TIER_1 · Stephen Pasteris, Rahul Savani, Theodore Turocy

    Differential Privacy in the Extensive-Form Bandit Problem

    arXiv:2605.05266v1 Announce Type: cross Abstract: We consider the extensive-form bandit problem, where on each trial the learner (a user coordinated by a server) plays an extensive-form game against an oblivious adversary, observing the information sets it finds itself in as well…

  2. arXiv cs.LG TIER_1 · Amir Leshem

    Near-Optimal Privacy-Preserving Learning for Max-Min Fair Multi-Agent Bandits

    arXiv:2306.04498v3 Announce Type: replace Abstract: We study fair multi-agent multi-armed bandit learning under collision-only coordination. Agents cannot communicate explicitly during learning and observe only their own rewards and whether collisions occur when several agents ac…