PulseAugur

New N-Gram attack probes black-box LLMs for training data leakage

Researchers have developed a new membership inference attack, the N-Gram Coverage Attack, that targets black-box language models such as GPT-4 using only their generated text. The method exploits the observation that models tend to memorize and regenerate text patterns from their training data. The attack performs strongly, rivaling white-box methods, and its effectiveness increases as more sequences are generated. Notably, the study found that newer models such as GPT-4o show improved resistance to the attack, suggesting stronger privacy protections.
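A minimal sketch of the underlying idea, assuming a simple n-gram overlap score (the function names, tokenization, and thresholding here are illustrative assumptions, not the paper's exact procedure): collect model generations and measure what fraction of a candidate document's n-grams reappear in them.

```python
# Hypothetical n-gram coverage score for black-box membership inference.
# Assumption: generations are sampled from the target model given a prefix
# of the candidate text; high overlap is treated as evidence of membership.
from typing import List, Set, Tuple


def ngrams(tokens: List[str], n: int) -> Set[Tuple[str, ...]]:
    """Return the set of n-grams in a token sequence."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def ngram_coverage(candidate: str, generations: List[str], n: int = 5) -> float:
    """Fraction of the candidate's n-grams that appear in any model generation.

    A higher score suggests the model regenerates memorized spans of the
    candidate text, which the attack reads as a membership signal.
    """
    cand_ngrams = ngrams(candidate.split(), n)
    if not cand_ngrams:
        return 0.0
    gen_ngrams: Set[Tuple[str, ...]] = set()
    for g in generations:
        gen_ngrams |= ngrams(g.split(), n)
    return len(cand_ngrams & gen_ngrams) / len(cand_ngrams)


# Usage sketch: membership is decided by thresholding the score; the summary
# notes effectiveness improves as more generated sequences are collected.
# score = ngram_coverage(suspect_text, [model.generate(prefix) for _ in range(k)])
# is_member = score > threshold  # threshold calibrated on known non-member texts
```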

Summary written by gemini-2.5-flash-lite from 1 source.

IMPACT New black-box attack method could challenge privacy protections in API-only models, though newer models show improved robustness.

RANK_REASON The cluster contains an academic paper detailing a new method for membership inference attacks on language models.

Read on arXiv cs.CL →

COVERAGE [1]

  1. arXiv cs.CL TIER_1 · Skyler Hallinan, Jaehun Jung, Melanie Sclar, Ximing Lu, Abhilasha Ravichander, Sahana Ramnath, Yejin Choi, Sai Praneeth Karimireddy, Niloofar Mireshghallah, Xiang Ren

    The Surprising Effectiveness of Membership Inference with Simple N-Gram Coverage

    arXiv:2508.09603v2 · Abstract: Membership inference attacks serve as a useful tool for the fair use of language models, such as detecting potential copyright infringement and auditing data leakage. However, many current state-of-the-art attacks require access to m…