Researchers have developed GLiNER2-PII, a compact 0.3-billion-parameter model for multilingual extraction of personally identifiable information (PII). Adapted from GLiNER2, the model identifies 42 types of PII at the character-span level. To work around data scarcity and privacy concerns, the researchers built a synthetic multilingual corpus with a constraint-driven generation pipeline. On the SPY benchmark, GLiNER2-PII outperformed the other systems it was compared with, including OpenAI's Privacy Filter, and it has been released on Hugging Face.
Summary written by gemini-2.5-flash-lite from 1 source.
IMPACT This new model offers improved multilingual PII detection, potentially enhancing data privacy and security in various applications.
RANK_REASON The cluster describes a new research paper detailing a novel model for PII extraction, including its methodology, performance, and public release.