Zyphra AI has released ZAYA1-8B, a Mixture-of-Experts (MoE) language model with 760 million active parameters and 8.4 billion total parameters. Trained on AMD hardware, the model performs competitively against larger models on math and coding benchmarks, using innovations such as Compressed Convolutional Attention and an MLP-based router. ZAYA1-8B is available under the Apache 2.0 license and as a serverless endpoint, enabling efficient deployment for on-device applications and lower-latency inference.
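The MLP-based router mentioned above replaces the single linear layer most MoE models use to pick experts. As a rough illustration only (toy sizes and a plain two-layer MLP; ZAYA1's actual router design and dimensions are not specified here), the idea can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN, N_EXPERTS, TOP_K = 16, 8, 2  # toy sizes, not ZAYA1's real config

# Two-layer MLP router (hypothetical shapes; many MoE models use a single
# linear layer here instead)
W1 = rng.normal(0, 0.02, (HIDDEN, 32))
W2 = rng.normal(0, 0.02, (32, N_EXPERTS))

def route(h):
    """Score all experts for one token's hidden state and pick the top-k."""
    scores = np.maximum(h @ W1, 0) @ W2       # ReLU MLP instead of one linear map
    top = np.argsort(scores)[-TOP_K:][::-1]   # indices of the k highest-scoring experts
    w = np.exp(scores[top] - scores[top].max())
    return top, w / w.sum()                   # normalized mixing weights over chosen experts

token = rng.normal(size=HIDDEN)
experts, gate = route(token)
```

Only the selected experts' feed-forward blocks run for that token, which is how an 8.4B-parameter model keeps just 760M parameters active per forward pass.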
IMPACT Offers a more efficient alternative for reasoning tasks, potentially lowering inference costs and enabling on-device LLM applications.
RANK_REASON Release of a new open-weight language model with novel architecture and training infrastructure.