Sparse Mixture of Experts

PulseAugur coverage of Sparse Mixture of Experts: every cluster that mentions it across labs, papers, and developer communities, ranked by signal.

Total · 30d

4 over 90d

Releases · 30d

0 over 90d

Papers · 30d

4 over 90d

TIER MIX · 90D

frontier release 1
research 2
tool 1

SENTIMENT · 30D

4 days with sentiment data

RECENT · PAGE 1/1 · 4 TOTAL

TOOL · CL_62933 · Jun 1 · 04:00

New SSMoE framework uses eigenvectors to fix SMoE model collapse

Researchers have introduced Singular Value Decomposition SMoE (SSMoE), a new framework designed to tackle the expert collapse issue in Sparse Mixture of Experts (SMoE) models. Unlike previous methods that require extens…

FRONTIER RELEASE · CL_62639 · May 30 · 00:00

JetBrains releases efficient Mellum2 MoE model; research advances MoE techniques

JetBrains has released Mellum2, an open-source 12-billion-parameter Mixture-of-Experts (MoE) model optimized for efficient inference in text and code tasks. This model activates only a fraction of its parameters per tok…

RESEARCH · CL_48816 · May 25 · 04:00

LLMs explore preference alignment and failure mitigation techniques

Researchers are exploring new methods for aligning large language models (LLMs) with human preferences and mitigating specific failure modes. One approach uses Direct Preference Optimization (DPO) to reduce text degener…

RESEARCH · CL_28307 · May 11 · 17:58

New research optimizes Sparse Mixture-of-Experts for efficient LLM scaling

Researchers are exploring new methods to optimize Sparse Mixture-of-Experts (SMoE) models, which are crucial for scaling large language models efficiently. One paper reveals a geometric coupling between routers and expe…
