FineWeb-Edu

PulseAugur coverage of FineWeb-Edu — every cluster mentioning FineWeb-Edu across labs, papers, and developer communities, ranked by signal.

Total · 30d: 5 (5 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 5 (5 over 90d)

Tier mix · 90d: [chart not reproduced]

Sentiment · 30d: 2 days with sentiment data [chart not reproduced]

RECENT · PAGE 1/1 · 4 TOTAL

RESEARCH · CL_28256 · May 11 · 16:26

Muown optimizer improves LLM training by controlling row-norm drift

Researchers have developed Muown, a novel optimization method designed to improve the training of large language models. Muown addresses issues with the Muon optimizer, specifically the upward drift of spectral norms in…

TOOL · CL_25579 · May 8 · 14:47

OrScale optimization method improves neural network training

Researchers have introduced OrScale, a novel optimization technique designed to enhance neural network training. OrScale builds upon the Muon method by incorporating layer-wise trust-ratio scaling, which measures the Fr…

TOOL · CL_15985 · May 5 · 04:00

Researchers explore growing Transformers with modular composition and layer-wise expansion

Researchers have explored a method for training Transformer models by incrementally adding new layers to a frozen base, maintaining a constant budget for trainable parameters. This approach, termed 'Growing Transformers…

RESEARCH · CL_14902 · May 4 · 19:11

OpenMythos project reconstructs Anthropic's secretive Claude Mythos AI model

A new open-source project called OpenMythos has been released, aiming to theoretically reconstruct the architecture of Anthropic's Claude Mythos model. This project implements a Recurrent-Depth Transformer (RDT) with a …