Tilde Research Introduces Aurora: A Leverage-Aware Optimizer That Fixes a Hidden Neuron Death Problem in Muon
Tilde Research has introduced Aurora, a novel optimizer designed to train neural networks more effectively. Aurora addresses a critical issue in the popular Muon optimizer where a significant number of neurons become permanently inactive during training. The new optimizer, demonstrated with a 1.1B parameter pretraining experiment, achieves state-of-the-art performance on the modded-nanoGPT speedrun benchmark and has its code released publicly. AI
IMPACT Fixes a critical flaw in a widely-used optimizer, potentially improving training efficiency and model performance for large-scale models.