grokking
PulseAugur coverage of grokking — every cluster mentioning grokking across labs, papers, and developer communities, ranked by signal.
-
Singular Learning Theory offers new perspective on AI model grokking
Researchers have explored the phenomenon of "grokking," where machine learning models abruptly shift from memorization to generalization after extended training. Using Singular Learning Theory (SLT), they propose that g…
-
Topology research reveals neural network grokking signatures and architectural bypasses
Researchers are exploring the phenomenon of "grokking" in neural networks, where models first memorize their training data and only later generalize. One study proposes modifying architectural topology, such as enforcing spherical const…
-
Convergence Rate Analysis of the AdamW-Style Shampoo: Unifying One-sided and Two-Sided Preconditioning
A newly proposed theory, the Norm-Separation Delay Law, offers an explanation for grokking, where models generalize long after memorizing their training data. Researchers demonstrated that grokking is a representational phase transitio…