Two new research papers explore methods to mitigate catastrophic forgetting in language models during fine-tuning. One paper introduces Sparse Memory Finetuning (SMF), which adds memory layers and updates only the most heavily accessed memory rows, showing improved performance on a medical exam task with minimal loss of general capabilities. The other paper investigates Sharpness-Aware Minimization (SAM) and other pretraining optimization techniques, demonstrating that biasing training toward flatter minima can significantly reduce forgetting across various model sizes and post-training scenarios.
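To illustrate the SAM mechanism mentioned above, here is a minimal sketch on a toy quadratic loss: the optimizer first perturbs the weights in the direction of steepest ascent (scaled by a radius `rho`), then descends using the gradient at that perturbed point, which biases updates toward flatter minima. The loss function, `rho`, and learning rate here are illustrative assumptions, not details from either paper.

```python
import numpy as np

def loss_grad(w):
    # Toy loss L(w) = 0.5 * ||w||^2, whose gradient is simply w.
    # (Illustrative stand-in for a real training loss.)
    return w

def sam_step(w, lr=0.1, rho=0.05):
    g = loss_grad(w)
    # Ascent step: move toward the locally "sharpest" nearby point.
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # Descend using the gradient evaluated at the perturbed weights.
    g_sharp = loss_grad(w + eps)
    return w - lr * g_sharp

w = np.array([1.0, -2.0])
for _ in range(100):
    w = sam_step(w)
# w is driven toward the minimum at the origin
```

In a real model the same two-step structure applies per minibatch, with the perturbation computed from the batch gradient before the actual parameter update.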
Summary written by gemini-2.5-flash-lite from 4 sources.
IMPACT These techniques could lead to more robust and adaptable language models that retain general knowledge while learning new tasks.
RANK_REASON Two arXiv papers present novel methods for mitigating catastrophic forgetting in language models.