Researchers have found that supervised fine-tuning (SFT) on outputs from a different AI model can significantly degrade the capabilities of the trained model. The degradation appears to stem from the model adopting an unfamiliar reasoning style that it struggles to use effectively. Imitating a less capable teacher is not necessarily the cause, since degradation occurs even when the teacher is the more capable model. The performance drop seems to be a shallow property: a small amount of training to restore the original reasoning style recovers most of the lost performance.
Summary written by gemini-2.5-flash-lite from 1 source.
IMPACT Understanding how off-model SFT impacts AI capabilities is crucial for developing safer and more aligned AI systems.
RANK_REASON The cluster describes research findings on the effects of a specific AI training technique.