PulseAugur

New GFT framework unifies SFT and RL for more stable LLM training

Researchers have introduced Group Fine-Tuning (GFT), a framework designed to unify supervised fine-tuning (SFT) and reinforcement learning (RL) for large language models. GFT addresses limitations of traditional SFT, such as single-path dependency and unstable weighting, by employing Group Advantage Learning and Dynamic Coefficient Rectification. Experiments indicate that GFT outperforms standard SFT methods and facilitates smoother integration with subsequent RL training.
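The source does not spell out GFT's objective, but the two named components suggest a GRPO-style recipe: normalize each sampled response's reward against its own group to get an unbiased advantage, then bound the resulting per-sample coefficients before using them to weight the ordinary SFT loss. The sketch below is a minimal illustration under those assumptions; group_advantages, rectified_coefficients, and gft_style_loss are hypothetical names for this sketch, not the authors' API.

```python
import numpy as np

def group_advantages(rewards):
    """Group-relative advantage (assumed GRPO-style): normalize each
    sample's reward against the mean and std of its own group."""
    rewards = np.asarray(rewards, dtype=np.float64)
    return (rewards - rewards.mean()) / (rewards.std() + 1e-8)

def rectified_coefficients(advantages, clip=2.0):
    """Stand-in for 'Dynamic Coefficient Rectification': bound the
    per-sample weights so no single trajectory dominates the update.
    The paper's actual rectification rule is not given in the source."""
    return np.clip(advantages, -clip, clip)

def gft_style_loss(nll_per_sample, rewards):
    """Weight per-sequence negative log-likelihoods (the SFT signal)
    by rectified group advantages (the RL signal)."""
    coeffs = rectified_coefficients(group_advantages(rewards))
    return float(np.mean(coeffs * np.asarray(nll_per_sample)))

# Toy usage: 4 sampled responses to one prompt, with scalar rewards.
nll = [2.1, 1.7, 2.5, 1.9]      # token-averaged NLL per response
rewards = [0.2, 0.9, 0.1, 0.7]  # e.g., verifier or preference scores
print(gft_style_loss(nll, rewards))
```

Weighting by group-relative advantages is what lets a single loss move between SFT-like imitation (uniform positive weights) and RL-like reward shaping (signed weights), which is consistent with the "imitation to reward fine-tuning" framing in the paper's title.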

Summary written by gemini-2.5-flash-lite from 1 source.

IMPACT Introduces a unified training framework that may improve LLM generalization and RL integration.

RANK_REASON This is a research paper detailing a new training framework for large language models.

Read on arXiv cs.LG →

COVERAGE [1]

  1. arXiv cs.LG TIER_1 · Wangjie Gan, Miao Pan, Linbo Xi, Kaixiang Yao, Wenqi Zhang, Jintao Chen, Jianwei Yin, Xuhong Zhang

    GFT: From Imitation to Reward Fine-Tuning with Unbiased Group Advantages and Dynamic Coefficient Rectification

    arXiv:2604.14258v2 Announce Type: replace-cross Abstract: Large language models are typically post-trained using supervised fine-tuning (SFT) and reinforcement learning (RL), yet effectively unifying efficient knowledge injection with robust generalization remains challenging. In…