Good SFT Optimizes for SFT, Better SFT Prepares for Reinforcement Learning Paper • 2602.01058 • Published Feb 1 • 44
GRACE: Generative Representation Learning via Contrastive Policy Optimization Paper • 2510.04506 • Published Oct 6, 2025 • 12