CCTU: A Benchmark for Tool Use under Complex Constraints Paper • 2603.15309 • Published 1 day ago • 1
DFPO: Scaling Value Modeling via Distributional Flow towards Robust and Generalizable LLM Post-Training Paper • 2602.05890 • Published Feb 5 • 1
What Makes a Good Speech Tokenizer for LLM-Centric Speech Generation? A Systematic Study Paper • 2506.12537 • Published Jun 14, 2025 • 1
Beyond Scaling: Measuring and Predicting the Upper Bound of Knowledge Retention in Language Model Pre-Training Paper • 2502.04066 • Published Feb 6, 2025
TransferTOD: A Generalizable Chinese Multi-Domain Task-Oriented Dialogue System with Transfer Capabilities Paper • 2407.21693 • Published Jul 31, 2024
Muse: Towards Reproducible Long-Form Song Generation with Fine-Grained Style Control Paper • 2601.03973 • Published Jan 7 • 2
The Role of Entropy in Visual Grounding: Analysis and Optimization Paper • 2512.06726 • Published Dec 7, 2025 • 1
Critique-RL: Training Language Models for Critiquing through Two-Stage Reinforcement Learning Paper • 2510.24320 • Published Oct 28, 2025 • 21
Analyzing the Effects of Supervised Fine-Tuning on Model Knowledge from Token and Parameter Levels Paper • 2509.16596 • Published Sep 20, 2025 • 14
AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning Paper • 2509.08755 • Published Sep 10, 2025 • 57