Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation Paper • 2603.19220 • Published 9 days ago • 61
InCoder-32B: Code Foundation Model for Industrial Scenarios Paper • 2603.16790 • Published 11 days ago • 302
Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs Paper • 2603.09906 • Published 18 days ago • 73
Supervised Fine-Tuning versus Reinforcement Learning: A Study of Post-Training Methods for Large Language Models Paper • 2603.13985 • Published 14 days ago • 10
FinToolBench: Evaluating LLM Agents for Real-World Financial Tool Use Paper • 2603.08262 • Published 19 days ago • 42
A Subgoal-driven Framework for Improving Long-Horizon LLM Agents Paper • 2603.19685 • Published 8 days ago • 18
MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild Paper • 2603.17187 • Published 11 days ago • 132