Guided Self-Evolving LLMs with Minimal Human Supervision • Paper • 2512.02472 • Published 5 days ago • 47
Video Generation Models Are Good Latent Reward Models • Paper • 2511.21541 • Published 11 days ago • 44
Agent0-VL: Exploring Self-Evolving Agent for Tool-Integrated Vision-Language Reasoning • Paper • 2511.19900 • Published 12 days ago • 46
Insights from the ICLR Peer Review and Rebuttal Process • Paper • 2511.15462 • Published 18 days ago • 6
Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning • Paper • 2511.16043 • Published 17 days ago • 104
First Frame Is the Place to Go for Video Content Customization • Paper • 2511.15700 • Published 18 days ago • 52
VisPlay: Self-Evolving Vision-Language Models from Images • Paper • 2511.15661 • Published 18 days ago • 42
P1: Mastering Physics Olympiads with Reinforcement Learning • Paper • 2511.13612 • Published 20 days ago • 132
MarsRL: Advancing Multi-Agent Reasoning System via Reinforcement Learning with Agentic Pipeline Parallelism • Paper • 2511.11373 • Published 23 days ago • 12
Black-Box On-Policy Distillation of Large Language Models • Paper • 2511.10643 • Published 24 days ago • 46
Too Good to be Bad: On the Failure of LLMs to Role-Play Villains • Paper • 2511.04962 • Published about 1 month ago • 52