Programming with Data: Test-Driven Data Engineering for Self-Improving LLMs from Raw Corpora • Paper • 2604.24819 • Published 8 days ago • 84
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning • Paper • 2604.02721 • Published Apr 3 • 626
ClawBench: Can AI Agents Complete Everyday Online Tasks? • Paper • 2604.08523 • Published 26 days ago • 261