Would you still call this Dax? Novel Visual References in VLMs and Humans • Paper • 2606.05409 • Published 12 days ago • 8
CUA-Gym: Scaling Verifiable Training Environments and Tasks for Computer-Use Agents • Paper • 2605.25624 • Published 21 days ago • 33
Forecasting Downstream Performance of LLMs With Proxy Metrics • Paper • 2605.18607 • Published 28 days ago • 14
A3: Agent-as-Annotators • Collection • Models and data from "Structured Distillation of Web Agent Capabilities Enables Generalization" (arXiv:2604.07776) • 6 items • Updated Apr 14 • 1
Structured Distillation of Web Agent Capabilities Enables Generalization • Paper • 2604.07776 • Published Apr 9 • 23