Beyond Language Modeling: An Exploration of Multimodal Pretraining • Paper • 2603.03276 • Published 12 days ago • 89
Learning When to Act or Refuse: Guarding Agentic Reasoning Models for Safe Multi-Step Tool Use • Paper • 2603.03205 • Published 12 days ago • 11
AgentVista: Evaluating Multimodal Agents in Ultra-Challenging Realistic Visual Scenarios • Paper • 2602.23166 • Published 17 days ago • 40
Timer-S1: A Billion-Scale Time Series Foundation Model with Serial Scaling • Paper • 2603.04791 • Published 10 days ago • 16
CubeComposer: Spatio-Temporal Autoregressive 4K 360° Video Generation from Perspective Video • Paper • 2603.04291 • Published 11 days ago • 13
LLM2Vec-Gen: Generative Embeddings from Large Language Models • Paper • 2603.10913 • Published 4 days ago • 31