From Static Templates to Dynamic Runtime Graphs: A Survey of Workflow Optimization for LLM Agents Paper • 2603.22386 • Published 2 days ago • 34
SpecEyes: Accelerating Agentic Multimodal LLMs via Speculative Perception and Planning Paper • 2603.23483 • Published about 20 hours ago • 35
SIMART: Decomposing Monolithic Meshes into Sim-ready Articulated Assets via MLLM Paper • 2603.23386 • Published about 21 hours ago • 30
MinerU-Diffusion: Rethinking Document OCR as Inverse Rendering via Diffusion Decoding Paper • 2603.22458 • Published 2 days ago • 104
ThinkJEPA: Empowering Latent World Models with Large Vision-Language Reasoning Model Paper • 2603.22281 • Published 2 days ago • 8
UniGRPO: Unified Policy Optimization for Reasoning-Driven Visual Generation Paper • 2603.23500 • Published about 20 hours ago • 25
VideoDetective: Clue Hunting via both Extrinsic Query and Intrinsic Relevance for Long Video Understanding Paper • 2603.22285 • Published 2 days ago • 45
VTC-Bench: Evaluating Agentic Multimodal Models via Compositional Visual Tool Chaining Paper • 2603.15030 • Published 9 days ago • 19
Alignment Makes Language Models Normative, Not Descriptive Paper • 2603.17218 • Published 8 days ago • 46
LoRA Fine-Tuning BitNet b1.58 LLMs on Heterogeneous Edge GPUs via QVAC Fabric Article • 8 days ago • 14
MonoArt: Progressive Structural Reasoning for Monocular Articulated 3D Reconstruction Paper • 2603.19231 • Published 6 days ago • 36
SegviGen: Repurposing 3D Generative Model for Part Segmentation Paper • 2603.16869 • Published 8 days ago • 18
V-JEPA 2.1: Unlocking Dense Features in Video Self-Supervised Learning Paper • 2603.14482 • Published 10 days ago • 21
Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model Paper • 2603.21986 • Published 2 days ago • 97
LongCat-Flash-Prover: Advancing Native Formal Reasoning via Agentic Tool-Integrated Reinforcement Learning Paper • 2603.21065 • Published 3 days ago • 64
FinToolBench: Evaluating LLM Agents for Real-World Financial Tool Use Paper • 2603.08262 • Published 16 days ago • 43
Llama-Embed-Nemotron-8B Collection State-of-the-Art Text Embedding Model • 3 items • Updated about 16 hours ago • 5
SynthVision: Building a 110K Synthetic Medical VQA Dataset with Cross-Model Validation Article • 2 days ago • 11