LangFlow: Continuous Diffusion Rivals Discrete in Language Modeling Paper • 2604.11748 • Published 12 days ago • 14
Narrative-Driven Paper-to-Slide Generation via ArcDeck Paper • 2604.11969 • Published 14 days ago • 7
MonitorBench: A Comprehensive Benchmark for Chain-of-Thought Monitorability in Large Language Models Paper • 2603.28590 • Published 28 days ago • 22
HandX: Scaling Bimanual Motion and Interaction Generation Paper • 2603.28766 • Published 28 days ago • 12
Sparking Scientific Creativity via LLM-Driven Interdisciplinary Inspiration Paper • 2603.12226 • Published Mar 12 • 4
ULTRA: Unified Multimodal Control for Autonomous Humanoid Whole-Body Loco-Manipulation Paper • 2603.03279 • Published Mar 3 • 1
Tool-R0: Self-Evolving LLM Agents for Tool-Learning from Zero Data Paper • 2602.21320 • Published Feb 24 • 12
Learning Humanoid End-Effector Control for Open-Vocabulary Visual Loco-Manipulation Paper • 2602.16705 • Published Feb 18 • 26
Steer2Adapt: Dynamically Composing Steering Vectors Elicits Efficient Adaptation of LLMs Paper • 2602.07276 • Published Feb 7 • 11
CodeCircuit: Toward Inferring LLM-Generated Code Correctness via Attribution Graphs Paper • 2602.07080 • Published Feb 6 • 6
oMeBench: Towards Robust Benchmarking of LLMs in Organic Mechanism Elucidation and Reasoning Paper • 2510.07731 • Published Oct 9, 2025 • 6
TD-EVAL: Revisiting Task-Oriented Dialogue Evaluation by Combining Turn-Level Precision with Dialogue-Level Comparisons Paper • 2504.19982 • Published Apr 28, 2025
PIPA: A Unified Evaluation Protocol for Diagnosing Interactive Planning Agents Paper • 2505.01592 • Published May 2, 2025
DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge Paper • 2507.04447 • Published Jul 6, 2025 • 45
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time Paper • 2505.24863 • Published May 30, 2025 • 97