How Far Are We from Genuinely Useful Deep Research Agents? Paper • 2512.01948 • Published 6 days ago • 50
How Brittle is Agent Safety? Rethinking Agent Risk under Intent Concealment and Task Complexity Paper • 2511.08487 • Published 26 days ago • 2
Rectifying LLM Thought from Lens of Optimization Paper • 2512.01925 • Published 6 days ago • 23
ATLAS: A High-Difficulty, Multidisciplinary Benchmark for Frontier Scientific Reasoning Paper • 2511.14366 • Published 19 days ago • 15
Falcon-H1 Collection Falcon-H1 Family of Hybrid-Head Language Models (Transformer-SSM), including 0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B (pretrained & instruction-tuned). • 38 items • Updated about 1 month ago • 56
NVIDIA Nemotron V2 Collection Open, Production-ready Enterprise Models. Nvidia Open Model license. • 9 items • Updated 3 days ago • 92
Dissecting Tool-Integrated Reasoning: An Empirical Study and Analysis Paper • 2508.15754 • Published Aug 21 • 4
Intern-S1: A Scientific Multimodal Foundation Model Paper • 2508.15763 • Published Aug 21 • 256