Efficient Agentic Reasoning Through Self-Regulated Simulative Planning Paper • 2605.22138 • Published 5 days ago • 7
AstraFlow: Dataflow-Oriented Reinforcement Learning for Agentic LLMs Paper • 2605.15565 • Published 11 days ago • 16
TorchUMM: A Unified Multimodal Model Codebase for Evaluation, Analysis, and Post-training Paper • 2604.10784 • Published Apr 12 • 7
PHLoRA: data-free Post-hoc Low-Rank Adapter extraction from full-rank checkpoint Paper • 2509.10971 • Published Sep 13, 2025 • 2
Online hierarchical partitioning of the output space in extreme multi-label data stream Paper • 2507.20894 • Published Jul 28, 2025 • 1
LLMs Can Easily Learn to Reason from Demonstrations. Structure, not content, is what matters! Paper • 2502.07374 • Published Feb 11, 2025 • 40
SmartPlay: A Benchmark for LLMs as Intelligent Agents Paper • 2310.01557 • Published Oct 2, 2023 • 13
Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference Paper • 2403.04132 • Published Mar 7, 2024 • 41
S-LoRA: Serving Thousands of Concurrent LoRA Adapters Paper • 2311.03285 • Published Nov 6, 2023 • 30
Judging LLM-as-a-judge with MT-Bench and Chatbot Arena Paper • 2306.05685 • Published Jun 9, 2023 • 43
Plan, Eliminate, and Track -- Language Models are Good Teachers for Embodied Agents Paper • 2305.02412 • Published May 3, 2023 • 1
SPRING: GPT-4 Out-performs RL Algorithms by Studying Papers and Reasoning Paper • 2305.15486 • Published May 24, 2023 • 1