Collections
Collections including paper arxiv:2510.09116
-
EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning
Paper • 2509.22576 • Published • 135 -
No Prompt Left Behind: Exploiting Zero-Variance Prompts in LLM Reinforcement Learning via Entropy-Guided Advantage Shaping
Paper • 2509.21880 • Published • 53 -
DITING: A Multi-Agent Evaluation Framework for Benchmarking Web Novel Translation
Paper • 2510.09116 • Published • 96
-
ComfyUI-R1: Exploring Reasoning Models for Workflow Generation
Paper • 2506.09790 • Published • 53 -
Saffron-1: Towards an Inference Scaling Paradigm for LLM Safety Assurance
Paper • 2506.06444 • Published • 73 -
DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents
Paper • 2506.11763 • Published • 74 -
Agentic Reasoning: Reasoning LLMs with Tools for the Deep Research
Paper • 2502.04644 • Published • 4
-
Demystifying Reinforcement Learning in Agentic Reasoning
Paper • 2510.11701 • Published • 32 -
Self-Improving LLM Agents at Test-Time
Paper • 2510.07841 • Published • 10 -
Making Mathematical Reasoning Adaptive
Paper • 2510.04617 • Published • 23 -
DocReward: A Document Reward Model for Structuring and Stylizing
Paper • 2510.11391 • Published • 27
-
MALT: Improving Reasoning with Multi-Agent LLM Training
Paper • 2412.01928 • Published • 45 -
Multi-Agent System for Comprehensive Soccer Understanding
Paper • 2505.03735 • Published • 25 -
DITING: A Multi-Agent Evaluation Framework for Benchmarking Web Novel Translation
Paper • 2510.09116 • Published • 96 -
basicv8vc/SimpleQA
Viewer • Updated • 4.33k • 8.95k • 30