Collections
Collections including paper arxiv:2510.18855
- Test-Time Scaling with Reflective Generative Model
  Paper • 2507.01951 • Published • 108
- Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities
  Paper • 2507.06261 • Published • 67
- Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model
  Paper • 2510.18855 • Published • 73
- INTELLECT-3: Technical Report
  Paper • 2512.16144 • Published • 20

- microsoft/bitnet-b1.58-2B-4T
  Text Generation • 0.8B • Updated • 14.9k • 1.3k
- M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
  Paper • 2504.10449 • Published • 15
- nvidia/Llama-3.1-Nemotron-8B-UltraLong-2M-Instruct
  Text Generation • 8B • Updated • 47 • 15
- ReTool: Reinforcement Learning for Strategic Tool Use in LLMs
  Paper • 2504.11536 • Published • 63

- Large Reasoning Models Learn Better Alignment from Flawed Thinking
  Paper • 2510.00938 • Published • 59
- What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT
  Paper • 2509.19284 • Published • 23
- Learning to Reason as Action Abstractions with Scalable Mid-Training RL
  Paper • 2509.25810 • Published • 6
- Agent Learning via Early Experience
  Paper • 2510.08558 • Published • 273

- WorldVLA: Towards Autoregressive Action World Model
  Paper • 2506.21539 • Published • 40
- Fast and Simplex: 2-Simplicial Attention in Triton
  Paper • 2507.02754 • Published • 25
- IntFold: A Controllable Foundation Model for General and Specialized Biomolecular Structure Prediction
  Paper • 2507.02025 • Published • 35
- Thinking Beyond Tokens: From Brain-Inspired Intelligence to Cognitive Foundations for Artificial General Intelligence and its Societal Impact
  Paper • 2507.00951 • Published • 24

- RL + Transformer = A General-Purpose Problem Solver
  Paper • 2501.14176 • Published • 28
- Towards General-Purpose Model-Free Reinforcement Learning
  Paper • 2501.16142 • Published • 31
- SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
  Paper • 2501.17161 • Published • 124
- MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization
  Paper • 2412.12098 • Published • 4