Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2510.18855

Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model

Paper • 2510.18855 • Published Oct 21, 2025 • 73

about 21 hours ago

Test-Time Scaling with Reflective Generative Model

Paper • 2507.01951 • Published Jul 2, 2025 • 108
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities

Paper • 2507.06261 • Published Jul 7, 2025 • 67
Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model

Paper • 2510.18855 • Published Oct 21, 2025 • 73
INTELLECT-3: Technical Report

Paper • 2512.16144 • Published Dec 18, 2025 • 20

about 14 hours ago

microsoft/bitnet-b1.58-2B-4T

Text Generation • 0.8B • Updated Dec 17, 2025 • 14.9k • 1.3k
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models

Paper • 2504.10449 • Published Apr 14, 2025 • 15
nvidia/Llama-3.1-Nemotron-8B-UltraLong-2M-Instruct

Text Generation • 8B • Updated Apr 17, 2025 • 47 • 15
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs

Paper • 2504.11536 • Published Apr 15, 2025 • 63

Large Reasoning Models Learn Better Alignment from Flawed Thinking

Paper • 2510.00938 • Published Oct 1, 2025 • 59
What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT

Paper • 2509.19284 • Published Sep 23, 2025 • 23
Learning to Reason as Action Abstractions with Scalable Mid-Training RL

Paper • 2509.25810 • Published Sep 30, 2025 • 6
Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9, 2025 • 273

WorldVLA: Towards Autoregressive Action World Model

Paper • 2506.21539 • Published Jun 26, 2025 • 40
Fast and Simplex: 2-Simplicial Attention in Triton

Paper • 2507.02754 • Published Jul 3, 2025 • 25
IntFold: A Controllable Foundation Model for General and Specialized Biomolecular Structure Prediction

Paper • 2507.02025 • Published Jul 2, 2025 • 35
Thinking Beyond Tokens: From Brain-Inspired Intelligence to Cognitive Foundations for Artificial General Intelligence and its Societal Impact

Paper • 2507.00951 • Published Jul 1, 2025 • 24

RL+reason model

RL + Transformer = A General-Purpose Problem Solver

Paper • 2501.14176 • Published Jan 24, 2025 • 28
Towards General-Purpose Model-Free Reinforcement Learning

Paper • 2501.16142 • Published Jan 27, 2025 • 31
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28, 2025 • 124
MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization

Paper • 2412.12098 • Published Dec 16, 2024 • 4

Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model

Paper • 2510.18855 • Published Oct 21, 2025 • 73

Large Reasoning Models Learn Better Alignment from Flawed Thinking

Paper • 2510.00938 • Published Oct 1, 2025 • 59
What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT

Paper • 2509.19284 • Published Sep 23, 2025 • 23
Learning to Reason as Action Abstractions with Scalable Mid-Training RL

Paper • 2509.25810 • Published Sep 30, 2025 • 6
Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9, 2025 • 273

about 21 hours ago

Test-Time Scaling with Reflective Generative Model

Paper • 2507.01951 • Published Jul 2, 2025 • 108
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities

Paper • 2507.06261 • Published Jul 7, 2025 • 67
Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model

Paper • 2510.18855 • Published Oct 21, 2025 • 73
INTELLECT-3: Technical Report

Paper • 2512.16144 • Published Dec 18, 2025 • 20

WorldVLA: Towards Autoregressive Action World Model

Paper • 2506.21539 • Published Jun 26, 2025 • 40
Fast and Simplex: 2-Simplicial Attention in Triton

Paper • 2507.02754 • Published Jul 3, 2025 • 25
IntFold: A Controllable Foundation Model for General and Specialized Biomolecular Structure Prediction

Paper • 2507.02025 • Published Jul 2, 2025 • 35
Thinking Beyond Tokens: From Brain-Inspired Intelligence to Cognitive Foundations for Artificial General Intelligence and its Societal Impact

Paper • 2507.00951 • Published Jul 1, 2025 • 24

about 14 hours ago

microsoft/bitnet-b1.58-2B-4T

Text Generation • 0.8B • Updated Dec 17, 2025 • 14.9k • 1.3k
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models

Paper • 2504.10449 • Published Apr 14, 2025 • 15
nvidia/Llama-3.1-Nemotron-8B-UltraLong-2M-Instruct

Text Generation • 8B • Updated Apr 17, 2025 • 47 • 15
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs

Paper • 2504.11536 • Published Apr 15, 2025 • 63

RL+reason model

RL + Transformer = A General-Purpose Problem Solver

Paper • 2501.14176 • Published Jan 24, 2025 • 28
Towards General-Purpose Model-Free Reinforcement Learning

Paper • 2501.16142 • Published Jan 27, 2025 • 31
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28, 2025 • 124
MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization

Paper • 2412.12098 • Published Dec 16, 2024 • 4

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs