- LTX-2: Efficient Joint Audio-Visual Foundation Model
  Paper • 2601.03233 • Published • 178
- MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head
  Paper • 2601.07832 • Published • 53
- Motion Attribution for Video Generation
  Paper • 2601.08828 • Published • 71
- Post-LayerNorm Is Back: Stable, Expressive, and Deep
  Paper • 2601.19895 • Published • 27
Collections
Discover the best community collections!
Collections including paper arxiv:2602.05400
- Heterogeneous Agent Collaborative Reinforcement Learning
  Paper • 2603.02604 • Published • 195
- Beyond Language Modeling: An Exploration of Multimodal Pretraining
  Paper • 2603.03276 • Published • 104
- OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration
  Paper • 2602.05400 • Published • 353
- The Trinity of Consistency as a Defining Principle for General World Models
  Paper • 2602.23152 • Published • 201
- From Blind Spots to Gains: Diagnostic-Driven Iterative Training for Large Multimodal Models
  Paper • 2602.22859 • Published • 151
- OmniGAIA: Towards Native Omni-Modal AI Agents
  Paper • 2602.22897 • Published • 53
- Imagination Helps Visual Reasoning, But Not Yet in Latent Space
  Paper • 2602.22766 • Published • 44
- OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration
  Paper • 2602.05400 • Published • 353
- PaperBanana: Automating Academic Illustration for AI Scientists
  Paper • 2601.23265 • Published • 227
- FASA: Frequency-aware Sparse Attention
  Paper • 2602.03152 • Published • 154
- mHC: Manifold-Constrained Hyper-Connections
  Paper • 2512.24880 • Published • 326
- Qwen/Qwen3.6-35B-A3B
  Image-Text-to-Text • 36B • Updated • 5.26M • 1.78k
- deepseek-ai/DeepSeek-V4-Pro
  Text Generation • 862B • Updated • 2.97M • 3.99k
- moonshotai/Kimi-K2.6
  Image-Text-to-Text • 1.1T • Updated • 2.15M • 1.29k
- openai/privacy-filter
  Token Classification • 1B • Updated • 239k • 1.45k
- GLM-5: from Vibe Coding to Agentic Engineering
  Paper • 2602.15763 • Published • 149
- Recurrent-Depth VLA: Implicit Test-Time Compute Scaling of Vision-Language-Action Models via Latent Iterative Reasoning
  Paper • 2602.07845 • Published • 71
- LLaDA2.1: Speeding Up Text Diffusion via Token Editing
  Paper • 2602.08676 • Published • 72
- MemSkill: Learning and Evolving Memory Skills for Self-Evolving Agents
  Paper • 2602.02474 • Published • 63