Papers - a TheOneTrueNiz Collection

TheOneTrueNiz 's Collections

Papers

Language Models

Papers

updated Oct 15

R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning

Paper • 2508.21113 • Published Aug 28 • 110
Breaking the Exploration Bottleneck: Rubric-Scaffolded Reinforcement Learning for General LLM Reasoning

Paper • 2508.16949 • Published Aug 23 • 22
EmbodiedOneVision: Interleaved Vision-Text-Action Pretraining for General Robot Control

Paper • 2508.21112 • Published Aug 28 • 77
UItron: Foundational GUI Agent with Advanced Perception and Planning

Paper • 2508.21767 • Published Aug 29 • 12
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Paper • 2509.02479 • Published Sep 2 • 83
K2-Think: A Parameter-Efficient Reasoning System

Paper • 2509.07604 • Published Sep 9 • 13
THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning

Paper • 2509.13761 • Published Sep 17 • 16
FlowRL: Matching Reward Distributions for LLM Reasoning

Paper • 2509.15207 • Published Sep 18 • 114
WebSailor-V2: Bridging the Chasm to Proprietary Agents via Synthetic Data and Scalable Reinforcement Learning

Paper • 2509.13305 • Published Sep 16 • 91
MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer

Paper • 2509.16197 • Published Sep 19 • 56
ReSum: Unlocking Long-Horizon Search Intelligence via Context Summarization

Paper • 2509.13313 • Published Sep 16 • 79
Understanding the Thinking Process of Reasoning Models: A Perspective from Schoenfeld's Episode Theory

Paper • 2509.14662 • Published Sep 18 • 13
Meta-R1: Empowering Large Reasoning Models with Metacognition

Paper • 2508.17291 • Published Aug 24
TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning

Paper • 2509.25760 • Published Sep 30 • 55
Diffusion Transformers with Representation Autoencoders

Paper • 2510.11690 • Published Oct 13 • 165