-
microsoft/bitnet-b1.58-2B-4T
Text Generation • 0.8B • Updated • 8.21k • 1.23k -
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
Paper • 2504.10449 • Published • 15 -
nvidia/Llama-3.1-Nemotron-8B-UltraLong-2M-Instruct
Text Generation • 8B • Updated • 116 • 15 -
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs
Paper • 2504.11536 • Published • 63
Collections
Collections including paper arxiv:2504.15785
-
I-Con: A Unifying Framework for Representation Learning
Paper • 2504.16929 • Published • 29 -
LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities
Paper • 2504.16078 • Published • 21 -
WALL-E 2.0: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents
Paper • 2504.15785 • Published • 22 -
OTC: Optimal Tool Calls via Reinforcement Learning
Paper • 2504.14870 • Published • 35
-
FLAME: Factuality-Aware Alignment for Large Language Models
Paper • 2405.01525 • Published • 28 -
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data
Paper • 2405.14333 • Published • 43 -
Transformers Can Do Arithmetic with the Right Embeddings
Paper • 2405.17399 • Published • 54 -
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture
Paper • 2405.18991 • Published • 12
-
CodeFusion: A Pre-trained Diffusion Model for Code Generation
Paper • 2310.17680 • Published • 73 -
deepseek-ai/DeepSeek-R1
Text Generation • 685B • Updated • 1.13M • 12.9k -
deepseek-ai/DeepSeek-V3
Text Generation • 685B • Updated • 883k • 4.01k -
krutrim-ai-labs/Krutrim-2-instruct
Updated • 150 • 35
-
Training-Free Group Relative Policy Optimization
Paper • 2510.08191 • Published • 44 -
A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning
Paper • 2510.15444 • Published • 147 -
Reasoning with Sampling: Your Base Model is Smarter Than You Think
Paper • 2510.14901 • Published • 47 -
WALL-E 2.0: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents
Paper • 2504.15785 • Published • 22
-
A Fully Spectral Neuro-Symbolic Reasoning Architecture with Graph Signal Processing as the Computational Backbone
Paper • 2508.14923 • Published • 1 -
A DbC Inspired Neurosymbolic Layer for Trustworthy Agent Design
Paper • 2508.03665 • Published • 1 -
Thinking Beyond Tokens: From Brain-Inspired Intelligence to Cognitive Foundations for Artificial General Intelligence and its Societal Impact
Paper • 2507.00951 • Published • 23 -
HyDRA: A Hybrid-Driven Reasoning Architecture for Verifiable Knowledge Graphs
Paper • 2507.15917 • Published
-
How to Synthesize Text Data without Model Collapse?
Paper • 2412.14689 • Published • 52 -
SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator
Paper • 2412.12094 • Published • 11 -
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Paper • 2306.07691 • Published • 12 -
iSTFTNet: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform
Paper • 2203.02395 • Published • 1
-
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks
Paper • 2412.14161 • Published • 51 -
Training Software Engineering Agents and Verifiers with SWE-Gym
Paper • 2412.21139 • Published • 24 -
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis
Paper • 2412.19723 • Published • 87 -
AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation
Paper • 2408.00764 • Published • 1
-
Compositional Foundation Models for Hierarchical Planning
Paper • 2309.08587 • Published • 11 -
ALPINE: Unveiling the Planning Capability of Autoregressive Learning in Language Models
Paper • 2405.09220 • Published • 28 -
WALL-E 2.0: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents
Paper • 2504.15785 • Published • 22 -
CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning
Paper • 2508.20096 • Published • 36