Learnable Multipliers: Freeing the Scale of Language Model Matrix Layers Paper • 2601.04890 • Published 21 days ago • 41
Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space Paper • 2512.24617 • Published 29 days ago • 62
Can LLMs Predict Their Own Failures? Self-Awareness via Internal Circuits Paper • 2512.20578 • Published Dec 23, 2025 • 85
LLaDA2.0: Scaling Up Diffusion Language Models to 100B Paper • 2512.15745 • Published Dec 10, 2025 • 81
Depth Anything 3: Recovering the Visual Space from Any Views Paper • 2511.10647 • Published Nov 13, 2025 • 98
GAS: Improving Discretization of Diffusion ODEs via Generalized Adversarial Solver Paper • 2510.17699 • Published Oct 20, 2025 • 25
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models Paper • 2403.13372 • Published Mar 20, 2024 • 176
Quantum Variational Activation Functions Empower Kolmogorov-Arnold Networks Paper • 2509.14026 • Published Sep 17, 2025 • 5
UltraMemV2: Memory Networks Scaling to 120B Parameters with Superior Long-Context Learning Paper • 2508.18756 • Published Aug 26, 2025 • 36
DINOv3 Collection DINOv3: foundation models producing excellent dense features, outperforming SotA w/o fine-tuning - https://arxiv.org/abs/2508.10104 • 13 items • Updated Aug 21, 2025 • 469
Dens3R: A Foundation Model for 3D Geometry Prediction Paper • 2507.16290 • Published Jul 22, 2025 • 9
Energy-Based Transformers are Scalable Learners and Thinkers Paper • 2507.02092 • Published Jul 2, 2025 • 69
The GAN is dead; long live the GAN! A Modern GAN Baseline Paper • 2501.05441 • Published Jan 9, 2025 • 95