Article Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth mlabonne • Jul 29, 2024 • 372
GameplayQA: A Benchmarking Framework for Decision-Dense POV-Synced Multi-Video Understanding of 3D Virtual Agents Paper • 2603.24329 • Published Mar 25 • 28
Game-TARS: Pretrained Foundation Models for Scalable Generalist Multimodal Game Agents Paper • 2510.23691 • Published Oct 27, 2025 • 56
Physical AI Collection Collection of open, commercial-grade datasets for physical AI developers • 50 items • Updated about 19 hours ago • 156
Article Introducing NVIDIA Cosmos Policy for Advanced Robot Control nvidia • Jan 29 • 48
Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models Paper • 2602.07026 • Published Feb 2 • 140
Article State of open video generation models in Diffusers +1 sayakpaul, a-r-r-o-w, dn6 • Jan 27, 2025 • 70
Post 3210 releasing: smol vision 🌼 A repository with notebooks on shrinking, optimizing, speeding-up, customizing large vision models! https://github.com/merveenoyan/smol-vision
TEDi: Temporally-Entangled Diffusion for Long-Term Motion Synthesis Paper • 2307.15042 • Published Jul 27, 2023 • 7 • 1
Article Arc Virtual Cell Challenge: A Primer FL33TW00D-HF, abhinadduri • Jul 18, 2025 • 66
Article You could have designed state of the art positional encoding FL33TW00D-HF • Nov 25, 2024 • 480
Article The 1 Billion Token Challenge: Finding the Perfect Pre-training Mix codelion • Nov 3, 2025 • 65