How to Build a Healthcare Robot from Simulation to Deployment with NVIDIA Isaac for Healthcare Oct 28 • 18
NVIDIA Releases 8 Million Sample Open Dataset and Tooling for OCR, Image Reasoning, Image and Video QA Tasks Oct 28 • 16
Llama-Embed-Nemotron-8B Text Embedding Model Ranks First on Multilingual MTEB Leaderboard Oct 21 • 14
📢 NVIDIA Releases Nemotron-CC-Math Pre-Training Dataset: A High-Quality, Web-Scale Math Corpus for Pretraining Large Language Models Aug 18 • 5
NVIDIA Releases Improved Pretraining Dataset: Preserves High Value Math & Code, and Augments with Multi-Lingual Aug 18 • 3
NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks Aug 11 • 75
Llama-NeMoRetriever-ColEmbed: Developer-Focused Guide to NVIDIA's State-of-the-Art Text-Image Retrieval Jul 9 • 4
Nemotron-Personas: Improve AI Training With the First Synthetic Personas Dataset Aligned to Real-World Distributions Jun 10 • 21
Submitted by Siyi Chen 21 SpaceTools: Tool-Augmented Spatial Reasoning via Double Interactive RL NVIDIA 7 2
Submitted by Shizhe Diao 94 ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration NVIDIA 267 3
Submitted by Yonggan Fu 29 Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models NVIDIA 2
Submitted by Min-Hung Chen 4 VADER: Towards Causal Video Anomaly Understanding with Relation-Aware Large Language Models NVIDIA 3
Submitted by Yauhen Babakhin 11 Llama-Embed-Nemotron-8B: A Universal Text Embedding Model for Multilingual and Cross-Lingual Tasks NVIDIA 2
Submitted by Huck Yang 6 Long Grounded Thoughts: Distilling Compositional Visual Reasoning Chains at Scale NVIDIA 2
Submitted by Byung-Kwan Lee 28 Unified Reinforcement and Imitation Learning for Vision-Language Models NVIDIA 7
Submitted by Shizhe Diao 7 ProfBench: Multi-Domain Rubrics requiring Professional Knowledge to Answer and Judge NVIDIA 2
Submitted by taesiri 89 OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM NVIDIA 593 4
Submitted by Min-Hung Chen 15 DLER: Doing Length pEnalty Right - Incentivizing More Intelligence per Token via Reinforcement Learning NVIDIA 3
Submitted by Ankit Goyal 11 VLA-0: Building State-of-the-Art VLAs with Zero Modification NVIDIA 322 2
Submitted by Wei Huang 176 QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs NVIDIA 459 4
Submitted by Min-Hung Chen 7 TC-LoRA: Temporally Modulated Conditional LoRA for Adaptive Diffusion Control NVIDIA 2
Submitted by Jay Wu 16 ChronoEdit: Towards Temporal Reasoning for Image Editing and World Simulation NVIDIA 616 2
Submitted by Han Cai 37 DC-VideoGen: Efficient Video Generation with Deep Compression Video Autoencoder NVIDIA 2
Submitted by Han Cai 6 DC-Gen: Post-Training Diffusion Acceleration with Deeply Compressed Latent Space NVIDIA 2
Submitted by Yuyang 45 SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer NVIDIA 4.77k 2
Submitted by Shrimai Prabhumoye 23 Front-Loading Reasoning: The Synergy between Pretraining and Post-Training Data NVIDIA 4
Submitted by Zhilin Wang 5 RLBFF: Binary Flexible Feedback to bridge between Human Feedback & Verifiable Rewards NVIDIA 2
Submitted by Chi-Pin Huang 39 ThinkAct: Vision-Language-Action Reasoning via Reinforced Visual Latent Planning NVIDIA 1
Submitted by Min-Hung Chen 33 Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks NVIDIA 2
Submitted by Pavlo Molchanov 45 Hymba: A Hybrid-head Architecture for Small Language Models NVIDIA 203 3
Submitted by Min-Hung Chen 7 EoRA: Training-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation NVIDIA 27 2