δ-mem: Efficient Online Memory for Large Language Models Paper • 2605.12357 • Published 3 days ago • 99
Konkani LLM: Multi-Script Instruction Tuning and Evaluation for a Low-Resource Indian Language Paper • 2603.23529 • Published Mar 7 • 1
Article Mixture of Experts (MoEs) in Transformers ariG23498, pcuenq, merve, IlyasMoutawwakil, ArthurZ, sergiopaniego, Molbap • Feb 26 • 159
📝 Research & Long-Form Blog Posts Collection In-depth technical articles and research pieces published by Hugging Face • 14 items • Updated 10 days ago • 21
Article Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective LinkedIn • Jan 27 • 74
Article We Got Claude to Build CUDA Kernels and teach open models! burtenshaw, evalstate, merve, pcuenq • Jan 28 • 156
Llama-Embed-Nemotron-8B Collection State-of-the-Art Text Embedding Model • 3 items • Updated 6 days ago • 6
Nemotron RAG Collection Set of tools to build retrieval-augmented generation (RAG) systems, improve search and ranking accuracy, and extract structured data from complex docs • 10 items • Updated 6 days ago • 92
Languages identification Collection a variety of pre-trained language identification models • 9 items • Updated Jul 31, 2025 • 2
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search Paper • 2509.25454 • Published Sep 29, 2025 • 148
Hugging Face Changelog Repositories total file size is now displayed Sep 18, 2025 • 175
Article SmolLM3: smol, multilingual, long-context reasoner eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 775