OpenGVLab/VideoChat-Flash-Qwen2_5-2B_res448 • Video-Text-to-Text • 2B • Updated Mar 16, 2025 • 1.76k • 26
microsoft/Phi-4-multimodal-instruct • Automatic Speech Recognition • 6B • Updated 23 days ago • 248k • 1.55k
docling-project/SmolDocling-256M-preview • Image-Text-to-Text • 0.3B • Updated Sep 17, 2025 • 55.7k • 1.6k