Moshi: a speech-text foundation model for real-time dialogue • Paper • 2410.00037 • Published Sep 17, 2024
SmolVLM: Redefining small and efficient multimodal models • Paper • 2504.05299 • Published Apr 7, 2025