Magma: A Foundation Model for Multimodal AI Agents Paper • 2502.13130 • Published Feb 18, 2025 • 58
LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation Paper • 2502.20583 • Published Feb 27, 2025 • 13
LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs Paper • 2501.06186 • Published Jan 10, 2025 • 65
StoryMaker: Towards Holistic Consistent Characters in Text-to-image Generation Paper • 2409.12576 • Published Sep 19, 2024 • 16