PrismAudio: Decomposed Chain-of-Thoughts and Multi-dimensional Rewards for Video-to-Audio Generation Paper • 2511.18833 • Published Nov 24, 2025 • 4
Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model Paper • 2603.21986 • Published 4 days ago • 108
dots.mocr Collection Multimodal OCR: Parse Anything from Documents • 2 items • Updated 8 days ago • 7
InCoder-32B: Code Foundation Model for Industrial Scenarios Paper • 2603.16790 • Published 10 days ago • 298
MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification Paper • 2603.15726 • Published 11 days ago • 179
Qianfan-OCR: A Unified End-to-End Model for Document Intelligence Paper • 2603.13398 • Published 16 days ago • 146
FireRedASR2S Collection FireRedASR2S is a SOTA, industrial-grade, all-in-one ASR system with ASR, VAD, LID, and Punc modules. All modules achieve SOTA performance. • 7 items • Updated 14 days ago • 9