A Picture is Worth a Thousand Words: Principled Recaptioning Improves Image Generation Paper • 2310.16656 • Published Oct 25, 2023 • 50
MI-Fuse: Label Fusion for Unsupervised Domain Adaptation with Closed-Source Large-Audio Language Model Paper • 2509.20706 • Published Sep 25 • 2
Hunyuan3D-Omni: A Unified Framework for Controllable Generation of 3D Assets Paper • 2509.21245 • Published Sep 25 • 38
Seedream 4.0: Toward Next-generation Multimodal Image Generation Paper • 2509.20427 • Published Sep 24 • 80
MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources Paper • 2509.21268 • Published Sep 25 • 103
SciReasoner: Laying the Scientific Reasoning Ground Across Disciplines Paper • 2509.21320 • Published Sep 25 • 101
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning Paper • 2505.24726 • Published May 30 • 277
Sadeed: Advancing Arabic Diacritization Through Small Language Model Paper • 2504.21635 • Published Apr 30 • 59
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models Paper • 2503.09573 • Published Mar 12 • 74
Agent models: Internalizing Chain-of-Action Generation into Reasoning models Paper • 2503.06580 • Published Mar 9 • 20
Adding Conditional Control to Text-to-Image Diffusion Models Paper • 2302.05543 • Published Feb 10, 2023 • 57