- OmniGen2: Exploration to Advanced Multimodal Generation
  Paper • 2506.18871 • Published • 78
- OmniGen: Unified Image Generation
  Paper • 2409.11340 • Published • 115
- Show-o Turbo: Towards Accelerated Unified Multimodal Understanding and Generation
  Paper • 2502.05415 • Published • 21
- Show-o: One Single Transformer to Unify Multimodal Understanding and Generation
  Paper • 2408.12528 • Published • 51
Collections
Collections including paper arxiv:2409.11340
- OmniGen: Unified Image Generation
  Paper • 2409.11340 • Published • 115
- Video-Guided Foley Sound Generation with Multimodal Controls
  Paper • 2411.17698 • Published • 10
- FLOAT: Generative Motion Latent Flow Matching for Audio-driven Talking Portrait
  Paper • 2412.01064 • Published • 47
- OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows
  Paper • 2412.01169 • Published • 13

- IFAdapter: Instance Feature Control for Grounded Text-to-Image Generation
  Paper • 2409.08240 • Published • 22
- IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation
  Paper • 2410.07171 • Published • 43
- EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models
  Paper • 2410.07133 • Published • 19
- OmniGen: Unified Image Generation
  Paper • 2409.11340 • Published • 115

- FLUX.1 Dev Inpainting Model Beta GPU
  🏆 251 • Repair and enhance images using prompts

- OmniGen: Unified Image Generation
  Paper • 2409.11340 • Published • 115
- OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models
  Paper • 2502.01061 • Published • 222
- Kartoffel-TTS (Based on Chatterbox) - German Text-to-Speech Demo
  📢 24 • Expressive Zeroshot TTS