Collections including paper arxiv:2409.11340

Unified Multimodal Model

A curated list for Multimodal Model Generation papers.

OmniGen2: Exploration to Advanced Multimodal Generation

Paper • 2506.18871 • Published Jun 23 • 78
OmniGen: Unified Image Generation

Paper • 2409.11340 • Published Sep 17, 2024 • 115
Show-o Turbo: Towards Accelerated Unified Multimodal Understanding and Generation

Paper • 2502.05415 • Published Feb 8 • 21
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation

Paper • 2408.12528 • Published Aug 22, 2024 • 51

Papers - Image - CoT

OmniGen: Unified Image Generation

Paper • 2409.11340 • Published Sep 17, 2024 • 115
LLaVA-o1: Let Vision Language Models Reason Step-by-Step

Paper • 2411.10440 • Published Nov 15, 2024 • 130

Omni-Generation

OmniGen: Unified Image Generation

Paper • 2409.11340 • Published Sep 17, 2024 • 115
Video-Guided Foley Sound Generation with Multimodal Controls

Paper • 2411.17698 • Published Nov 26, 2024 • 10
FLOAT: Generative Motion Latent Flow Matching for Audio-driven Talking Portrait

Paper • 2412.01064 • Published Dec 2, 2024 • 47
OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows

Paper • 2412.01169 • Published Dec 2, 2024 • 13

Diffusion-Papers

OmniGen: Unified Image Generation

Paper • 2409.11340 • Published Sep 17, 2024 • 115
Elucidating the Design Space of Diffusion-Based Generative Models

Paper • 2206.00364 • Published Jun 1, 2022 • 18
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

Paper • 2403.03206 • Published Mar 5, 2024 • 71

image generation

OmniGen: Unified Image Generation

Paper • 2409.11340 • Published Sep 17, 2024 • 115

Prompt Expansion

IFAdapter: Instance Feature Control for Grounded Text-to-Image Generation

Paper • 2409.08240 • Published Sep 12, 2024 • 22
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation

Paper • 2410.07171 • Published Oct 9, 2024 • 43
EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models

Paper • 2410.07133 • Published Oct 9, 2024 • 19
OmniGen: Unified Image Generation

Paper • 2409.11340 • Published Sep 17, 2024 • 115

FLUX.1 Dev Inpainting Model Beta GPU

Space • Running on Zero • Repair and enhance images using prompts • 251
OmniGen: Unified Image Generation

Paper • 2409.11340 • Published Sep 17, 2024 • 115
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models

Paper • 2502.01061 • Published Feb 3 • 222
Kartoffel-TTS (Based on Chatterbox) - German Text-to-Speech Demo

Space • Running on Zero • Expressive Zeroshot TTS • 24

Interesting papers

OmniGen: Unified Image Generation

Paper • 2409.11340 • Published Sep 17, 2024 • 115
NVLM: Open Frontier-Class Multimodal LLMs

Paper • 2409.11402 • Published Sep 17, 2024 • 74
