digital-human - a zzfive Collection

updated about 1 month ago

One Shot, One Talk: Whole-body Talking Avatar from a Single Image

Paper • 2412.01106 • Published Dec 2, 2024 • 24
MEMO: Memory-Guided Diffusion for Expressive Talking Video Generation

Paper • 2412.04448 • Published Dec 5, 2024 • 10
IDOL: Instant Photorealistic 3D Human Creation from a Single Image

Paper • 2412.14963 • Published Dec 19, 2024 • 6
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models

Paper • 2502.01061 • Published Feb 3, 2025 • 225
Pippo: High-Resolution Multi-View Humans from a Single Image

Paper • 2502.07785 • Published Feb 11, 2025 • 10
X-Dancer: Expressive Music to Human Dance Video Generation

Paper • 2502.17414 • Published Feb 24, 2025 • 14
Motion Anything: Any to Motion Generation

Paper • 2503.06955 • Published Mar 10, 2025 • 35
Unlock Pose Diversity: Accurate and Efficient Implicit Keypoint-based Spatiotemporal Diffusion for Audio-driven Talking Portrait

Paper • 2503.12963 • Published Mar 17, 2025 • 7
ChatAnyone: Stylized Real-time Portrait Video Generation with Hierarchical Motion Diffusion Model

Paper • 2503.21144 • Published Mar 27, 2025 • 27
MoCha: Towards Movie-Grade Talking Character Synthesis

Paper • 2503.23307 • Published Mar 30, 2025 • 141
AvatarArtist: Open-Domain 4D Avatarization

Paper • 2503.19906 • Published Mar 25, 2025 • 8
DreamActor-M1: Holistic, Expressive and Robust Human Image Animation with Hybrid Guidance

Paper • 2504.01724 • Published Apr 2, 2025 • 68
Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation

Paper • 2504.02542 • Published Apr 3, 2025 • 52
FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis

Paper • 2504.04842 • Published Apr 7, 2025 • 35
KeySync: A Robust Approach for Leakage-free Lip Synchronization in High Resolution

Paper • 2505.00497 • Published May 1, 2025 • 17
MTVCrafter: 4D Motion Tokenization for Open-World Human Image Animation

Paper • 2505.10238 • Published May 15, 2025 • 10
SkyReels-Audio: Omni Audio-Conditioned Talking Portraits in Video Diffusion Transformers

Paper • 2506.00830 • Published Jun 1, 2025 • 7
FantasyPortrait: Enhancing Multi-Character Portrait Animation with Expression-Augmented Diffusion Transformers

Paper • 2507.12956 • Published Jul 17, 2025 • 25
FantasyTalking2: Timestep-Layer Adaptive Preference Optimization for Audio-Driven Portrait Animation

Paper • 2508.11255 • Published Aug 15, 2025 • 11
OmniHuman-1.5: Instilling an Active Mind in Avatars via Cognitive Simulation

Paper • 2508.19209 • Published Aug 26, 2025 • 42
MIDAS: Multimodal Interactive Digital-human Synthesis via Real-time Autoregressive Video Generation

Paper • 2508.19320 • Published Aug 26, 2025 • 29
Kling-Avatar: Grounding Multimodal Instructions for Cascaded Long-Duration Avatar Animation Synthesis

Paper • 2509.09595 • Published Sep 11, 2025 • 48
Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length

Paper • 2512.04677 • Published Dec 4, 2025 • 176
PersonaLive! Expressive Portrait Image Animation for Live Streaming

Paper • 2512.11253 • Published Dec 12, 2025 • 39
KlingAvatar 2.0 Technical Report

Paper • 2512.13313 • Published Dec 15, 2025 • 44
Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation

Paper • 2601.00664 • Published Jan 2, 2026 • 57
DreamID-Omni: Unified Framework for Controllable Human-Centric Audio-Video Generation

Paper • 2602.12160 • Published Feb 12, 2026 • 38