MuseTalk V1.5 Installer

This repository contains an automated installation script for MuseTalk V1.5, a real-time lip-sync model from Tencent.

Features

  • Automated setup: Installs Miniconda and creates a Python 3.10 environment
  • CUDA 12.x support: PyTorch 2.1.0 with CUDA 12.1
  • All dependencies: OpenMMLab stack (mmcv, mmdet, mmpose)
  • Model downloads: DWPose, SD-VAE, Whisper models
  • Version compatibility: Pins library versions that were tested together

Tested Environment

  • GPU: NVIDIA RTX 4090 (24GB VRAM)
  • CUDA: 12.x
  • OS: Ubuntu 22.04
  • Python: 3.10 (via Miniconda)
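
Before running the installer, it can help to confirm that the basic tools are present on the machine. The following is a minimal sketch, not part of the installer: the `preflight` helper is hypothetical, and the tool list (wget, sudo, nvidia-smi) is an assumption based on the Quick Start commands and the GPU requirement.

```shell
#!/usr/bin/env bash
# Hypothetical preflight helper (not part of install_musetalk.sh).
# Prints one line per tool and returns non-zero if any tool is missing.
preflight() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Assumed tool list: wget and sudo for Quick Start, nvidia-smi for the GPU driver.
preflight wget sudo nvidia-smi || echo "Some tools are missing; see the lines above."
```

If `nvidia-smi` is reported as missing, the NVIDIA driver is probably not installed or not on the PATH.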

Quick Start

# Download the installer
wget https://huggingface.co/dumont/musetalk-2024-12-24/raw/main/install_musetalk.sh

# Make it executable
chmod +x install_musetalk.sh

# Run as root (required for system packages)
sudo ./install_musetalk.sh

What the script installs

  1. System packages: wget, git, ffmpeg, espeak-ng
  2. Miniconda: Python 3.10 environment
  3. PyTorch: 2.1.0 with CUDA 12.1
  4. OpenMMLab: mmengine 0.10.4, mmcv 2.1.0, mmdet 3.3.0, mmpose 1.3.1
  5. Compatible versions:
    • numpy==1.26.4
    • transformers==4.42.0
    • diffusers==0.28.0
    • huggingface_hub==0.23.2
    • opencv-python==4.8.0.74
  6. Models:
    • DWPose (dw-ll_ucoco_384.pth)
    • SD-VAE-FT-MSE
    • OpenAI Whisper Tiny
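
To confirm that the pinned versions above are the ones actually installed, one option is to compare them against `pip freeze` output. This is a sketch under stated assumptions: `check_pins` and the file name `musetalk_pins.txt` are hypothetical, and only the five pins from item 5 are checked (PyTorch is left out because CUDA builds usually report a local version suffix). Names are lowercased and underscores are mapped to hyphens on both sides, since `pip freeze` may print `huggingface-hub` rather than `huggingface_hub`.

```shell
#!/usr/bin/env bash
# Write the pins from item 5 of the list above to a file (hypothetical file name).
cat > musetalk_pins.txt << 'EOF'
numpy==1.26.4
transformers==4.42.0
diffusers==0.28.0
huggingface_hub==0.23.2
opencv-python==4.8.0.74
EOF

# Lowercase and map "_" to "-" so that name spelling differences do not matter.
normalize() { tr 'A-Z_' 'a-z-'; }

# Hypothetical helper: reads `pip freeze` output on stdin and reports each pin.
# Returns non-zero if any pin is missing or installed at a different version.
check_pins() {
  pins_file=$1
  installed=$(normalize)   # consume stdin before the loop redirects it
  rc=0
  while IFS= read -r pin; do
    [ -z "$pin" ] && continue
    wanted=$(printf '%s' "$pin" | normalize)
    if printf '%s\n' "$installed" | grep -qxF -- "$wanted"; then
      echo "ok: $pin"
    else
      echo "differs or missing: $pin"
      rc=1
    fi
  done < "$pins_file"
  return "$rc"
}

# Usage, inside the activated musetalk environment:
#   pip freeze | check_pins musetalk_pins.txt
```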

Usage

After installation:

# Activate environment
source /opt/miniconda/bin/activate musetalk

# Go to MuseTalk directory
cd /workspace/MuseTalk

# Create a config file
cat > configs/inference/my_test.yaml << EOF
task_0:
  video_path: "data/video/sun.mp4"
  audio_path: "path/to/your_audio.wav"
EOF

# Run inference (V1.5)
PYTHONPATH=. python3 scripts/inference.py --version v15 --inference_config configs/inference/my_test.yaml

# Run inference (V1.0 original)
PYTHONPATH=. python3 scripts/inference.py --version v1 --inference_config configs/inference/my_test.yaml

Output videos are saved in ./results/v15/ or ./results/v1/, depending on the --version value.
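
When testing several video and audio pairs, writing the config by hand gets repetitive. The sketch below wraps the same heredoc in a hypothetical `write_config` helper that also checks that both input files exist before writing anything. It only emits the two keys shown in the config above; any other keys MuseTalk may accept are not covered.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a one-task inference config in the shape used above.
# Arguments: <video file> <audio file> <output yaml>
write_config() {
  video=$1
  audio=$2
  out=$3
  # Refuse to write a config that points at files that do not exist.
  for f in "$video" "$audio"; do
    if [ ! -f "$f" ]; then
      echo "not found: $f" >&2
      return 1
    fi
  done
  cat > "$out" << EOF
task_0:
  video_path: "$video"
  audio_path: "$audio"
EOF
}

# Usage, from the MuseTalk directory:
#   write_config data/video/sun.mp4 path/to/your_audio.wav configs/inference/my_test.yaml
```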

Notes

  • The MuseTalk model weights (~3.4GB) are downloaded automatically via git-lfs from the official repository
  • For custom avatars, prepare a video file with a clear face view
  • Audio should be in WAV format (mono recommended)
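
Audio in another format can be converted with ffmpeg, which the installer adds as a system package. The following is a sketch: `to_mono_wav` is a hypothetical helper, and the 16 kHz sample rate is an assumption, chosen because Whisper models operate on 16 kHz audio. MuseTalk may resample on its own, so the `-ar` option may be unnecessary.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert any ffmpeg-readable audio file to mono WAV.
# Arguments: <input file> [output file]
# The default output name is the input name with its extension replaced by _mono.wav.
to_mono_wav() {
  src=$1
  dst=${2:-"${src%.*}_mono.wav"}
  # -y  overwrite the output file if it exists
  # -ac 1      downmix to a single channel (mono)
  # -ar 16000  resample to 16 kHz (assumption, see above)
  ffmpeg -y -i "$src" -ac 1 -ar 16000 "$dst"
}

# Usage:
#   to_mono_wav speech.mp3            # writes speech_mono.wav
#   to_mono_wav speech.mp3 out.wav    # writes out.wav
```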

License

This installer script is provided as-is. MuseTalk itself remains under the license of the original TMElyralab repository.

Credits

  • MuseTalk by TMElyralab/Tencent
  • Installer script by Dumont AI Team