MuseTalk V1.5 Installer
This repository contains an automated installation script for MuseTalk V1.5, a real-time lip-sync model from Tencent.
Features
- Automated setup: Installs Miniconda and creates a Python 3.10 environment
- CUDA 12.x support: PyTorch 2.1.0 with CUDA 12.1
- All dependencies: OpenMMLab stack (mmcv, mmdet, mmpose)
- Model downloads: DWPose, SD-VAE, Whisper models
- Version compatibility: Pins library versions that were tested together
Tested Environment
- GPU: NVIDIA RTX 4090 (24GB VRAM)
- CUDA: 12.x
- OS: Ubuntu 22.04
- Python: 3.10 (via Miniconda)
Quick Start
# Download the installer
wget https://huggingface.co/dumont/musetalk-2024-12-24/raw/main/install_musetalk.sh
# Make it executable
chmod +x install_musetalk.sh
# Run as root (required for system packages)
sudo ./install_musetalk.sh
What the script installs
- System packages: wget, git, ffmpeg, espeak-ng
- Miniconda: Python 3.10 environment
- PyTorch: 2.1.0 with CUDA 12.1
- OpenMMLab: mmengine 0.10.4, mmcv 2.1.0, mmdet 3.3.0, mmpose 1.3.1
- Compatible versions:
  - numpy==1.26.4
  - transformers==4.42.0
  - diffusers==0.28.0
  - huggingface_hub==0.23.2
  - opencv-python==4.8.0.74
- Models:
  - DWPose (dw-ll_ucoco_384.pth)
  - SD-VAE-FT-MSE
  - OpenAI Whisper Tiny
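To confirm that an environment still matches the pinned versions listed above, a check along these lines may help. It is a minimal sketch, not part of the installer: `installed_version` and `check_pins` are hypothetical helper names, and the sketch assumes `pip` is available in the active environment.

```shell
# Pinned versions copied from the "Compatible versions" list.
PINNED="numpy==1.26.4
transformers==4.42.0
diffusers==0.28.0
huggingface_hub==0.23.2
opencv-python==4.8.0.74"

# Hypothetical helper: print the installed version of a package,
# or nothing if pip does not know the package.
installed_version() {
  python3 -m pip show "$1" 2>/dev/null | awk '/^Version:/ {print $2}'
}

# Hypothetical helper: read "name==version" lines on stdin and report
# each one. Returns non-zero if any package is missing or differs.
check_pins() {
  rc=0
  while IFS= read -r line; do
    [ -z "$line" ] && continue
    name="${line%%==*}"   # text before "=="
    want="${line##*==}"   # text after "=="
    have="$(installed_version "$name")"
    if [ "$have" = "$want" ]; then
      echo "ok: $name $have"
    else
      echo "MISMATCH: $name want $want, found ${have:-none}"
      rc=1
    fi
  done
  return "$rc"
}
```

Inside the activated musetalk environment, run it with `printf '%s\n' "$PINNED" | check_pins`. A mismatch does not necessarily mean the install is broken, only that the environment has drifted from the versions this installer was tested with.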
Usage
After installation:
# Activate environment
source /opt/miniconda/bin/activate musetalk
# Go to MuseTalk directory
cd /workspace/MuseTalk
# Create a config file
cat > configs/inference/my_test.yaml << EOF
task_0:
  video_path: "data/video/sun.mp4"
  audio_path: "path/to/your_audio.wav"
EOF
# Run inference (V1.5)
PYTHONPATH=. python3 scripts/inference.py --version v15 --inference_config configs/inference/my_test.yaml
# Run inference (V1.0 original)
PYTHONPATH=. python3 scripts/inference.py --version v1 --inference_config configs/inference/my_test.yaml
Output videos are saved in ./results/v15/ (V1.5) or ./results/v1/ (V1.0).
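When several video and audio pairs need their own config files, the heredoc step can be wrapped in a small function. The sketch below assumes the same single-task layout as the example config (`task_0` with `video_path` and `audio_path` nested under it); `write_task_config` is a hypothetical helper, not something MuseTalk or the installer provides.

```shell
# Hypothetical helper, not part of MuseTalk.
# Usage: write_task_config OUT_YAML VIDEO AUDIO
write_task_config() {
  if [ "$#" -ne 3 ]; then
    echo "usage: write_task_config OUT_YAML VIDEO AUDIO" >&2
    return 2
  fi
  # Create the target directory if it does not exist yet.
  mkdir -p "$(dirname "$1")"
  # The two-space indentation nests both keys under task_0.
  # Paths containing double quotes are not escaped by this sketch.
  printf 'task_0:\n  video_path: "%s"\n  audio_path: "%s"\n' "$2" "$3" > "$1"
}
```

For example, `write_task_config configs/inference/my_test.yaml data/video/sun.mp4 path/to/your_audio.wav` writes the same file as the heredoc in the steps above.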
Notes
- The MuseTalk model weights (~3.4GB) are downloaded automatically via git-lfs from the official repository
- For custom avatars, prepare a video file with a clear face view
- Audio should be WAV format (mono recommended)
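If the source audio is in another format or is stereo, ffmpeg (installed by the script) can convert it to a mono WAV file. The following is a minimal sketch: `to_mono_wav` and the `FFMPEG` override variable are hypothetical names introduced here, and no sample rate is set because this document does not specify one.

```shell
# Hypothetical helper. Usage: to_mono_wav INPUT OUTPUT.wav
to_mono_wav() {
  if [ "$#" -ne 2 ]; then
    echo "usage: to_mono_wav INPUT OUTPUT.wav" >&2
    return 2
  fi
  # -y     overwrite the output file without prompting
  # -ac 1  downmix to a single (mono) channel
  # The .wav extension makes ffmpeg choose the WAV container.
  # FFMPEG can be set to another binary, or to "echo" for a dry run.
  "${FFMPEG:-ffmpeg}" -y -i "$1" -ac 1 "$2"
}
```

Running `FFMPEG=echo to_mono_wav speech.mp3 speech.wav` prints the ffmpeg arguments without converting anything, which is a cheap way to check the command before touching real files.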
License
This installer script is provided as-is. MuseTalk is licensed under the original TMElyralab license.
Credits
- MuseTalk by TMElyralab/Tencent
- Installer script by Dumont AI Team