
VibeVoice F16 Model (Incremental Conversion)

This model has been converted to float16 (f16) precision using incremental processing to avoid RAM exhaustion.

Conversion Details

  • Original model: sheliak/VibeVoice-Large_Mirror
  • Conversion method: Incremental shard-by-shard processing
  • Mixed precision: yes (not every tensor was cast to f16)
  • Total parameters: 9,343,355,363
  • F16 parameters: 8,073,169,443 (86.4%; see the verification snippet below this list)
  • F32 parameters: 0 (0.0%)
  • Remaining parameters (~13.6%): kept in bfloat16
  • Model size: 15.04 GB
  • Memory savings: ~0.0% (the source checkpoint was already 16-bit, so the cast changes dtype rather than size)
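
You can re-derive this dtype breakdown yourself. The snippet below is a small sketch that tallies parameter counts per dtype across the converted safetensors shards; it assumes the shards sit in ./VibeVoice-Large-f16 and loads only one tensor at a time, so it is itself RAM-friendly.

import glob
from collections import Counter
from safetensors import safe_open

counts = Counter()
for shard in glob.glob("./VibeVoice-Large-f16/*.safetensors"):
    with safe_open(shard, framework="pt") as f:
        for name in f.keys():
            t = f.get_tensor(name)  # one tensor in memory at a time
            counts[str(t.dtype)] += t.numel()

total = sum(counts.values())
for dtype, n in sorted(counts.items()):
    print(f"{dtype}: {n:,} ({n / total:.1%})")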

Usage

import torch
from vibevoice.modular.modeling_vibevoice_inference import VibeVoiceForConditionalGenerationInference
from vibevoice.processor.vibevoice_processor import VibeVoiceProcessor

# Load with f16 precision
model = VibeVoiceForConditionalGenerationInference.from_pretrained(
    "./VibeVoice-Large-f16",
    torch_dtype=torch.float16,
    device_map="cpu"
)

processor = VibeVoiceProcessor.from_pretrained("./VibeVoice-Large-f16")

Alternatively, pass the --use_f16 flag to the demo scripts:

python demo/inference_from_file.py --model_path ./VibeVoice-Large-f16 --use_f16 --device cpu
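
For end-to-end synthesis in Python, the pattern below mirrors the upstream demo scripts (demo/inference_from_file.py). Treat the argument names (voice_samples, cfg_scale, speech_outputs) and the example voice path as assumptions borrowed from those demos rather than a guaranteed API; check the demo code if a call fails.

# Continuing from the snippet above: model and processor are already loaded.
# The voice path is a placeholder -- point it at any local reference sample.
inputs = processor(
    text=["Speaker 1: Hello from the f16 conversion."],
    voice_samples=[["demo/voices/en-Alice_woman.wav"]],
    padding=True,
    return_tensors="pt",
)

outputs = model.generate(
    **inputs,
    tokenizer=processor.tokenizer,  # the demos pass the text tokenizer in
    cfg_scale=1.3,                  # classifier-free guidance, demo default
)

# speech_outputs holds the generated waveform(s)
processor.save_audio(outputs.speech_outputs[0], output_path="output.wav")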

RAM-Friendly Conversion

This model was converted shard by shard: each safetensors shard is loaded, cast, written out, and released before the next one, so peak memory stays near the size of a single shard rather than the full checkpoint. That makes it possible to convert large models on systems with limited RAM. The sketch below illustrates the idea.
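
A minimal sketch of that loop, assuming safetensors shards and the load_file/save_file helpers from safetensors.torch. The source directory name is a placeholder, and the published conversion additionally kept ~13.6% of parameters in bfloat16 (mixed precision), which this simplified version does not reproduce.

import glob
import os
from safetensors.torch import load_file, save_file

src_dir = "./VibeVoice-Large_Mirror"  # assumed local copy of the source repo
dst_dir = "./VibeVoice-Large-f16"
os.makedirs(dst_dir, exist_ok=True)

for shard in sorted(glob.glob(os.path.join(src_dir, "*.safetensors"))):
    tensors = load_file(shard)  # only this shard is resident in RAM
    converted = {
        # cast floating-point tensors to f16; leave integer tensors untouched
        name: t.half() if t.is_floating_point() else t
        for name, t in tensors.items()
    }
    save_file(converted, os.path.join(dst_dir, os.path.basename(shard)))
    del tensors, converted  # free the shard before loading the next one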
