# VibeVoice F16 Model (Incremental Conversion)
This model has been converted to float16 (f16) precision using incremental processing to avoid RAM exhaustion.
## Conversion Details
- Original model: sheliak/VibeVoice-Large_Mirror
- Conversion method: Incremental shard-by-shard processing
- Mixed precision: True
- Total parameters: 9,343,355,363
- F16 parameters: 8,073,169,443 (86.4%)
- F32 parameters: 0 (0.0%)
- Model size: 15.04 GB
- Memory savings: ~0.0%
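The dtype breakdown above can be spot-checked directly from the checkpoint. Below is a minimal sketch that tallies parameters per dtype using only safetensors header metadata, so no tensor data is loaded; the local path is an assumption matching the usage example further down.

```python
import glob
import math
from collections import Counter

from safetensors import safe_open

counts = Counter()
for shard in glob.glob("./VibeVoice-Large-f16/*.safetensors"):
    with safe_open(shard, framework="pt") as f:
        for name in f.keys():
            s = f.get_slice(name)  # header metadata only; no tensor data is read
            counts[s.get_dtype()] += math.prod(s.get_shape())

total = sum(counts.values())
for dtype, n in counts.most_common():
    print(f"{dtype}: {n:,} parameters ({n / total:.1%})")
```

At 2 bytes per f16 parameter, the 8,073,169,443 f16 parameters roughly account for the reported ~15.04 GB.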
## Usage
```python
import torch

from vibevoice.modular.modeling_vibevoice_inference import VibeVoiceForConditionalGenerationInference
from vibevoice.processor.vibevoice_processor import VibeVoiceProcessor

# Load the converted checkpoint with f16 precision
model = VibeVoiceForConditionalGenerationInference.from_pretrained(
    "./VibeVoice-Large-f16",
    torch_dtype=torch.float16,
    device_map="cpu",
)
processor = VibeVoiceProcessor.from_pretrained("./VibeVoice-Large-f16")
```
Alternatively, pass the `--use_f16` flag to the demo scripts:

```bash
python demo/inference_from_file.py --model_path ./VibeVoice-Large-f16 --use_f16 --device cpu
```
## RAM-Friendly Conversion
This model was converted using incremental processing, making it possible to convert large models on systems with limited RAM.
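For illustration, here is a minimal sketch of a shard-by-shard float16 conversion, assuming a safetensors-sharded source checkpoint; only one shard is held in RAM at a time. The directory names and the blanket cast of all floating-point tensors are assumptions, not the exact script used for this model.

```python
import glob
import os

from safetensors.torch import load_file, save_file

src_dir = "./VibeVoice-Large"      # hypothetical source checkpoint
dst_dir = "./VibeVoice-Large-f16"
os.makedirs(dst_dir, exist_ok=True)

for shard in sorted(glob.glob(os.path.join(src_dir, "*.safetensors"))):
    tensors = load_file(shard)  # loads only this shard into RAM
    converted = {
        name: t.half() if t.is_floating_point() else t  # cast floats to f16, keep int tensors
        for name, t in tensors.items()
    }
    save_file(converted, os.path.join(dst_dir, os.path.basename(shard)))
    del tensors, converted  # free the shard before loading the next one
```

Copying the config files and the `model.safetensors.index.json` shard index is omitted here for brevity.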