Bhili TTS
A text-to-speech model for Bhili (भीली), specifically the Dehvali Bhili dialect, an Indo-Aryan language spoken by the Bhil community in western India.
This model is a fine-tuned version of ai4bharat/indic-parler-tts, trained on 10 hours of studio-quality Bhili speech data.
The model accepts a Bhili sentence (in Devanagari script) and a natural-language voice description, and produces 44.1 kHz speech. Try out the model here!
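Because the prompt is expected in Devanagari, a small pre-check can catch Latin-script or mixed-script input before it reaches the model. The sketch below is illustrative only: the helper name `non_devanagari_letters` is hypothetical and not part of the model or of Parler-TTS, and it assumes that the Unicode Devanagari block (U+0900 to U+097F) covers the text you intend to synthesize.

```python
import unicodedata


def non_devanagari_letters(text: str) -> list[str]:
    """Return letters and combining marks in `text` outside the Devanagari block.

    Hypothetical helper: digits, spaces and punctuation are not checked.
    """
    offending = []
    for ch in text:
        # Unicode general categories starting with "L" are letters and
        # those starting with "M" are combining marks (vowel signs, virama).
        if unicodedata.category(ch)[0] in ("L", "M"):
            if not ("\u0900" <= ch <= "\u097F"):
                offending.append(ch)
    return offending


prompt = "अजैविक ताण व्यवस्थापन खातुर शेतकरी मल्चिंगु वापर केएते."
print(non_devanagari_letters(prompt))
print(non_devanagari_letters("mulching वापर"))
```

An empty list means every letter in the prompt is Devanagari. What to do with offending characters (reject, transliterate, or pass through) is left to the caller.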
Quick start
Install
pip install git+https://github.com/huggingface/parler-tts.git
pip install transformers soundfile
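After installing, it can help to confirm that the modules imported in the next step are discoverable. This is a minimal sketch using only the standard library. The helper name `missing_packages` is hypothetical, and the list of module names is taken from the imports in the generation example below (note that the `parler-tts` package is imported as `parler_tts`).

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names that cannot be found on this system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names, not pip names.
required = ["torch", "parler_tts", "transformers", "soundfile"]
missing = missing_packages(required)
print("missing:", missing if missing else "none")
```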
Generate speech
import torch
import soundfile as sf
from parler_tts import ParlerTTSForConditionalGeneration
from transformers import AutoTokenizer
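# Use the first GPU when one is available, otherwise fall back to CPU.
device = "cuda:0" if torch.cuda.is_available() else "cpu"
model = ParlerTTSForConditionalGeneration.from_pretrained("ai4bharat/bhili-tts").to(device)
# The Bhili prompt uses the model's own tokenizer; the voice description uses
# the tokenizer of the text encoder named in the model config.
tokenizer = AutoTokenizer.from_pretrained("ai4bharat/bhili-tts")
description_tokenizer = AutoTokenizer.from_pretrained(model.config.text_encoder._name_or_path)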
prompt = "अजैविक ताण व्यवस्थापन खातुर शेतकरी मल्चिंगु वापर केएते."
description = "A male speaker with a clear, moderate-paced voice. The recording is clean with no background noise."
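# Tokenize both inputs and keep the attention masks alongside the token IDs.
description_inputs = description_tokenizer(description, return_tensors="pt").to(device)
prompt_inputs = tokenizer(prompt, return_tensors="pt").to(device)
generation = model.generate(
    input_ids=description_inputs.input_ids, attention_mask=description_inputs.attention_mask,
    prompt_input_ids=prompt_inputs.input_ids, prompt_attention_mask=prompt_inputs.attention_mask,
)
# Move the waveform to CPU and write it at the model's sampling rate.
audio = generation.cpu().numpy().squeeze()
sf.write("bhili_out.wav", audio, model.config.sampling_rate)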
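The example above synthesizes a single sentence. For longer passages, one possible approach is to split the text into sentences, synthesize each one separately, and join the resulting waveforms with a short pause. The sketch below covers only the splitting and joining steps. The helper names, the set of sentence-ending marks, and the 0.25 s pause are assumptions for illustration, not part of the model or of Parler-TTS.

```python
import re

import numpy as np


def split_sentences(text: str) -> list[str]:
    """Split after a danda (।), double danda (॥), or . ? ! followed by whitespace."""
    parts = re.split(r"(?<=[।॥.?!])\s+", text.strip())
    return [part for part in parts if part]


def join_with_silence(chunks, sampling_rate: int, pause_s: float = 0.25):
    """Concatenate 1-D audio arrays, inserting `pause_s` seconds of silence between them."""
    if not chunks:
        return np.zeros(0, dtype=np.float32)
    gap = np.zeros(int(sampling_rate * pause_s), dtype=np.float32)
    pieces = []
    for i, chunk in enumerate(chunks):
        if i > 0:
            pieces.append(gap)
        pieces.append(np.asarray(chunk, dtype=np.float32))
    return np.concatenate(pieces)


sentence = "अजैविक ताण व्यवस्थापन खातुर शेतकरी मल्चिंगु वापर केएते."
print(split_sentences(sentence + " " + sentence))

# `synthesize` below is a hypothetical wrapper around the generate call shown
# above that returns a 1-D NumPy array for one sentence:
# audio = join_with_silence(
#     [synthesize(s) for s in split_sentences(long_text)],
#     model.config.sampling_rate,
# )
```

Reusing the same description for every sentence is the simplest choice here. Whether the voice stays consistent across separately generated sentences is something to verify by listening.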
Voice control via descriptions
You can steer the output voice by changing the description. The base model, ai4bharat/indic-parler-tts, understands attributes such as:
- Gender: male / female
- Pace: slow / moderate / fast
- Pitch: low / moderate / high
- Expressiveness: monotone / neutral / expressive
- Recording: clean studio / slight background noise / reverberant
Example descriptions:
# Calm, clear male narrator
"A male speaker with a moderate, calm voice and clean studio recording."
# Energetic female speaker
"A young female speaker with expressive, fast-paced delivery in a clean recording."
# Older male, slow and deliberate
"An older male speaker with a low pitch, speaking slowly and clearly."
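To vary these attributes programmatically, descriptions can be assembled from the values listed above. This is a minimal sketch: the function name `build_description` and the sentence template are assumptions made for illustration, and the model is not guaranteed to respond to this exact wording any better than to a hand-written description.

```python
# Attribute values copied from the list above.
ALLOWED = {
    "gender": ("male", "female"),
    "pace": ("slow", "moderate", "fast"),
    "pitch": ("low", "moderate", "high"),
    "expressiveness": ("monotone", "neutral", "expressive"),
}

# Hypothetical phrasing for each recording condition.
RECORDING = {
    "clean studio": "The recording is clean studio quality.",
    "slight background noise": "The recording has slight background noise.",
    "reverberant": "The recording is reverberant.",
}


def build_description(gender="male", pace="moderate", pitch="moderate",
                      expressiveness="neutral", recording="clean studio"):
    """Build a voice description string, rejecting values not listed above."""
    chosen = {"gender": gender, "pace": pace, "pitch": pitch,
              "expressiveness": expressiveness}
    for name, value in chosen.items():
        if value not in ALLOWED[name]:
            raise ValueError(f"{name} must be one of {ALLOWED[name]}, got {value!r}")
    if recording not in RECORDING:
        raise ValueError(f"recording must be one of {tuple(RECORDING)}, got {recording!r}")
    return (f"A {gender} speaker with a {pitch}-pitched, {expressiveness} voice, "
            f"speaking at a {pace} pace. {RECORDING[recording]}")


print(build_description(gender="female", pace="fast", expressiveness="expressive"))
```

The returned string can be passed as `description` in the generation example.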