Kinyarwanda Whisper Evaluation
This repository evaluates Whisper's performance on Kinyarwanda, as described in the paper "How much speech data is necessary for ASR in African languages? An evaluation of data scaling in Kinyarwanda and Kikuyu".
Model Description
The development of Automatic Speech Recognition (ASR) systems for low-resource African languages remains challenging due to limited transcribed speech data. This work addresses a basic question for practitioners: how much transcribed speech is needed to reach usable accuracy. It evaluates Whisper through a systematic data scaling analysis on Kinyarwanda, with training sets ranging from 1 to 1,400 hours. Practical ASR performance (WER < 13%) becomes achievable with as little as 50 hours of training data, and substantial improvements continue through 200 hours (WER < 10%).
For more details on the evaluation, training, and related models, visit the GitHub repository.
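The thresholds quoted above can be checked against the reported numbers. The sketch below is a minimal illustration, not code from the repository: it copies the WER values from the Evaluation Results table in this README, and `WER_BY_HOURS` and `min_hours_for_wer` are hypothetical names introduced here.

```python
from typing import Optional

# WER (%) by training hours, copied from the Evaluation Results table in this README.
WER_BY_HOURS = {
    0: 33.10,
    1: 47.63,
    50: 12.51,
    100: 10.90,
    150: 10.21,
    200: 9.82,
    500: 8.24,
    1000: 7.65,
    1400: 7.14,  # the "full" set, listed as ~1400 hours
}


def min_hours_for_wer(target_wer: float) -> Optional[int]:
    """Return the smallest evaluated training size whose WER is below target_wer.

    Returns None if no evaluated configuration reaches the target.
    """
    for hours in sorted(WER_BY_HOURS):
        if WER_BY_HOURS[hours] < target_wer:
            return hours
    return None


print(min_hours_for_wer(13.0))  # 50
print(min_hours_for_wer(10.0))  # 200
```

Only the evaluated training sizes are considered, so the result is an upper bound on the data actually needed: the true crossover point may lie between two evaluated sizes.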
Training Configs
The evaluation used the following models, each trained on a different volume of data. Explore the full collection: 👉 https://huggingface.co/collections/Sunbird/kinyarwanda-hackathon-68872541c41c5d166d9bffad
| Config | Hours | Model ID on Hugging Face |
|---|---|---|
| baseline.yaml | 0 | openai/whisper-large-v3 |
| train_1h.yaml | 1 | akera/whisper-large-v3-kin-1h-v2 |
| train_50h.yaml | 50 | akera/whisper-large-v3-kin-50h-v2 |
| train_100h.yaml | 100 | akera/whisper-large-v3-kin-100h-v2 |
| train_150h.yaml | 150 | akera/whisper-large-v3-kin-150h-v2 |
| train_200h.yaml | 200 | akera/whisper-large-v3-kin-200h-v2 |
| train_500h.yaml | 500 | akera/whisper-large-v3-kin-500h-v2 |
| train_1000h.yaml | 1000 | akera/whisper-large-v3-kin-1000h-v2 |
| train_full.yaml | ~1400 | akera/whisper-large-v3-kin-full |
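Any of the checkpoints listed above can presumably be loaded through the standard `transformers` ASR pipeline. The following is a minimal sketch under that assumption, not the repository's own loading code: `MODEL_BY_HOURS`, `model_id_for_hours`, and `load_asr` are hypothetical helper names, and `sample.wav` is a placeholder path.

```python
# Model IDs copied from the Training Configs table; the key is training hours.
MODEL_BY_HOURS = {
    0: "openai/whisper-large-v3",  # baseline, no Kinyarwanda fine-tuning
    1: "akera/whisper-large-v3-kin-1h-v2",
    50: "akera/whisper-large-v3-kin-50h-v2",
    100: "akera/whisper-large-v3-kin-100h-v2",
    150: "akera/whisper-large-v3-kin-150h-v2",
    200: "akera/whisper-large-v3-kin-200h-v2",
    500: "akera/whisper-large-v3-kin-500h-v2",
    1000: "akera/whisper-large-v3-kin-1000h-v2",
    1400: "akera/whisper-large-v3-kin-full",  # full set, ~1400 hours
}


def model_id_for_hours(hours: int) -> str:
    """Look up the Hugging Face model ID trained on the given number of hours."""
    if hours not in MODEL_BY_HOURS:
        valid = ", ".join(str(h) for h in sorted(MODEL_BY_HOURS))
        raise ValueError(f"No model for {hours} hours; choose one of: {valid}")
    return MODEL_BY_HOURS[hours]


def load_asr(hours: int):
    """Build a transformers ASR pipeline for the chosen checkpoint.

    transformers is imported lazily so the lookup helper above works
    even when the library is not installed.
    """
    from transformers import pipeline

    return pipeline("automatic-speech-recognition", model=model_id_for_hours(hours))


# Example usage (downloads the model weights on first call):
# asr = load_asr(50)
# print(asr("sample.wav")["text"])  # "sample.wav" is a placeholder audio file
```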
Evaluation Results
Evaluation on the dev_test[:300] subset (the first 300 examples of the dev_test split):
| Model | Hours | WER (%) | CER (%) | Score |
|---|---|---|---|---|
| openai/whisper-large-v3 | 0 | 33.10 | 9.80 | 0.861 |
| akera/whisper-large-v3-kin-1h-v2 | 1 | 47.63 | 16.97 | 0.754 |
| akera/whisper-large-v3-kin-50h-v2 | 50 | 12.51 | 3.31 | 0.932 |
| akera/whisper-large-v3-kin-100h-v2 | 100 | 10.90 | 2.84 | 0.943 |
| akera/whisper-large-v3-kin-150h-v2 | 150 | 10.21 | 2.64 | 0.948 |
| akera/whisper-large-v3-kin-200h-v2 | 200 | 9.82 | 2.56 | 0.951 |
| akera/whisper-large-v3-kin-500h-v2 | 500 | 8.24 | 2.15 | 0.963 |
| akera/whisper-large-v3-kin-1000h-v2 | 1000 | 7.65 | 1.98 | 0.967 |
| akera/whisper-large-v3-kin-full | ~1400 | 7.14 | 1.88 | 0.970 |
Score = 1 - (0.6 × CER + 0.4 × WER), with CER and WER taken as fractions rather than percentages.
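To make the metrics concrete, here is a minimal pure-Python sketch of WER, CER, and the combined score. It is an illustration under stated assumptions, not the repository's evaluation code: it pools edits over all utterances, applies no text normalization, and uses made-up example strings. All function names are hypothetical.

```python
def edit_distance(ref, hyp) -> int:
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def error_rate(refs, hyps, tokenize) -> float:
    """Total edits divided by total reference tokens, pooled over all utterances.

    Assumes at least one non-empty reference; otherwise the division fails.
    """
    edits = sum(edit_distance(tokenize(r), tokenize(h)) for r, h in zip(refs, hyps))
    total = sum(len(tokenize(r)) for r in refs)
    return edits / total


def wer(refs, hyps) -> float:
    """Word error rate as a fraction, splitting on whitespace."""
    return error_rate(refs, hyps, str.split)


def cer(refs, hyps) -> float:
    """Character error rate as a fraction, counting spaces as characters."""
    return error_rate(refs, hyps, list)


def score(wer_value: float, cer_value: float) -> float:
    """Combined score from the formula above; inputs are fractions, not percentages."""
    return 1.0 - (0.6 * cer_value + 0.4 * wer_value)


# Made-up example: one wrong character in the second word.
refs = ["muraho neza"]
hyps = ["muraho nezo"]
w, c = wer(refs, hyps), cer(refs, hyps)
print(f"WER={w:.3f} CER={c:.3f} Score={score(w, c):.3f}")
```

Plugging the table's aggregate WER and CER into this formula gives values near, but not always equal to, the Score column (about 0.930 for the 50-hour row versus the listed 0.932). The published scores may therefore have been aggregated differently, for example per utterance; that is a guess, not something the source states.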