Kinyarwanda Whisper Evaluation

This repository evaluates Whisper model performance on Kinyarwanda, as described in the paper "How much speech data is necessary for ASR in African languages? An evaluation of data scaling in Kinyarwanda and Kikuyu".

Model Description

The development of Automatic Speech Recognition (ASR) systems for low-resource African languages remains challenging due to limited transcribed speech data. This work addresses a fundamental question for practitioners: how much training data is needed for usable ASR? Through systematic data-scaling experiments on Kinyarwanda, with training sets ranging from 1 to roughly 1,400 hours, it shows that practical ASR performance (WER < 13%) is achievable with as little as 50 hours of training data, with substantial improvements continuing through 200 hours (WER < 10%).

For more details on the evaluation, training, and related models, visit the GitHub repository.

Training Configs

The following models, trained on increasing data volumes, were used in the Kinyarwanda Whisper evaluation. Explore the full collection: 👉 https://huggingface.co/collections/Sunbird/kinyarwanda-hackathon-68872541c41c5d166d9bffad

| Config             | Hours | Model ID on Hugging Face               |
|--------------------|-------|----------------------------------------|
| `baseline.yaml`    | 0     | `openai/whisper-large-v3`              |
| `train_1h.yaml`    | 1     | `akera/whisper-large-v3-kin-1h-v2`     |
| `train_50h.yaml`   | 50    | `akera/whisper-large-v3-kin-50h-v2`    |
| `train_100h.yaml`  | 100   | `akera/whisper-large-v3-kin-100h-v2`   |
| `train_150h.yaml`  | 150   | `akera/whisper-large-v3-kin-150h-v2`   |
| `train_200h.yaml`  | 200   | `akera/whisper-large-v3-kin-200h-v2`   |
| `train_500h.yaml`  | 500   | `akera/whisper-large-v3-kin-500h-v2`   |
| `train_1000h.yaml` | 1000  | `akera/whisper-large-v3-kin-1000h-v2`  |
| `train_full.yaml`  | ~1400 | `akera/whisper-large-v3-kin-full`      |
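Any of the checkpoints above can be loaded for inference with the `transformers` ASR pipeline; a minimal sketch (the checkpoint choice and the audio path `sample.wav` are illustrative, not prescribed by this repository):

```python
from functools import lru_cache

# 200 h checkpoint from the table above; any of the listed model IDs works.
MODEL_ID = "akera/whisper-large-v3-kin-200h-v2"

@lru_cache(maxsize=1)
def get_asr():
    # Deferred import and model download: the pipeline (and its ~2B-parameter
    # weights) is only fetched on first use.
    from transformers import pipeline
    return pipeline("automatic-speech-recognition", model=MODEL_ID, chunk_length_s=30)

# Usage (downloads the model weights on first call):
# print(get_asr()("sample.wav")["text"])
```

The `lru_cache` guard keeps the pipeline as a lazily-built singleton, so repeated calls reuse the loaded model.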

Evaluation Results

Evaluation on the `dev_test[:300]` subset:

| Model                                 | Hours | WER (%) | CER (%) | Score |
|---------------------------------------|-------|---------|---------|-------|
| `openai/whisper-large-v3`             | 0     | 33.10   | 9.80    | 0.861 |
| `akera/whisper-large-v3-kin-1h-v2`    | 1     | 47.63   | 16.97   | 0.754 |
| `akera/whisper-large-v3-kin-50h-v2`   | 50    | 12.51   | 3.31    | 0.932 |
| `akera/whisper-large-v3-kin-100h-v2`  | 100   | 10.90   | 2.84    | 0.943 |
| `akera/whisper-large-v3-kin-150h-v2`  | 150   | 10.21   | 2.64    | 0.948 |
| `akera/whisper-large-v3-kin-200h-v2`  | 200   | 9.82    | 2.56    | 0.951 |
| `akera/whisper-large-v3-kin-500h-v2`  | 500   | 8.24    | 2.15    | 0.963 |
| `akera/whisper-large-v3-kin-1000h-v2` | 1000  | 7.65    | 1.98    | 0.967 |
| `akera/whisper-large-v3-kin-full`     | ~1400 | 7.14    | 1.88    | 0.970 |

Score = 1 - (0.6 × CER + 0.4 × WER)
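The composite score is a weighted combination of the two error rates; a minimal sketch of the formula (the `score` helper is illustrative, not the repository's evaluation code), with WER and CER expressed as fractions rather than percentages:

```python
def score(wer: float, cer: float) -> float:
    """Composite score: 1 - (0.6 * CER + 0.4 * WER), rates given as fractions."""
    return 1.0 - (0.6 * cer + 0.4 * wer)

# Example: WER = 10%, CER = 2%  ->  1 - (0.6*0.02 + 0.4*0.10) = 0.948
print(round(score(0.10, 0.02), 3))  # prints 0.948
```

The weighting favors CER, so character-level accuracy contributes more to the score than word-level accuracy.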

