# Speech Emotion Classification – ONNX (INT8)
ONNX INT8-quantized version of prithivMLmods/Speech-Emotion-Classification for on-device inference in macOS apps via ONNX Runtime C API.
## Model Details
- Architecture: Wav2Vec2ForSequenceClassification (facebook/wav2vec2-base-960h fine-tuned)
- Format: ONNX INT8 quantized
- Size: ~91 MB (INT8), ~361 MB (FP32)
- Input: Raw audio waveform (16 kHz, mono), shape `[1, num_samples]`
- Output: 8-class emotion logits
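A minimal Python sketch of the input/output contract above: the input tensor holds `duration × 16000` raw samples, and the 8 output logits can be turned into probabilities with a softmax. `input_shape` and `softmax` are illustrative helpers written for this card, not part of the model package.

```python
import math

SAMPLE_RATE = 16_000  # the model expects 16 kHz mono audio

def input_shape(duration_s: float) -> list[int]:
    """Shape of the raw-waveform input tensor, [1, num_samples]."""
    return [1, int(duration_s * SAMPLE_RATE)]

def softmax(logits: list[float]) -> list[float]:
    """Convert the 8 emotion logits into a probability distribution."""
    m = max(logits)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

print(input_shape(2.0))  # a 2-second clip -> [1, 32000]
demo_logits = [0.1, -1.2, 0.3, 0.0, 2.5, 1.1, -0.4, 0.2]
probs = softmax(demo_logits)
print(max(range(8), key=probs.__getitem__))  # 4, i.e. HAP
```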
## Emotion Labels
| ID | Label | Full name |
|---|---|---|
| 0 | ANG | Anger |
| 1 | CAL | Calm |
| 2 | DIS | Disgust |
| 3 | FEA | Fear |
| 4 | HAP | Happy |
| 5 | NEU | Neutral |
| 6 | SAD | Sad |
| 7 | SUR | Surprised |
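The table above can be carried in code as a simple lookup; `ID2LABEL` and `label_for` below are hypothetical helpers for mapping a predicted class ID back to its emotion name, not identifiers shipped with the model.

```python
# Class-ID -> (abbreviation, full name), mirroring the label table
ID2LABEL = {
    0: ("ANG", "Anger"),   1: ("CAL", "Calm"),
    2: ("DIS", "Disgust"), 3: ("FEA", "Fear"),
    4: ("HAP", "Happy"),   5: ("NEU", "Neutral"),
    6: ("SAD", "Sad"),     7: ("SUR", "Surprised"),
}

def label_for(class_id: int) -> str:
    """Human-readable label for a predicted class ID."""
    abbrev, full = ID2LABEL[class_id]
    return f"{abbrev} ({full})"

print(label_for(4))  # HAP (Happy)
```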
## Usage
On-device, real-time speech emotion classification. Inference runs through the ONNX Runtime C API.
```swift
// Swift – load and run via OnnxRuntimeWrapper
let wrapper = OnnxRuntimeWrapper()
try wrapper.load(modelPath: "model_int8.onnx")

// audioBuffer: [Float] of 16 kHz mono samples
let logits = try wrapper.run(inputName: "input_values",
                             inputData: audioBuffer,
                             inputShape: [1, Int64(audioBuffer.count)])

// Index of the highest logit is the predicted emotion class ID
let emotionIdx = logits.indices.max(by: { logits[$0] < logits[$1] })!
```
## Files
- `model_int8.onnx` – INT8 quantized model (recommended for on-device use)
- `model.onnx` – FP32 full-precision model
- `config.json` – model configuration with label mappings
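Hugging Face-style configs conventionally expose the label mapping as an `id2label` object whose keys are JSON strings. A hedged sketch of reading it, using an inline excerpt standing in for the real `config.json` (the actual file may contain additional fields):

```python
import json

# Hypothetical excerpt of config.json; the real file ships with the model.
CONFIG_JSON = '''
{"id2label": {"0": "ANG", "1": "CAL", "2": "DIS", "3": "FEA",
              "4": "HAP", "5": "NEU", "6": "SAD", "7": "SUR"}}
'''

config = json.loads(CONFIG_JSON)
# JSON object keys are strings; cast to int so an argmax can index directly.
id2label = {int(k): v for k, v in config["id2label"].items()}
print(id2label[4])  # HAP
```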
## Attribution
Original model by prithivMLmods. Converted to ONNX by onnx-community. INT8 quantization and packaging for macOS by @smkrv.