Speech Emotion Classification β€” ONNX (INT8)

An ONNX INT8-quantized version of prithivMLmods/Speech-Emotion-Classification for on-device inference in macOS apps via the ONNX Runtime C API.

Model Details

  • Architecture: Wav2Vec2ForSequenceClassification (facebook/wav2vec2-base-960h fine-tuned)
  • Format: ONNX INT8 quantized
  • Size: ~91 MB (INT8), ~361 MB (FP32)
  • Input: Raw audio waveform (16 kHz, mono), shape [1, num_samples]
  • Output: 8-class emotion logits

Emotion Labels

ID  Code  Emotion
0   ANG   Anger
1   CAL   Calm
2   DIS   Disgust
3   FEA   Fear
4   HAP   Happy
5   NEU   Neutral
6   SAD   Sad
7   SUR   Surprised

Usage

The model targets on-device, real-time speech emotion classification, with inference performed through the ONNX Runtime C API.

// Swift β€” load and run via OnnxRuntimeWrapper
let wrapper = OnnxRuntimeWrapper()
try wrapper.load(modelPath: "model_int8.onnx")

// audioBuffer: [Float] of 16 kHz mono samples
let logits = try wrapper.run(
    inputName: "input_values",
    inputData: audioBuffer,
    inputShape: [1, Int64(audioBuffer.count)]
)

// Argmax over the 8 emotion logits
let emotionIdx = logits.indices.max(by: { logits[$0] < logits[$1] })!

Files

  • model_int8.onnx β€” INT8 quantized model (recommended for on-device use)
  • model.onnx β€” FP32 full precision model
  • config.json β€” Model configuration with label mappings
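For reference, a plausible shape of the label mapping inside config.json, inferred from the label table above (the field names follow the Transformers `id2label`/`label2id` convention; verify against the shipped file):

```json
{
  "id2label": {
    "0": "ANG", "1": "CAL", "2": "DIS", "3": "FEA",
    "4": "HAP", "5": "NEU", "6": "SAD", "7": "SUR"
  },
  "label2id": {
    "ANG": 0, "CAL": 1, "DIS": 2, "FEA": 3,
    "HAP": 4, "NEU": 5, "SAD": 6, "SUR": 7
  }
}
```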

Attribution

Original model by prithivMLmods. Converted to ONNX by onnx-community. INT8 quantization and packaging for macOS by @smkrv.

