Segformer-Base: Optimized for Qualcomm Devices

Segformer-Base is a semantic segmentation model that predicts masks and classes of objects in an image.

This model is based on the implementation of Segformer-Base found here. This repository contains pre-exported model files optimized for Qualcomm® devices. You can use the Qualcomm® AI Hub Models library to export with custom configurations. More details on model performance across various devices can be found here.

Qualcomm AI Hub Models uses Qualcomm AI Hub Workbench to compile, profile, and evaluate this model. Sign up to run these models on a hosted Qualcomm® device.

Getting Started

There are two ways to deploy this model on your device:

Option 1: Download Pre-Exported Models

Below are pre-exported model assets ready for deployment.

Runtime | Precision | Chipset | SDK Versions | Download
ONNX | float | Universal | QAIRT 2.42, ONNX Runtime 1.25.0 | Download
ONNX | w8a16 | Universal | QAIRT 2.42, ONNX Runtime 1.25.0 | Download
ONNX | w8a8 | Universal | QAIRT 2.42, ONNX Runtime 1.25.0 | Download
QNN_DLC | float | Universal | QAIRT 2.45 | Download
QNN_DLC | w8a16 | Universal | QAIRT 2.45 | Download
QNN_DLC | w8a8 | Universal | QAIRT 2.45 | Download
TFLITE | float | Universal | QAIRT 2.45 | Download
TFLITE | w8a8 | Universal | QAIRT 2.45 | Download

For more device-specific assets and performance metrics, visit Segformer-Base on Qualcomm® AI Hub.
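As a rough illustration of how a downloaded ONNX asset might be used, the sketch below prepares a 512x512 input, runs it through ONNX Runtime, and converts per-class scores into a class mask. Several details are assumptions rather than facts from this card: the NCHW float32 layout, the [0, 1] pixel scaling, the channels-first score output, and the helper names `preprocess`, `decode`, and `run_onnx`. Quantized assets may expect a different input type, so check the input and output specification of the file you download.

```python
import numpy as np

INPUT_SIZE = 512  # input resolution listed under Model Stats


def preprocess(image_hwc_uint8: np.ndarray) -> np.ndarray:
    """Convert a 512x512x3 uint8 RGB image into a 1x3x512x512 float32 tensor.

    ASSUMPTION: the asset expects NCHW layout with pixels scaled to [0, 1].
    """
    if image_hwc_uint8.shape != (INPUT_SIZE, INPUT_SIZE, 3):
        raise ValueError(f"expected a {INPUT_SIZE}x{INPUT_SIZE}x3 image")
    chw = image_hwc_uint8.astype(np.float32).transpose(2, 0, 1) / 255.0
    return chw[np.newaxis, ...]


def decode(scores: np.ndarray) -> np.ndarray:
    """Turn class scores of shape (1, C, H, W) into an (H, W) class-index mask.

    ASSUMPTION: the model returns per-class scores with the class axis first.
    """
    return scores[0].argmax(axis=0)


def run_onnx(model_path: str, tensor: np.ndarray) -> np.ndarray:
    """Run one inference on CPU and return the first output.

    onnxruntime is imported lazily so the helpers above work without it.
    """
    import onnxruntime as ort

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    input_name = session.get_inputs()[0].name
    return session.run(None, {input_name: tensor})[0]
```

With a resized RGB image in `image` and a placeholder file name, usage would look like `mask = decode(run_onnx("model.onnx", preprocess(image)))`. The CPU provider is used here only to keep the sketch portable; it does not reflect the on-device NPU numbers reported in this card.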

Option 2: Export with Custom Configurations

Use the Qualcomm® AI Hub Models Python library to compile and export the model with your own:

  • Custom weights (e.g., fine-tuned checkpoints)
  • Custom input shapes
  • Target device and runtime configurations

This option is ideal if you need to customize the model beyond the default configuration provided here.

See the Segformer-Base repository on GitHub for usage instructions.

Model Details

Model Type: Semantic segmentation

Model Stats:

  • Model checkpoint: nvidia/segformer-b0-finetuned-ade-512-512
  • Input resolution: 512x512
  • Number of output classes: 150
  • Number of parameters: 3.75M
  • Model size (float): 14.4 MB
  • Model size (w8a16): 4.57 MB
  • Model size (w8a8): 3.90 MB
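The listed sizes can be sanity-checked against the parameter count with a weights-only estimate. The sketch below assumes that "float" means 32-bit weights and that the "w8" precisions store 8-bit weights; the helper name `weight_bytes` is illustrative. It ignores everything else stored in a model file, so it is only a lower-bound style estimate.

```python
PARAMS = 3.75e6  # parameter count from Model Stats


def weight_bytes(num_params: float, bits_per_weight: int) -> float:
    """Bytes needed to store the weights alone at the given bit width."""
    return num_params * bits_per_weight / 8


# ASSUMPTION: float = 32-bit weights, w8a16 / w8a8 = 8-bit weights.
for label, bits in (("float", 32), ("w8", 8)):
    size = weight_bytes(PARAMS, bits)
    print(f"{label}: {size / 1e6:.2f} MB ({size / 2**20:.2f} MiB)")
```

The estimate gives 15.00 MB (14.31 MiB) for float and 3.75 MB (3.58 MiB) for 8-bit weights, which is in the same range as the 14.4 MB, 4.57 MB, and 3.90 MB reported above. The remaining gap for the quantized files is not explained by this card; file contents other than the weights are one plausible cause.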

Performance Summary

Model | Runtime | Precision | Chipset | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit
Segformer-Base | ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 74.253 ms | 31 - 219 MB | NPU
Segformer-Base | ONNX | float | Snapdragon® X2 Elite | 72.678 ms | 209 - 209 MB | NPU
Segformer-Base | ONNX | float | Snapdragon® X Elite | 112.391 ms | 178 - 178 MB | NPU
Segformer-Base | ONNX | float | Snapdragon® 8 Gen 3 Mobile | 82.181 ms | 14 - 240 MB | NPU
Segformer-Base | ONNX | float | Qualcomm® QCS8550 (Proxy) | 108.166 ms | 19 - 170 MB | NPU
Segformer-Base | ONNX | float | Snapdragon® 8 Elite For Galaxy Mobile | 74.209 ms | 23 - 212 MB | NPU
Segformer-Base | ONNX | float | Qualcomm® QCS9075 | 113.543 ms | 20 - 70 MB | NPU
Segformer-Base | ONNX | float | Qualcomm® QCS8750 | 74.209 ms | 23 - 212 MB | NPU
Segformer-Base | ONNX | float | Qualcomm® QCS7181 | 112.391 ms | 178 - 178 MB | NPU
Segformer-Base | ONNX | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | 6.811 ms | 13 - 219 MB | NPU
Segformer-Base | ONNX | w8a16 | Snapdragon® X2 Elite | 6.873 ms | 211 - 211 MB | NPU
Segformer-Base | ONNX | w8a16 | Snapdragon® 8 Gen 3 Mobile | 10.442 ms | 13 - 252 MB | NPU
Segformer-Base | ONNX | w8a16 | Qualcomm® QCS6490 | 722.622 ms | 379 - 385 MB | CPU
Segformer-Base | ONNX | w8a16 | Qualcomm® QCS8550 (Proxy) | 14.943 ms | 10 - 210 MB | NPU
Segformer-Base | ONNX | w8a16 | Snapdragon® 7 Gen 4 Mobile | 317.437 ms | 380 - 391 MB | CPU
Segformer-Base | ONNX | w8a16 | Snapdragon® 8 Elite For Galaxy Mobile | 8.597 ms | 13 - 216 MB | NPU
Segformer-Base | ONNX | w8a16 | Qualcomm® QCM6690 | 350.577 ms | 324 - 336 MB | CPU
Segformer-Base | ONNX | w8a16 | Qualcomm® QCS9075 | 20.456 ms | 9 - 58 MB | NPU
Segformer-Base | ONNX | w8a16 | Qualcomm® QCS7790 | 317.437 ms | 380 - 391 MB | CPU
Segformer-Base | ONNX | w8a16 | Qualcomm® QCS8750 | 8.597 ms | 13 - 216 MB | NPU
Segformer-Base | ONNX | w8a8 | Snapdragon® 8 Elite Gen 5 Mobile | 4.578 ms | 7 - 205 MB | NPU
Segformer-Base | ONNX | w8a8 | Snapdragon® X2 Elite | 4.564 ms | 212 - 212 MB | NPU
Segformer-Base | ONNX | w8a8 | Snapdragon® 8 Gen 3 Mobile | 7.595 ms | 1 - 223 MB | NPU
Segformer-Base | ONNX | w8a8 | Qualcomm® QCS6490 | 273.483 ms | 194 - 202 MB | CPU
Segformer-Base | ONNX | w8a8 | Qualcomm® QCS8550 (Proxy) | 11.022 ms | 2 - 50 MB | NPU
Segformer-Base | ONNX | w8a8 | Qualcomm® QCS9075 | 12.132 ms | 4 - 53 MB | NPU
Segformer-Base | ONNX | w8a8 | Snapdragon® 7 Gen 4 Mobile | 156.934 ms | 127 - 140 MB | CPU
Segformer-Base | ONNX | w8a8 | Snapdragon® 8 Elite For Galaxy Mobile | 5.568 ms | 9 - 201 MB | NPU
Segformer-Base | ONNX | w8a8 | Qualcomm® QCM6690 | 174.045 ms | 191 - 204 MB | CPU
Segformer-Base | ONNX | w8a8 | Qualcomm® QCS7790 | 156.934 ms | 127 - 140 MB | CPU
Segformer-Base | ONNX | w8a8 | Qualcomm® QCS8750 | 5.568 ms | 9 - 201 MB | NPU
Segformer-Base | QNN_DLC | float | Snapdragon® 8 Elite Gen 5 Mobile | 13.657 ms | 3 - 193 MB | NPU
Segformer-Base | QNN_DLC | float | Snapdragon® X2 Elite | 12.519 ms | 3 - 3 MB | NPU
Segformer-Base | QNN_DLC | float | Snapdragon® X Elite | 23.533 ms | 3 - 3 MB | NPU
Segformer-Base | QNN_DLC | float | Snapdragon® 8 Gen 3 Mobile | 16.364 ms | 0 - 223 MB | NPU
Segformer-Base | QNN_DLC | float | Qualcomm® QCS8275 | 48.898 ms | 1 - 191 MB | NPU
Segformer-Base | QNN_DLC | float | Qualcomm® QCS8550 (Proxy) | 22.689 ms | 3 - 18 MB | NPU
Segformer-Base | QNN_DLC | float | Qualcomm® SA8775P | 24.446 ms | 1 - 193 MB | NPU
Segformer-Base | QNN_DLC | float | Qualcomm® SA8650P | 24.446 ms | 1 - 193 MB | NPU
Segformer-Base | QNN_DLC | float | Qualcomm® SA8255P | 24.446 ms | 1 - 193 MB | NPU
Segformer-Base | QNN_DLC | float | Snapdragon® 8 Elite For Galaxy Mobile | 13.158 ms | 0 - 198 MB | NPU
Segformer-Base | QNN_DLC | float | Qualcomm® QCS8450 (Proxy) | 29.286 ms | 3 - 227 MB | NPU
Segformer-Base | QNN_DLC | float | Qualcomm® SA7255P | 48.898 ms | 1 - 191 MB | NPU
Segformer-Base | QNN_DLC | float | Qualcomm® QCS9075 | 27.458 ms | 3 - 17 MB | NPU
Segformer-Base | QNN_DLC | float | Qualcomm® SA8295P | 27.861 ms | 0 - 187 MB | NPU
Segformer-Base | QNN_DLC | float | Qualcomm® QCS8750 | 13.158 ms | 0 - 198 MB | NPU
Segformer-Base | QNN_DLC | float | Qualcomm® QCS7181 | 23.533 ms | 3 - 3 MB | NPU
Segformer-Base | QNN_DLC | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | 12.678 ms | 2 - 241 MB | NPU
Segformer-Base | QNN_DLC | w8a16 | Snapdragon® X2 Elite | 11.609 ms | 2 - 2 MB | NPU
Segformer-Base | QNN_DLC | w8a16 | Snapdragon® X Elite | 23.327 ms | 2 - 2 MB | NPU
Segformer-Base | QNN_DLC | w8a16 | Snapdragon® 8 Gen 3 Mobile | 19.102 ms | 2 - 271 MB | NPU
Segformer-Base | QNN_DLC | w8a16 | Qualcomm® QCS8275 | 37.761 ms | 2 - 231 MB | NPU
Segformer-Base | QNN_DLC | w8a16 | Qualcomm® QCS8550 (Proxy) | 25.215 ms | 2 - 304 MB | NPU
Segformer-Base | QNN_DLC | w8a16 | Qualcomm® SA8775P | 24.565 ms | 2 - 232 MB | NPU
Segformer-Base | QNN_DLC | w8a16 | Qualcomm® SA8650P | 24.565 ms | 2 - 232 MB | NPU
Segformer-Base | QNN_DLC | w8a16 | Qualcomm® SA8255P | 24.565 ms | 2 - 232 MB | NPU
Segformer-Base | QNN_DLC | w8a16 | Snapdragon® 7 Gen 4 Mobile | 47.606 ms | 2 - 261 MB | NPU
Segformer-Base | QNN_DLC | w8a16 | Snapdragon® 8 Elite For Galaxy Mobile | 13.885 ms | 2 - 234 MB | NPU
Segformer-Base | QNN_DLC | w8a16 | Qualcomm® QCM6690 | 120.787 ms | 2 - 293 MB | NPU
Segformer-Base | QNN_DLC | w8a16 | Qualcomm® SA7255P | 37.761 ms | 2 - 231 MB | NPU
Segformer-Base | QNN_DLC | w8a16 | Qualcomm® QCS9075 | 48.728 ms | 1 - 9 MB | NPU
Segformer-Base | QNN_DLC | w8a16 | Qualcomm® QCS7790 | 47.606 ms | 2 - 261 MB | NPU
Segformer-Base | QNN_DLC | w8a16 | Qualcomm® QCS8750 | 13.885 ms | 2 - 234 MB | NPU
Segformer-Base | QNN_DLC | w8a16 | Qualcomm® QCS7181 | 23.327 ms | 2 - 2 MB | NPU
Segformer-Base | QNN_DLC | w8a8 | Snapdragon® 8 Elite Gen 5 Mobile | 2.233 ms | 1 - 182 MB | NPU
Segformer-Base | QNN_DLC | w8a8 | Snapdragon® X2 Elite | 2.608 ms | 1 - 1 MB | NPU
Segformer-Base | QNN_DLC | w8a8 | Snapdragon® X Elite | 6.367 ms | 1 - 1 MB | NPU
Segformer-Base | QNN_DLC | w8a8 | Snapdragon® 8 Gen 3 Mobile | 3.965 ms | 0 - 198 MB | NPU
Segformer-Base | QNN_DLC | w8a8 | Qualcomm® QCS6490 | 15.578 ms | 0 - 5 MB | NPU
Segformer-Base | QNN_DLC | w8a8 | Qualcomm® QCS8275 | 10.853 ms | 1 - 175 MB | NPU
Segformer-Base | QNN_DLC | w8a8 | Qualcomm® QCS8550 (Proxy) | 5.799 ms | 1 - 53 MB | NPU
Segformer-Base | QNN_DLC | w8a8 | Qualcomm® SA8775P | 6.413 ms | 1 - 174 MB | NPU
Segformer-Base | QNN_DLC | w8a8 | Qualcomm® SA8650P | 6.413 ms | 1 - 174 MB | NPU
Segformer-Base | QNN_DLC | w8a8 | Qualcomm® SA8255P | 6.413 ms | 1 - 174 MB | NPU
Segformer-Base | QNN_DLC | w8a8 | Qualcomm® QCS9075 | 6.789 ms | 1 - 5 MB | NPU
Segformer-Base | QNN_DLC | w8a8 | Qualcomm® QCS8450 (Proxy) | 8.623 ms | 0 - 202 MB | NPU
Segformer-Base | QNN_DLC | w8a8 | Snapdragon® 7 Gen 4 Mobile | 6.892 ms | 0 - 182 MB | NPU
Segformer-Base | QNN_DLC | w8a8 | Snapdragon® 8 Elite For Galaxy Mobile | 2.834 ms | 1 - 175 MB | NPU
Segformer-Base | QNN_DLC | w8a8 | Qualcomm® QCM6690 | 38.002 ms | 1 - 191 MB | NPU
Segformer-Base | QNN_DLC | w8a8 | Qualcomm® SA8295P | 7.794 ms | 0 - 173 MB | NPU
Segformer-Base | QNN_DLC | w8a8 | Qualcomm® SA7255P | 10.853 ms | 1 - 175 MB | NPU
Segformer-Base | QNN_DLC | w8a8 | Qualcomm® QCS7790 | 6.892 ms | 0 - 182 MB | NPU
Segformer-Base | QNN_DLC | w8a8 | Qualcomm® QCS8750 | 2.834 ms | 1 - 175 MB | NPU
Segformer-Base | QNN_DLC | w8a8 | Qualcomm® QCS7181 | 6.367 ms | 1 - 1 MB | NPU
Segformer-Base | TFLITE | float | Snapdragon® 8 Elite Gen 5 Mobile | 13.649 ms | 8 - 201 MB | NPU
Segformer-Base | TFLITE | float | Snapdragon® 8 Gen 3 Mobile | 16.462 ms | 8 - 237 MB | NPU
Segformer-Base | TFLITE | float | Qualcomm® QCS8275 | 48.896 ms | 9 - 199 MB | NPU
Segformer-Base | TFLITE | float | Qualcomm® QCS8550 (Proxy) | 22.709 ms | 9 - 11 MB | NPU
Segformer-Base | TFLITE | float | Qualcomm® SA8775P | 24.5 ms | 10 - 201 MB | NPU
Segformer-Base | TFLITE | float | Qualcomm® SA8650P | 24.5 ms | 10 - 201 MB | NPU
Segformer-Base | TFLITE | float | Qualcomm® SA8255P | 24.5 ms | 10 - 201 MB | NPU
Segformer-Base | TFLITE | float | Snapdragon® 8 Elite For Galaxy Mobile | 13.22 ms | 8 - 203 MB | NPU
Segformer-Base | TFLITE | float | Qualcomm® QCS8450 (Proxy) | 29.344 ms | 10 - 236 MB | NPU
Segformer-Base | TFLITE | float | Qualcomm® SA7255P | 48.896 ms | 9 - 199 MB | NPU
Segformer-Base | TFLITE | float | Qualcomm® QCS9075 | 27.594 ms | 8 - 31 MB | NPU
Segformer-Base | TFLITE | float | Qualcomm® SA8295P | 27.815 ms | 9 - 202 MB | NPU
Segformer-Base | TFLITE | float | Qualcomm® QCS8750 | 13.22 ms | 8 - 203 MB | NPU
Segformer-Base | TFLITE | w8a8 | Snapdragon® 8 Elite Gen 5 Mobile | 4.528 ms | 2 - 189 MB | NPU
Segformer-Base | TFLITE | w8a8 | Snapdragon® 8 Gen 3 Mobile | 7.096 ms | 2 - 210 MB | NPU
Segformer-Base | TFLITE | w8a8 | Qualcomm® QCS6490 | 122.727 ms | 15 - 50 MB | NPU
Segformer-Base | TFLITE | w8a8 | Qualcomm® QCS8275 | 18.256 ms | 2 - 178 MB | NPU
Segformer-Base | TFLITE | w8a8 | Qualcomm® QCS8550 (Proxy) | 10.243 ms | 2 - 11 MB | NPU
Segformer-Base | TFLITE | w8a8 | Qualcomm® SA8775P | 10.897 ms | 2 - 183 MB | NPU
Segformer-Base | TFLITE | w8a8 | Qualcomm® SA8650P | 10.897 ms | 2 - 183 MB | NPU
Segformer-Base | TFLITE | w8a8 | Qualcomm® SA8255P | 10.897 ms | 2 - 183 MB | NPU
Segformer-Base | TFLITE | w8a8 | Qualcomm® QCS9075 | 10.632 ms | 2 - 12 MB | NPU
Segformer-Base | TFLITE | w8a8 | Qualcomm® QCS8450 (Proxy) | 12.042 ms | 2 - 210 MB | NPU
Segformer-Base | TFLITE | w8a8 | Snapdragon® 7 Gen 4 Mobile | 38.985 ms | 15 - 73 MB | NPU
Segformer-Base | TFLITE | w8a8 | Snapdragon® 8 Elite For Galaxy Mobile | 5.724 ms | 1 - 185 MB | NPU
Segformer-Base | TFLITE | w8a8 | Qualcomm® QCM6690 | 94.028 ms | 15 - 186 MB | NPU
Segformer-Base | TFLITE | w8a8 | Qualcomm® SA8295P | 12.873 ms | 2 - 187 MB | NPU
Segformer-Base | TFLITE | w8a8 | Qualcomm® SA7255P | 18.256 ms | 2 - 178 MB | NPU
Segformer-Base | TFLITE | w8a8 | Qualcomm® QCS7790 | 38.985 ms | 15 - 73 MB | NPU
Segformer-Base | TFLITE | w8a8 | Qualcomm® QCS8750 | 5.724 ms | 1 - 185 MB | NPU
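
To make the latency figures easier to compare, the sketch below converts a few of them into throughput and speedup relative to float. The three latencies are copied from the QNN_DLC rows for Snapdragon® 8 Elite Gen 5 Mobile above; the helper names are illustrative, and the throughput figure assumes back-to-back single-image inference with no pre- or post-processing cost, which real pipelines will not match exactly.

```python
# Inference times in ms, copied from the QNN_DLC rows for
# Snapdragon 8 Elite Gen 5 Mobile in the Performance Summary.
latency_ms = {"float": 13.657, "w8a16": 12.678, "w8a8": 2.233}


def throughput_per_s(ms: float) -> float:
    """Inferences per second if inference alone ran back to back (an idealization)."""
    return 1000.0 / ms


def speedup(baseline_ms: float, ms: float) -> float:
    """How many times faster a configuration is than the baseline."""
    return baseline_ms / ms


for precision, ms in latency_ms.items():
    print(
        f"{precision:>5}: {throughput_per_s(ms):6.1f} inferences/s, "
        f"{speedup(latency_ms['float'], ms):.2f}x vs float"
    )
```

On these three rows, w8a8 is roughly 6.1 times faster than float, while w8a16 is only slightly faster. The same two helpers can be applied to any other device or runtime in the table.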

License

  • The license for the original implementation of Segformer-Base can be found here.
