Segformer-Base: Optimized for Qualcomm Devices
Segformer-Base is a semantic segmentation model that predicts masks and classes of objects in an image.
This model is based on the implementation of Segformer-Base found here. This repository contains pre-exported model files optimized for Qualcomm® devices. You can use the Qualcomm® AI Hub Models library to export with custom configurations. More details on model performance across various devices can be found here.
Qualcomm AI Hub Models uses Qualcomm AI Hub Workbench to compile, profile, and evaluate this model. Sign up to run these models on a hosted Qualcomm® device.
Getting Started
There are two ways to deploy this model on your device:
Option 1: Download Pre-Exported Models
Below are pre-exported model assets ready for deployment.
| Runtime | Precision | Chipset | SDK Versions | Download |
|---|---|---|---|---|
| ONNX | float | Universal | QAIRT 2.42, ONNX Runtime 1.25.0 | Download |
| ONNX | w8a16 | Universal | QAIRT 2.42, ONNX Runtime 1.25.0 | Download |
| ONNX | w8a8 | Universal | QAIRT 2.42, ONNX Runtime 1.25.0 | Download |
| QNN_DLC | float | Universal | QAIRT 2.45 | Download |
| QNN_DLC | w8a16 | Universal | QAIRT 2.45 | Download |
| QNN_DLC | w8a8 | Universal | QAIRT 2.45 | Download |
| TFLITE | float | Universal | QAIRT 2.45 | Download |
| TFLITE | w8a8 | Universal | QAIRT 2.45 | Download |
For more device-specific assets and performance metrics, visit Segformer-Base on Qualcomm® AI Hub.
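As a quick sanity check of a downloaded ONNX asset, the sketch below loads it with ONNX Runtime and runs one dummy inference on the CPU. It assumes the float asset has been unpacked to a local `.onnx` file (the path in the usage note is hypothetical) and that the model takes a single float32 input; running on the NPU needs a Qualcomm-specific execution provider, which is not shown here.

```python
def concrete_shape(dims, fallback=1):
    """Replace symbolic (non-integer) dimensions with a fallback size."""
    return [d if isinstance(d, int) else fallback for d in dims]


def run_onnx_once(model_path):
    """Load an ONNX file and run a single all-zeros inference on the CPU.

    Assumption: the model has one float32 input. Returns the output shapes.
    """
    # Imported lazily so the helper above works without these packages.
    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    model_input = session.get_inputs()[0]

    # Build a dummy tensor matching the input shape declared by the model.
    dummy = np.zeros(concrete_shape(model_input.shape), dtype=np.float32)

    outputs = session.run(None, {model_input.name: dummy})
    return [out.shape for out in outputs]
```

Calling `run_onnx_once("segformer_base.onnx")` (hypothetical file name) prints nothing; it returns the shapes of the model outputs, which is enough to confirm the file loads and executes.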
Option 2: Export with Custom Configurations
Use the Qualcomm® AI Hub Models Python library to compile and export the model with your own:
- Custom weights (e.g., fine-tuned checkpoints)
- Custom input shapes
- Target device and runtime configurations
This option is ideal if you need to customize the model beyond the default configuration provided here.
See the Segformer-Base repository on GitHub for usage instructions.
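A minimal sketch of how such an export might be launched from Python is shown here. The module path `qai_hub_models.models.segformer_base.export` and the `--device` and `--target-runtime` flags are assumptions based on the library's usual layout, and the device name is only an illustration; check the repository or the script's `--help` output for the options that actually apply.

```python
import subprocess
import sys


def build_export_command(model_id="segformer_base", device=None, target_runtime=None):
    """Build the argument list for an export run.

    Assumption: the export script lives at qai_hub_models.models.<model_id>.export
    and accepts --device and --target-runtime.
    """
    cmd = [sys.executable, "-m", f"qai_hub_models.models.{model_id}.export"]
    if device:
        cmd += ["--device", device]
    if target_runtime:
        cmd += ["--target-runtime", target_runtime]
    return cmd


def run_export(**kwargs):
    """Run the export; requires qai_hub_models and AI Hub credentials."""
    return subprocess.run(build_export_command(**kwargs), check=True)
```

For example, `build_export_command(device="Samsung Galaxy S24", target_runtime="tflite")` returns the command without executing it, which is useful for logging or a dry run before calling `run_export`.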
Model Details
Model Type: Semantic segmentation
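In semantic segmentation, the final mask is typically obtained by picking the highest-scoring class at each pixel. The sketch below illustrates that step on plain nested lists, assuming the model returns per-class logits laid out as `[class][row][column]`; the actual output layout of an exported asset may differ, and some exports may already return class indices.

```python
def logits_to_mask(logits):
    """Return a [row][column] mask holding the index of the best class per pixel.

    Assumption: logits is indexed as logits[class][row][column].
    """
    num_classes = len(logits)
    height = len(logits[0])
    width = len(logits[0][0])
    return [
        [
            # Index of the class with the largest score at pixel (y, x).
            max(range(num_classes), key=lambda c: logits[c][y][x])
            for x in range(width)
        ]
        for y in range(height)
    ]


# Tiny illustration: 3 classes over a 2x2 image.
example_logits = [
    [[0.1, 2.0], [0.3, 0.0]],  # class 0 scores
    [[0.9, 0.5], [0.2, 0.1]],  # class 1 scores
    [[0.0, 1.0], [0.8, 0.4]],  # class 2 scores
]
example_mask = logits_to_mask(example_logits)
```

With real model outputs, an array library's `argmax` over the class axis does the same job far more efficiently.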
Model Stats:
- Model checkpoint: nvidia/segformer-b0-finetuned-ade-512-512
- Input resolution: 512x512
- Number of output classes: 150
- Number of parameters: 3.75M
- Model size (float): 14.4 MB
- Model size (w8a16): 4.57 MB
- Model size (w8a8): 3.90 MB
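The reported sizes can be roughly cross-checked from the parameter count. The sketch below assumes that weights dominate the file size, with 32 bits per weight for the float model and 8 bits per weight for the quantized ones; it ignores graph metadata, quantization parameters, and any layers kept at higher precision, so it is only an estimate.

```python
NUM_PARAMS = 3.75e6  # from the model stats above (rounded)


def weight_megabytes(num_params, bits_per_weight):
    """Estimate weight storage in MB (10^6 bytes) for a given bit width."""
    return num_params * bits_per_weight / 8 / 1e6


float_estimate_mb = weight_megabytes(NUM_PARAMS, 32)  # compare with 14.4 MB reported
int8_estimate_mb = weight_megabytes(NUM_PARAMS, 8)    # compare with 3.90 MB (w8a8)
```

The estimates come out at about 15.0 MB and 3.75 MB, close to the reported 14.4 MB and 3.90 MB; the remaining gap is plausibly due to rounding of the parameter count and the overheads listed above.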
Performance Summary
| Model | Runtime | Precision | Chipset | Inference Time | Peak Memory Range | Primary Compute Unit |
|---|---|---|---|---|---|---|
| Segformer-Base | ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 74.253 ms | 31 - 219 MB | NPU |
| Segformer-Base | ONNX | float | Snapdragon® X2 Elite | 72.678 ms | 209 - 209 MB | NPU |
| Segformer-Base | ONNX | float | Snapdragon® X Elite | 112.391 ms | 178 - 178 MB | NPU |
| Segformer-Base | ONNX | float | Snapdragon® 8 Gen 3 Mobile | 82.181 ms | 14 - 240 MB | NPU |
| Segformer-Base | ONNX | float | Qualcomm® QCS8550 (Proxy) | 108.166 ms | 19 - 170 MB | NPU |
| Segformer-Base | ONNX | float | Snapdragon® 8 Elite For Galaxy Mobile | 74.209 ms | 23 - 212 MB | NPU |
| Segformer-Base | ONNX | float | Qualcomm® QCS9075 | 113.543 ms | 20 - 70 MB | NPU |
| Segformer-Base | ONNX | float | Qualcomm® QCS8750 | 74.209 ms | 23 - 212 MB | NPU |
| Segformer-Base | ONNX | float | Qualcomm® QCS7181 | 112.391 ms | 178 - 178 MB | NPU |
| Segformer-Base | ONNX | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | 6.811 ms | 13 - 219 MB | NPU |
| Segformer-Base | ONNX | w8a16 | Snapdragon® X2 Elite | 6.873 ms | 211 - 211 MB | NPU |
| Segformer-Base | ONNX | w8a16 | Snapdragon® 8 Gen 3 Mobile | 10.442 ms | 13 - 252 MB | NPU |
| Segformer-Base | ONNX | w8a16 | Qualcomm® QCS6490 | 722.622 ms | 379 - 385 MB | CPU |
| Segformer-Base | ONNX | w8a16 | Qualcomm® QCS8550 (Proxy) | 14.943 ms | 10 - 210 MB | NPU |
| Segformer-Base | ONNX | w8a16 | Snapdragon® 7 Gen 4 Mobile | 317.437 ms | 380 - 391 MB | CPU |
| Segformer-Base | ONNX | w8a16 | Snapdragon® 8 Elite For Galaxy Mobile | 8.597 ms | 13 - 216 MB | NPU |
| Segformer-Base | ONNX | w8a16 | Qualcomm® QCM6690 | 350.577 ms | 324 - 336 MB | CPU |
| Segformer-Base | ONNX | w8a16 | Qualcomm® QCS9075 | 20.456 ms | 9 - 58 MB | NPU |
| Segformer-Base | ONNX | w8a16 | Qualcomm® QCS7790 | 317.437 ms | 380 - 391 MB | CPU |
| Segformer-Base | ONNX | w8a16 | Qualcomm® QCS8750 | 8.597 ms | 13 - 216 MB | NPU |
| Segformer-Base | ONNX | w8a8 | Snapdragon® 8 Elite Gen 5 Mobile | 4.578 ms | 7 - 205 MB | NPU |
| Segformer-Base | ONNX | w8a8 | Snapdragon® X2 Elite | 4.564 ms | 212 - 212 MB | NPU |
| Segformer-Base | ONNX | w8a8 | Snapdragon® 8 Gen 3 Mobile | 7.595 ms | 1 - 223 MB | NPU |
| Segformer-Base | ONNX | w8a8 | Qualcomm® QCS6490 | 273.483 ms | 194 - 202 MB | CPU |
| Segformer-Base | ONNX | w8a8 | Qualcomm® QCS8550 (Proxy) | 11.022 ms | 2 - 50 MB | NPU |
| Segformer-Base | ONNX | w8a8 | Qualcomm® QCS9075 | 12.132 ms | 4 - 53 MB | NPU |
| Segformer-Base | ONNX | w8a8 | Snapdragon® 7 Gen 4 Mobile | 156.934 ms | 127 - 140 MB | CPU |
| Segformer-Base | ONNX | w8a8 | Snapdragon® 8 Elite For Galaxy Mobile | 5.568 ms | 9 - 201 MB | NPU |
| Segformer-Base | ONNX | w8a8 | Qualcomm® QCM6690 | 174.045 ms | 191 - 204 MB | CPU |
| Segformer-Base | ONNX | w8a8 | Qualcomm® QCS7790 | 156.934 ms | 127 - 140 MB | CPU |
| Segformer-Base | ONNX | w8a8 | Qualcomm® QCS8750 | 5.568 ms | 9 - 201 MB | NPU |
| Segformer-Base | QNN_DLC | float | Snapdragon® 8 Elite Gen 5 Mobile | 13.657 ms | 3 - 193 MB | NPU |
| Segformer-Base | QNN_DLC | float | Snapdragon® X2 Elite | 12.519 ms | 3 - 3 MB | NPU |
| Segformer-Base | QNN_DLC | float | Snapdragon® X Elite | 23.533 ms | 3 - 3 MB | NPU |
| Segformer-Base | QNN_DLC | float | Snapdragon® 8 Gen 3 Mobile | 16.364 ms | 0 - 223 MB | NPU |
| Segformer-Base | QNN_DLC | float | Qualcomm® QCS8275 | 48.898 ms | 1 - 191 MB | NPU |
| Segformer-Base | QNN_DLC | float | Qualcomm® QCS8550 (Proxy) | 22.689 ms | 3 - 18 MB | NPU |
| Segformer-Base | QNN_DLC | float | Qualcomm® SA8775P | 24.446 ms | 1 - 193 MB | NPU |
| Segformer-Base | QNN_DLC | float | Qualcomm® SA8650P | 24.446 ms | 1 - 193 MB | NPU |
| Segformer-Base | QNN_DLC | float | Qualcomm® SA8255P | 24.446 ms | 1 - 193 MB | NPU |
| Segformer-Base | QNN_DLC | float | Snapdragon® 8 Elite For Galaxy Mobile | 13.158 ms | 0 - 198 MB | NPU |
| Segformer-Base | QNN_DLC | float | Qualcomm® QCS8450 (Proxy) | 29.286 ms | 3 - 227 MB | NPU |
| Segformer-Base | QNN_DLC | float | Qualcomm® SA7255P | 48.898 ms | 1 - 191 MB | NPU |
| Segformer-Base | QNN_DLC | float | Qualcomm® QCS9075 | 27.458 ms | 3 - 17 MB | NPU |
| Segformer-Base | QNN_DLC | float | Qualcomm® SA8295P | 27.861 ms | 0 - 187 MB | NPU |
| Segformer-Base | QNN_DLC | float | Qualcomm® QCS8750 | 13.158 ms | 0 - 198 MB | NPU |
| Segformer-Base | QNN_DLC | float | Qualcomm® QCS7181 | 23.533 ms | 3 - 3 MB | NPU |
| Segformer-Base | QNN_DLC | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | 12.678 ms | 2 - 241 MB | NPU |
| Segformer-Base | QNN_DLC | w8a16 | Snapdragon® X2 Elite | 11.609 ms | 2 - 2 MB | NPU |
| Segformer-Base | QNN_DLC | w8a16 | Snapdragon® X Elite | 23.327 ms | 2 - 2 MB | NPU |
| Segformer-Base | QNN_DLC | w8a16 | Snapdragon® 8 Gen 3 Mobile | 19.102 ms | 2 - 271 MB | NPU |
| Segformer-Base | QNN_DLC | w8a16 | Qualcomm® QCS8275 | 37.761 ms | 2 - 231 MB | NPU |
| Segformer-Base | QNN_DLC | w8a16 | Qualcomm® QCS8550 (Proxy) | 25.215 ms | 2 - 304 MB | NPU |
| Segformer-Base | QNN_DLC | w8a16 | Qualcomm® SA8775P | 24.565 ms | 2 - 232 MB | NPU |
| Segformer-Base | QNN_DLC | w8a16 | Qualcomm® SA8650P | 24.565 ms | 2 - 232 MB | NPU |
| Segformer-Base | QNN_DLC | w8a16 | Qualcomm® SA8255P | 24.565 ms | 2 - 232 MB | NPU |
| Segformer-Base | QNN_DLC | w8a16 | Snapdragon® 7 Gen 4 Mobile | 47.606 ms | 2 - 261 MB | NPU |
| Segformer-Base | QNN_DLC | w8a16 | Snapdragon® 8 Elite For Galaxy Mobile | 13.885 ms | 2 - 234 MB | NPU |
| Segformer-Base | QNN_DLC | w8a16 | Qualcomm® QCM6690 | 120.787 ms | 2 - 293 MB | NPU |
| Segformer-Base | QNN_DLC | w8a16 | Qualcomm® SA7255P | 37.761 ms | 2 - 231 MB | NPU |
| Segformer-Base | QNN_DLC | w8a16 | Qualcomm® QCS9075 | 48.728 ms | 1 - 9 MB | NPU |
| Segformer-Base | QNN_DLC | w8a16 | Qualcomm® QCS7790 | 47.606 ms | 2 - 261 MB | NPU |
| Segformer-Base | QNN_DLC | w8a16 | Qualcomm® QCS8750 | 13.885 ms | 2 - 234 MB | NPU |
| Segformer-Base | QNN_DLC | w8a16 | Qualcomm® QCS7181 | 23.327 ms | 2 - 2 MB | NPU |
| Segformer-Base | QNN_DLC | w8a8 | Snapdragon® 8 Elite Gen 5 Mobile | 2.233 ms | 1 - 182 MB | NPU |
| Segformer-Base | QNN_DLC | w8a8 | Snapdragon® X2 Elite | 2.608 ms | 1 - 1 MB | NPU |
| Segformer-Base | QNN_DLC | w8a8 | Snapdragon® X Elite | 6.367 ms | 1 - 1 MB | NPU |
| Segformer-Base | QNN_DLC | w8a8 | Snapdragon® 8 Gen 3 Mobile | 3.965 ms | 0 - 198 MB | NPU |
| Segformer-Base | QNN_DLC | w8a8 | Qualcomm® QCS6490 | 15.578 ms | 0 - 5 MB | NPU |
| Segformer-Base | QNN_DLC | w8a8 | Qualcomm® QCS8275 | 10.853 ms | 1 - 175 MB | NPU |
| Segformer-Base | QNN_DLC | w8a8 | Qualcomm® QCS8550 (Proxy) | 5.799 ms | 1 - 53 MB | NPU |
| Segformer-Base | QNN_DLC | w8a8 | Qualcomm® SA8775P | 6.413 ms | 1 - 174 MB | NPU |
| Segformer-Base | QNN_DLC | w8a8 | Qualcomm® SA8650P | 6.413 ms | 1 - 174 MB | NPU |
| Segformer-Base | QNN_DLC | w8a8 | Qualcomm® SA8255P | 6.413 ms | 1 - 174 MB | NPU |
| Segformer-Base | QNN_DLC | w8a8 | Qualcomm® QCS9075 | 6.789 ms | 1 - 5 MB | NPU |
| Segformer-Base | QNN_DLC | w8a8 | Qualcomm® QCS8450 (Proxy) | 8.623 ms | 0 - 202 MB | NPU |
| Segformer-Base | QNN_DLC | w8a8 | Snapdragon® 7 Gen 4 Mobile | 6.892 ms | 0 - 182 MB | NPU |
| Segformer-Base | QNN_DLC | w8a8 | Snapdragon® 8 Elite For Galaxy Mobile | 2.834 ms | 1 - 175 MB | NPU |
| Segformer-Base | QNN_DLC | w8a8 | Qualcomm® QCM6690 | 38.002 ms | 1 - 191 MB | NPU |
| Segformer-Base | QNN_DLC | w8a8 | Qualcomm® SA8295P | 7.794 ms | 0 - 173 MB | NPU |
| Segformer-Base | QNN_DLC | w8a8 | Qualcomm® SA7255P | 10.853 ms | 1 - 175 MB | NPU |
| Segformer-Base | QNN_DLC | w8a8 | Qualcomm® QCS7790 | 6.892 ms | 0 - 182 MB | NPU |
| Segformer-Base | QNN_DLC | w8a8 | Qualcomm® QCS8750 | 2.834 ms | 1 - 175 MB | NPU |
| Segformer-Base | QNN_DLC | w8a8 | Qualcomm® QCS7181 | 6.367 ms | 1 - 1 MB | NPU |
| Segformer-Base | TFLITE | float | Snapdragon® 8 Elite Gen 5 Mobile | 13.649 ms | 8 - 201 MB | NPU |
| Segformer-Base | TFLITE | float | Snapdragon® 8 Gen 3 Mobile | 16.462 ms | 8 - 237 MB | NPU |
| Segformer-Base | TFLITE | float | Qualcomm® QCS8275 | 48.896 ms | 9 - 199 MB | NPU |
| Segformer-Base | TFLITE | float | Qualcomm® QCS8550 (Proxy) | 22.709 ms | 9 - 11 MB | NPU |
| Segformer-Base | TFLITE | float | Qualcomm® SA8775P | 24.5 ms | 10 - 201 MB | NPU |
| Segformer-Base | TFLITE | float | Qualcomm® SA8650P | 24.5 ms | 10 - 201 MB | NPU |
| Segformer-Base | TFLITE | float | Qualcomm® SA8255P | 24.5 ms | 10 - 201 MB | NPU |
| Segformer-Base | TFLITE | float | Snapdragon® 8 Elite For Galaxy Mobile | 13.22 ms | 8 - 203 MB | NPU |
| Segformer-Base | TFLITE | float | Qualcomm® QCS8450 (Proxy) | 29.344 ms | 10 - 236 MB | NPU |
| Segformer-Base | TFLITE | float | Qualcomm® SA7255P | 48.896 ms | 9 - 199 MB | NPU |
| Segformer-Base | TFLITE | float | Qualcomm® QCS9075 | 27.594 ms | 8 - 31 MB | NPU |
| Segformer-Base | TFLITE | float | Qualcomm® SA8295P | 27.815 ms | 9 - 202 MB | NPU |
| Segformer-Base | TFLITE | float | Qualcomm® QCS8750 | 13.22 ms | 8 - 203 MB | NPU |
| Segformer-Base | TFLITE | w8a8 | Snapdragon® 8 Elite Gen 5 Mobile | 4.528 ms | 2 - 189 MB | NPU |
| Segformer-Base | TFLITE | w8a8 | Snapdragon® 8 Gen 3 Mobile | 7.096 ms | 2 - 210 MB | NPU |
| Segformer-Base | TFLITE | w8a8 | Qualcomm® QCS6490 | 122.727 ms | 15 - 50 MB | NPU |
| Segformer-Base | TFLITE | w8a8 | Qualcomm® QCS8275 | 18.256 ms | 2 - 178 MB | NPU |
| Segformer-Base | TFLITE | w8a8 | Qualcomm® QCS8550 (Proxy) | 10.243 ms | 2 - 11 MB | NPU |
| Segformer-Base | TFLITE | w8a8 | Qualcomm® SA8775P | 10.897 ms | 2 - 183 MB | NPU |
| Segformer-Base | TFLITE | w8a8 | Qualcomm® SA8650P | 10.897 ms | 2 - 183 MB | NPU |
| Segformer-Base | TFLITE | w8a8 | Qualcomm® SA8255P | 10.897 ms | 2 - 183 MB | NPU |
| Segformer-Base | TFLITE | w8a8 | Qualcomm® QCS9075 | 10.632 ms | 2 - 12 MB | NPU |
| Segformer-Base | TFLITE | w8a8 | Qualcomm® QCS8450 (Proxy) | 12.042 ms | 2 - 210 MB | NPU |
| Segformer-Base | TFLITE | w8a8 | Snapdragon® 7 Gen 4 Mobile | 38.985 ms | 15 - 73 MB | NPU |
| Segformer-Base | TFLITE | w8a8 | Snapdragon® 8 Elite For Galaxy Mobile | 5.724 ms | 1 - 185 MB | NPU |
| Segformer-Base | TFLITE | w8a8 | Qualcomm® QCM6690 | 94.028 ms | 15 - 186 MB | NPU |
| Segformer-Base | TFLITE | w8a8 | Qualcomm® SA8295P | 12.873 ms | 2 - 187 MB | NPU |
| Segformer-Base | TFLITE | w8a8 | Qualcomm® SA7255P | 18.256 ms | 2 - 178 MB | NPU |
| Segformer-Base | TFLITE | w8a8 | Qualcomm® QCS7790 | 38.985 ms | 15 - 73 MB | NPU |
| Segformer-Base | TFLITE | w8a8 | Qualcomm® QCS8750 | 5.724 ms | 1 - 185 MB | NPU |
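To compare configurations, rows of the table can be parsed and turned into a speedup or an approximate throughput. The sketch below uses two rows copied from the table (QNN_DLC on Snapdragon® 8 Elite Gen 5 Mobile); the helper names are illustrative, and the frames-per-second figure assumes back-to-back inference with no pre- or post-processing cost.

```python
ROWS = """
| Segformer-Base | QNN_DLC | float | Snapdragon® 8 Elite Gen 5 Mobile | 13.657 ms | 3 - 193 MB | NPU |
| Segformer-Base | QNN_DLC | w8a8 | Snapdragon® 8 Elite Gen 5 Mobile | 2.233 ms | 1 - 182 MB | NPU |
"""


def parse_row(line):
    """Split one markdown table row into a small record."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    _model, runtime, precision, chipset, latency, _memory, unit = cells
    return {
        "runtime": runtime,
        "precision": precision,
        "chipset": chipset,
        "latency_ms": float(latency.split()[0]),  # drop the trailing "ms"
        "compute_unit": unit,
    }


records = [parse_row(line) for line in ROWS.strip().splitlines()]
by_precision = {rec["precision"]: rec for rec in records}

# Ratio of float latency to w8a8 latency on the same runtime and chipset.
speedup = by_precision["float"]["latency_ms"] / by_precision["w8a8"]["latency_ms"]

# Upper bound on throughput if inferences run back to back.
w8a8_fps = 1000.0 / by_precision["w8a8"]["latency_ms"]
```

For these two rows, w8a8 is roughly 6.1 times faster than float, which corresponds to a little under 450 inferences per second.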
License
- The license for the original implementation of Segformer-Base can be found here.
References
- SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
- Source Model Implementation
Community
- Join our AI Hub Slack community to collaborate, post questions and learn more about on-device AI.
- For questions or feedback, please reach out to us.
