NovaSR: Pushing the Limits of Extreme Efficiency in Audio Super-Resolution
This is the model for NovaSR, a tiny (roughly 50 KB) audio upsampling model that upscales muffled 16 kHz audio into clear, crisp 48 kHz audio at speeds of 100-3500x realtime.
Details
- Model Size: 52 KB for the PyTorch version
- Input Rate: 16 kHz
- Output Rate: 48 kHz
- Inference Speed: 300-3500x realtime, depending on the GPU
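To make the rates above concrete, here is a small sketch of what going from 16 kHz to 48 kHz means for a clip. It only does arithmetic and does not call NovaSR; the helper names are hypothetical, and it assumes the model produces exactly 3x as many samples with no padding or trimming, which this README does not state.

```python
INPUT_RATE_HZ = 16_000   # input rate listed above
OUTPUT_RATE_HZ = 48_000  # output rate listed above


def nyquist_hz(sample_rate_hz: int) -> float:
    """Highest frequency a signal at this sample rate can represent."""
    return sample_rate_hz / 2


def upsampled_length(num_input_samples: int) -> int:
    """Output sample count, assuming an exact 3x length increase (an assumption)."""
    return num_input_samples * OUTPUT_RATE_HZ // INPUT_RATE_HZ


# A 2-second clip at 16 kHz holds 32,000 samples.
clip_samples = 2 * INPUT_RATE_HZ
print("input samples: ", clip_samples)
print("output samples:", upsampled_length(clip_samples))
print("input bandwidth (Hz): ", nyquist_hz(INPUT_RATE_HZ))
print("output bandwidth (Hz):", nyquist_hz(OUTPUT_RATE_HZ))
```

The bandwidth figures show why 16 kHz audio sounds muffled: it cannot carry anything above 8 kHz, while 48 kHz audio has room for content up to 24 kHz.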
Comparisons
Comparisons were run on an A100 GPU. A higher realtime factor means faster processing. CPU comparisons are coming soon.
| Model | Speed (Real-Time) | Model Size |
|---|---|---|
| NovaSR | 3600x realtime | ~52 KB |
| FlowHigh | 20x realtime | ~450 MB |
| FlashSR | 14x realtime | ~1000 MB |
| AudioSR | 0.6x realtime | ~2000 MB |
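As a rough illustration of what the realtime factors in the table mean, the sketch below converts each one into the time needed to process one hour of audio. It assumes "Nx realtime" means audio duration divided by processing time; the function names are hypothetical and the speeds are copied from the table, not re-measured.

```python
def realtime_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Realtime factor: seconds of audio processed per second of compute."""
    return audio_seconds / processing_seconds


def processing_seconds(audio_seconds: float, factor: float) -> float:
    """Time needed to process a clip at a given realtime factor."""
    return audio_seconds / factor


# Realtime factors as reported in the comparison table (A100 GPU).
reported_speeds = {
    "NovaSR": 3600.0,
    "FlowHigh": 20.0,
    "FlashSR": 14.0,
    "AudioSR": 0.6,
}

ONE_HOUR = 3600.0  # seconds of audio

for model, factor in reported_speeds.items():
    seconds = processing_seconds(ONE_HOUR, factor)
    print(f"{model}: {seconds:.1f} s to process one hour of audio")
```

Under that definition, a factor below 1x (as with AudioSR here) means processing takes longer than the audio itself lasts.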
Usage
Please check out the GitHub repo for usage: https://github.com/ysharma3501/NovaSR
If you find the model/code helpful, stars or likes would be appreciated. Thank you.