NovaSR: Pushing the Limits of Extreme Efficiency in Audio Super-Resolution

This is the model for NovaSR, a tiny (roughly 50 KB) audio super-resolution model that upsamples muffled 16 kHz audio into clear, crisp 48 kHz audio at 100-3500x realtime speed.

Details

  • Model Size: 52 KB for the PyTorch version
  • Input Rate: 16kHz
  • Output Rate: 48kHz
  • Inference Speed: 300-3500x realtime, depending on the GPU
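To make the figures above concrete, here is a minimal sketch of the arithmetic they imply: going from 16 kHz to 48 kHz triples the number of samples, and a realtime factor of N means a clip is processed in 1/N of its duration. The function names are hypothetical helpers for illustration and are not part of NovaSR.

```python
INPUT_RATE_HZ = 16_000   # input rate from the details list
OUTPUT_RATE_HZ = 48_000  # output rate from the details list


def output_samples(input_samples: int) -> int:
    """Samples in the 48 kHz output for a 16 kHz input of the same duration."""
    return input_samples * OUTPUT_RATE_HZ // INPUT_RATE_HZ


def processing_seconds(audio_seconds: float, realtime_factor: float) -> float:
    """Expected processing time for a clip at a given realtime factor."""
    if realtime_factor <= 0:
        raise ValueError("realtime_factor must be positive")
    return audio_seconds / realtime_factor


# One second of 16 kHz audio becomes 48,000 output samples (a 3x ratio).
print(output_samples(16_000))            # 48000
# One hour of audio at 3600x realtime would take about one second.
print(processing_seconds(3600.0, 3600))  # 1.0
```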

Comparisons

Comparisons were run on an A100 GPU. A higher realtime factor means faster processing. CPU comparisons are coming soon.

| Model    | Speed (realtime factor) | Model Size |
|----------|-------------------------|------------|
| NovaSR   | 3600x realtime          | ~52 KB     |
| FlowHigh | 20x realtime            | ~450 MB    |
| FlashSR  | 14x realtime            | ~1000 MB   |
| AudioSR  | 0.6x realtime           | ~2000 MB   |
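A realtime factor like those in the table is the ratio of audio duration to processing time. The sketch below shows one way such a number could be measured. It is not the benchmark used for the table, and `naive_upsample` is a hypothetical stand-in that simply repeats each sample three times, not the NovaSR model.

```python
import time
from typing import Callable, List


def realtime_factor(audio_seconds: float, elapsed_seconds: float) -> float:
    """Audio duration divided by processing time; higher is faster."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return audio_seconds / elapsed_seconds


def measure_realtime_factor(
    upsample: Callable[[List[float]], List[float]],
    samples: List[float],
    sample_rate_hz: int = 16_000,
    runs: int = 5,
) -> float:
    """Average the processing time over several runs after one warm-up call."""
    audio_seconds = len(samples) / sample_rate_hz
    upsample(samples)  # warm-up, excluded from timing
    start = time.perf_counter()
    for _ in range(runs):
        upsample(samples)
    elapsed = (time.perf_counter() - start) / runs
    return realtime_factor(audio_seconds, elapsed)


def naive_upsample(samples: List[float]) -> List[float]:
    """Hypothetical placeholder: repeat each sample 3 times (16 kHz -> 48 kHz)."""
    return [s for s in samples for _ in range(3)]


one_second = [0.0] * 16_000
print(f"{measure_realtime_factor(naive_upsample, one_second):.0f}x realtime")
```

When timing a model on a GPU, the device has to be synchronized before reading the clock (for example with `torch.cuda.synchronize()` in PyTorch), otherwise asynchronous kernel launches make the measured time too short.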

Usage

Please see the GitHub repo for usage instructions: https://github.com/ysharma3501/NovaSR

If you find the model/code helpful, stars or likes would be appreciated. Thank you.
