NovaSR: Pushing the Limits of Extreme Efficiency in Audio Super-Resolution

This is the model for NovaSR, a tiny ~50kB audio upsampling model that upscales muffled 16kHz audio into clear, crisp 48kHz audio at 100-3500x realtime.

Details

Model Size: 52kB for the PyTorch version
Input Rate: 16kHz
Output Rate: 48kHz
Inference Speed: 300-3500x realtime, depending on the GPU
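
The input and output rates above imply a fixed 3x increase in sample count. Below is a minimal sketch of that arithmetic, assuming the output covers exactly the same duration as the input. The helper name `upsampled_length` is illustrative and not part of the NovaSR code.

```python
INPUT_RATE_HZ = 16_000   # input rate listed above
OUTPUT_RATE_HZ = 48_000  # output rate listed above


def upsampled_length(num_input_samples: int) -> int:
    """Expected number of 48kHz samples for a 16kHz clip of the same duration.

    Hypothetical helper for illustration; assumes no padding or trimming.
    """
    # 48000 / 16000 is exactly 3, so integer arithmetic loses nothing here.
    return num_input_samples * OUTPUT_RATE_HZ // INPUT_RATE_HZ


clip_seconds = 3
n_in = clip_seconds * INPUT_RATE_HZ
n_out = upsampled_length(n_in)
print(n_in, n_out)  # 48000 144000
```

A quick length check like this can help catch audio that was not resampled to 16kHz before being passed to the model.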

Comparisons

Comparisons were run on an A100 GPU. A higher realtime factor means faster processing. CPU comparisons are coming soon.

Model	Speed (Real-Time)	Model Size
NovaSR	3600x realtime	~52 KB
FlowHigh	20x realtime	~450 MB
FlashSR	14x realtime	~1000 MB
AudioSR	0.6x realtime	~2000 MB
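
To make the realtime factors in the table concrete, the sketch below converts each one into the time needed to process one hour of audio, taking "Nx realtime" to mean audio duration divided by processing time. The function name is illustrative, and the figures are simply the table values above, not new measurements.

```python
def processing_seconds(audio_seconds: float, realtime_factor: float) -> float:
    """Time to process a clip, given a speed expressed as a multiple of realtime."""
    # realtime_factor = audio_seconds / processing_seconds, rearranged.
    return audio_seconds / realtime_factor


# Realtime factors copied from the comparison table (A100 GPU).
speeds = {"NovaSR": 3600, "FlowHigh": 20, "FlashSR": 14, "AudioSR": 0.6}

one_hour = 3600.0  # seconds of audio
per_hour = {name: processing_seconds(one_hour, f) for name, f in speeds.items()}

for name, seconds in per_hour.items():
    print(f"{name}: {seconds:.1f} s per hour of audio")
```

Under this reading, a factor below 1x (as for AudioSR) means processing takes longer than the audio itself lasts.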

Examples

Random 3s examples from datasets: two before/after pairs, plus one before/after pair for music. [Audio clips not included.]

Usage

Please check out the GitHub repo for usage: https://github.com/ysharma3501/NovaSR

You can also try it on Hugging Face Spaces: https://huggingface.co/spaces/YatharthS/NovaSR

If you find the model/code helpful, stars or likes would be appreciated. Thank you.
