Samid drone detector

Audio Spectrogram Transformer fine-tuned for binary acoustic drone detection.

Quick use

from transformers import AutoModelForAudioClassification, AutoFeatureExtractor
import torch, soundfile as sf

model = AutoModelForAudioClassification.from_pretrained("Rashidbm/samid-drone-detector")
fe = AutoFeatureExtractor.from_pretrained("Rashidbm/samid-drone-detector")

audio, sr = sf.read("clip.wav")
if audio.ndim > 1:
    audio = audio.mean(axis=1)  # down-mix stereo to mono
assert sr == 16000, "model expects 16 kHz audio; resample first"
inputs = fe(audio, sampling_rate=sr, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(f"p(drone) = {torch.softmax(logits, dim=-1)[0, 1].item():.4f}")

Training

  • Backbone: MIT/ast-finetuned-audioset-10-10-0.4593
  • Datasets: geronimobasso/drone-audio-detection-samples, ahlab-drone-project/DroneAudioSet (splits 1-20)
  • Augmentations (applied symmetrically to both classes): codec round-trip, synthetic RIR, random EQ, FilterAugment, Patchout, SpecAugment, Mixup
  • Augmentations (drone class only): urban-noise overlay
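As a rough illustration of the asymmetric augmentation, mixing urban noise into a drone clip at a target SNR might look like the sketch below. This is not the actual training code: the function name, the SNR handling, and the looping of the noise clip are all assumptions.

```python
import numpy as np

def overlay_noise(drone: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Mix an urban-noise clip into a drone clip at a target SNR (in dB)."""
    noise = np.resize(noise, drone.shape)            # loop/trim noise to clip length
    sig_pow = np.mean(drone ** 2)
    noise_pow = np.mean(noise ** 2) + 1e-12
    # Scale the noise so that 10*log10(sig_pow / scaled_noise_pow) == snr_db.
    scale = np.sqrt(sig_pow / (noise_pow * 10 ** (snr_db / 10)))
    return drone + scale * noise

# Example: overlay noise at 10 dB SNR on a 1-second 16 kHz clip.
rng = np.random.default_rng(0)
drone_clip = rng.standard_normal(16000).astype(np.float32)
noise_clip = rng.standard_normal(16000).astype(np.float32)
augmented = overlay_noise(drone_clip, noise_clip, snr_db=10.0)
```

Because only drone clips receive this overlay, the model is pushed to detect drones through urban background noise rather than associating that noise with the no-drone class.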

Performance

  • NUS DroneAudioSet held-out (48 clips): 100% detection
  • Geronimobasso sanity check (50 random clips): 24/25 drone, 25/25 no-drone

Recommended inference

For long audio, slide a 1-second window with 0.5s hop, apply a median filter to per-window probabilities, and require N consecutive windows above threshold. See scripts/standalone_inference.py in the repo.
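The smoothing-and-debounce step above can be sketched as follows. This is illustrative only, not the code in scripts/standalone_inference.py; it assumes the per-window p(drone) values have already been computed, and uses a simple 3-tap median filter.

```python
import numpy as np

def median3(probs: np.ndarray) -> np.ndarray:
    """3-tap median filter with edge padding, to suppress one-window spikes."""
    padded = np.pad(probs, 1, mode="edge")
    return np.array([np.median(padded[i:i + 3]) for i in range(len(probs))])

def detect(probs: np.ndarray, threshold: float = 0.5, n_consecutive: int = 3) -> bool:
    """True once n_consecutive smoothed windows in a row exceed threshold."""
    run = 0
    for p in median3(probs):
        run = run + 1 if p > threshold else 0
        if run >= n_consecutive:
            return True
    return False

# A lone spike is rejected; a sustained run of high-probability windows fires.
print(detect(np.array([0.1, 0.9, 0.1, 0.1, 0.1])))       # False
print(detect(np.array([0.2, 0.8, 0.9, 0.9, 0.8, 0.3])))  # True
```

Raising n_consecutive trades detection latency (roughly 0.5 s per extra window at a 0.5 s hop) for fewer false alarms.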

Trained on 1.0 s windows at 16 kHz mono. It may produce false positives on other rotor-like sounds.

Model size: 86.2M params (F32 safetensors)