---
license: mit
tags:
- computer-vision
- anomaly-detection
- deep-svdd
- ai-generated-images
- image-classification
- pytorch-lightning
datasets:
- cifar10
library_name: pytorch-lightning
pipeline_tag: image-classification
---
# 🔍 AI Image Detector - Deep SVDD
<div align="center">
**One-Class Deep Learning Model for Detecting AI-Generated Images**
[![Model](https://img.shields.io/badge/Model-Deep%20SVDD-blue)](https://huggingface.co/ash12321/ai-image-detector-deepsvdd)
[![Framework](https://img.shields.io/badge/Framework-PyTorch%20Lightning-red)](https://lightning.ai/)
[![Dataset](https://img.shields.io/badge/Dataset-CIFAR--10-green)](https://www.cs.toronto.edu/~kriz/cifar.html)
</div>
## 📖 Model Description
This model detects AI-generated images using **Deep Support Vector Data Description (SVDD)**, a one-class learning approach. It was trained exclusively on real images to learn what "real" looks like, allowing it to identify synthetic/AI-generated images as anomalies.
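A minimal sketch of the one-class decision rule described here, assuming the usual Deep SVDD setup: an encoder maps each image to an embedding, and the anomaly score is the Euclidean distance from that embedding to a fixed center. The function names, the toy 2-D embeddings, and the radius of 1.0 are illustrative and are not taken from the released model.

```python
import math


def distance_to_center(embedding, center):
    """Euclidean distance between an embedding and the hypersphere center."""
    return math.sqrt(sum((z - c) ** 2 for z, c in zip(embedding, center)))


def is_anomaly(embedding, center, radius):
    """Flag an embedding as anomalous when it lies outside the radius."""
    return distance_to_center(embedding, center) > radius


# Toy values (hypothetical): a 2-D latent space with the center at the origin.
center = [0.0, 0.0]
radius = 1.0
inside = [0.6, 0.0]   # close to the center, treated as "real"
outside = [3.0, 4.0]  # distance 5.0, treated as anomalous

print(is_anomaly(inside, center, radius))   # False
print(is_anomaly(outside, center, radius))  # True
```

In the real model the embedding would come from the trained encoder and the radius from the training run, but the comparison itself is this simple.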
### Key Features
- ✅ **Enhanced Deep SVDD Architecture** with channel attention mechanisms
- ✅ **Trained on 35,000 real images** from CIFAR-10 dataset
- ✅ **L4 GPU Optimized** with mixed precision training (16-bit)
- ✅ **Advanced Augmentation**: Mixup, multi-scale, contrastive learning
- ✅ **Robust Evaluation**: 70/15/15 train/val/test split with unseen test data
## 🎯 Performance Metrics
| Metric | Value |
|--------|-------|
| **Test Loss** | 0.7637 |
| **Mean Distance** | 0.7637 |
| **Std Distance** | 0.0024 |
| **95th Percentile** | 0.7700 |
| **Radius Threshold** | 0.7747 |
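To put the table in context, the short calculation below relates the radius threshold to the distance statistics. It uses only the numbers reported above and assumes they all describe the same set of held-out real test images; the variable names are illustrative.

```python
# Values copied from the metrics table above.
mean_distance = 0.7637
std_distance = 0.0024
percentile_95 = 0.7700
radius = 0.7747

# How far the threshold sits above the typical real-image distance.
margin = radius - mean_distance
margin_in_std = margin / std_distance

print(f"Margin above mean: {margin:.4f}")            # Margin above mean: 0.0110
print(f"Margin in std units: {margin_in_std:.2f}")   # Margin in std units: 4.58
print(f"95th percentile below radius: {percentile_95 < radius}")  # True
```

Under that assumption, the 95th percentile lying below the radius suggests that at most about 5% of the held-out real images would be flagged at this threshold.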
## 🚀 Quick Start
### Installation
```bash
pip install torch torchvision pytorch-lightning huggingface-hub pillow
```
### Basic Usage
```python
import torch
import torchvision.transforms as transforms
from huggingface_hub import hf_hub_download
from PIL import Image

# You need the model class definition; this import assumes a local
# model.py that defines AdvancedDeepSVDD.
from model import AdvancedDeepSVDD

# Download the checkpoint from the Hugging Face Hub
model_path = hf_hub_download(
    repo_id="ash12321/ai-image-detector-deepsvdd",
    filename="model.ckpt",
)

# Load the checkpoint on CPU and switch to inference mode
model = AdvancedDeepSVDD.load_from_checkpoint(model_path, map_location="cpu")
model.eval()

# Preprocess as in validation: resize to 32x32, normalize with CIFAR-10 statistics
transform = transforms.Compose([
    transforms.Resize((32, 32)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.4914, 0.4822, 0.4465],
        std=[0.2470, 0.2435, 0.2616],
    ),
])

image = Image.open("test_image.jpg").convert("RGB")
image_tensor = transform(image).unsqueeze(0)  # add a batch dimension

# Predict without tracking gradients
with torch.no_grad():
    is_fake, scores, distances = model.predict_anomaly(image_tensor)

print(f"AI-Generated: {is_fake[0].item()}")
print(f"Confidence: {scores[0].item()*100:.1f}%")
print(f"Anomaly Score: {scores[0].item():.4f}")
```
### Using with Gradio
```python
import gradio as gr

# Reuses `torch`, `model`, and `transform` from the Basic Usage example.
def predict(image):
    # Uploaded PNGs may carry an alpha channel; the model expects 3-channel RGB.
    img_tensor = transform(image.convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        is_fake, scores, _ = model.predict_anomaly(img_tensor)
    result = "🚨 AI-Generated" if is_fake[0] else "✅ Real Image"
    confidence = f"{scores[0].item()*100:.1f}%"
    return f"**{result}** (Confidence: {confidence})"

demo = gr.Interface(
    fn=predict,
    inputs=gr.Image(type="pil"),
    outputs=gr.Markdown(),
    title="AI Image Detector",
)
demo.launch()
```
## 🏗️ Architecture Details
### Enhanced Deep SVDD Encoder
```
Input (3x32x32)
→ Stem Conv (64 channels)
→ Layer1 (64→128) + Channel Attention
→ Layer2 (128→256) + Channel Attention
→ Layer3 (256→512) + Channel Attention
→ Dual Pooling (Avg + Max)
→ Projection Head (1024→512→128)
→ Output (128-dim latent space)
```
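The diagram above can be sketched in PyTorch as follows. Only the channel widths, the dual pooling, and the 1024→512→128 projection head come from the diagram; the kernel sizes, strides, batch normalization, ReLU activations, and the squeeze-and-excitation style of channel attention (with a reduction factor of 16) are assumptions, and the class names are hypothetical. This is not the released `AdvancedDeepSVDD` implementation, and its parameter count need not match the reported figure.

```python
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style gate (assumed form of channel attention)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # Rescale each channel by a learned weight in (0, 1).
        return x * self.gate(x)


def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    # A stride-2 convolution halves the spatial size before attention is applied.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        ChannelAttention(out_ch),
    )


class EncoderSketch(nn.Module):
    def __init__(self, latent_dim: int = 128):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
        )
        self.layers = nn.Sequential(
            conv_block(64, 128), conv_block(128, 256), conv_block(256, 512)
        )
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.max_pool = nn.AdaptiveMaxPool2d(1)
        self.head = nn.Sequential(
            nn.Linear(1024, 512), nn.ReLU(inplace=True), nn.Linear(512, latent_dim)
        )

    def forward(self, x):
        x = self.layers(self.stem(x))
        # Dual pooling: concatenate average- and max-pooled features (512 + 512).
        pooled = torch.cat([self.avg_pool(x), self.max_pool(x)], dim=1).flatten(1)
        return self.head(pooled)


encoder = EncoderSketch().eval()
with torch.no_grad():
    z = encoder(torch.randn(2, 3, 32, 32))
print(z.shape)  # torch.Size([2, 128])
```

Concatenating the two pooled vectors is what produces the 1024-dimensional input to the projection head.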
### Training Optimizations
- **Optimizer**: AdamW (lr=1e-3, weight_decay=1e-3)
- **Scheduler**: OneCycleLR with cosine annealing
- **Batch Size**: 128 (L4 GPU optimized)
- **Augmentation**: Mixup (α=0.2), multi-scale, extensive transforms
- **Loss**: SVDD objective + contrastive diversity + L2 regularization
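The optimizer and scheduler settings listed above could be wired together roughly as below. The stand-in `nn.Linear` model, the use of 1e-3 as the peak learning rate (`max_lr`), and the step count derived from 35,000 training samples without dropping the last partial batch are all assumptions; the card does not state them.

```python
import math

import torch
import torch.nn as nn

model = nn.Linear(128, 128)  # hypothetical stand-in for the real encoder

# Assumed schedule length: 30 epochs over 35,000 samples at batch size 128.
steps_per_epoch = math.ceil(35_000 / 128)
total_steps = 30 * steps_per_epoch

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-3)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer,
    max_lr=1e-3,            # assumption: the listed lr is the peak of the cycle
    total_steps=total_steps,
    anneal_strategy="cos",  # cosine annealing, as listed above
)

print(steps_per_epoch, total_steps)  # 274 8220
```

In PyTorch Lightning, a pair like this would typically be returned from `configure_optimizers` with the scheduler's `interval` set to `"step"`, since OneCycleLR expects one update per batch.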
## 📊 Training Configuration
```
Model Parameters: 5.3M trainable
Epochs: 30
Training Samples: 35,000 (70%)
Validation Samples: 7,500 (15%)
Test Samples: 7,500 (15%)
Precision: 16-bit mixed precision
GPU: NVIDIA L4 with Tensor Cores
```
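The sample counts above add up to 50,000 images. A minimal sketch of how a 70/15/15 split with disjoint subsets could be produced is shown below; it assumes the pool is the 50,000-image CIFAR-10 training set, and the seed and variable names are hypothetical.

```python
import random

TOTAL_IMAGES = 50_000  # assumption: the CIFAR-10 training set
SPLIT_PERCENT = {"train": 70, "val": 15, "test": 15}

# Integer arithmetic avoids floating-point rounding in the split sizes.
sizes = {name: TOTAL_IMAGES * pct // 100 for name, pct in SPLIT_PERCENT.items()}

# Shuffle the indices once with a fixed (hypothetical) seed, then slice.
indices = list(range(TOTAL_IMAGES))
random.Random(42).shuffle(indices)

train_end = sizes["train"]
val_end = train_end + sizes["val"]
train_idx = indices[:train_end]
val_idx = indices[train_end:val_end]
test_idx = indices[val_end:]

print(sizes)  # {'train': 35000, 'val': 7500, 'test': 7500}
```

Slicing a single shuffled index list guarantees that no test image is seen during training or validation.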
## 🎨 Data Augmentation Pipeline
**Training Augmentations:**
- Multi-scale resizing (32, 64, 96 pixels)
- Random resized crop (scale: 0.5-1.0)
- Random horizontal/vertical flips
- Random rotation (±20°)
- Color jitter (brightness, contrast, saturation, hue)
- Gaussian blur
- Random erasing
- Mixup augmentation
**Validation/Test:**
- Simple resize to 32x32
- Normalize with CIFAR-10 statistics
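The Mixup step in the training list can be sketched as below, using the α=0.2 value given under Training Optimizations. Since one-class training has no labels to mix, this version blends images only, which is an assumption about how Mixup was applied here; the function name and the random batch are hypothetical.

```python
import torch


def mixup_batch(images: torch.Tensor, alpha: float = 0.2) -> torch.Tensor:
    """Blend each image with a randomly chosen partner from the same batch."""
    # One mixing coefficient per batch, drawn from Beta(alpha, alpha).
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    partner = torch.randperm(images.size(0))
    return lam * images + (1.0 - lam) * images[partner]


torch.manual_seed(0)
batch = torch.rand(8, 3, 32, 32)  # stand-in for a batch of training images
mixed = mixup_batch(batch)
print(mixed.shape)  # torch.Size([8, 3, 32, 32])
```

With a small α such as 0.2, the Beta distribution concentrates near 0 and 1, so most mixed images stay close to one of the two originals.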
## 💡 Use Cases
- **Content Moderation**: Identify AI-generated images in uploads
- **Digital Forensics**: Verify authenticity of images
- **Research**: Study differences between real and synthetic images
- **Education**: Demonstrate one-class learning techniques
## ⚠️ Limitations
- **Training Domain**: Optimized for natural images similar to CIFAR-10
- **Image Size**: Trained on 32x32 images; larger images must be resized to 32x32 before inference
- **Generalization**: May require fine-tuning for specific domains
- **False Positives**: Unusual real images may be flagged as AI-generated
- **Not Foolproof**: Sophisticated AI images may evade detection
## 📚 Citation
If you use this model in your research, please cite:
```bibtex
@misc{ai-image-detector-deepsvdd-2024,
author = {ash12321},
title = {AI Image Detector using Deep SVDD},
year = {2024},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/ash12321/ai-image-detector-deepsvdd}},
}
```
## 📄 License
This model is released under the MIT License.
## 🤝 Contributing
Contributions, issues, and feature requests are welcome!
## 👤 Author
**ash12321**
- Hugging Face: [@ash12321](https://huggingface.co/ash12321)
## 🙏 Acknowledgments
- CIFAR-10 dataset creators
- PyTorch Lightning team
- Deep SVDD paper authors
- Hugging Face for hosting infrastructure
---
<div align="center">
**[Try it on Hugging Face Spaces](https://huggingface.co/spaces/ash12321/ai-image-detector-demo)**
</div>