---
license: mit
tags:
- computer-vision
- anomaly-detection
- deep-svdd
- ai-generated-images
- image-classification
- pytorch-lightning
datasets:
- cifar10
library_name: pytorch-lightning
pipeline_tag: image-classification
---

# AI Image Detector - Deep SVDD

<div align="center">

**One-Class Deep Learning Model for Detecting AI-Generated Images**

[Model on Hugging Face](https://huggingface.co/ash12321/ai-image-detector-deepsvdd)
[PyTorch Lightning](https://lightning.ai/)
[CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html)

</div>

## Model Description

This model detects AI-generated images using **Deep Support Vector Data Description (Deep SVDD)**, a one-class learning approach. It was trained exclusively on real images to learn what "real" looks like, so synthetic or AI-generated images that fall outside that learned region are flagged as anomalies.

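For reference, the standard one-class Deep SVDD objective can be written as below. This card does not spell out the exact loss, so treat this as the textbook formulation rather than this model's precise training objective. The symbols are assumptions introduced here: $\phi(\cdot;\mathcal{W})$ is the encoder with weights $\mathcal{W} = \{W^1, \dots, W^L\}$, $c$ is the hypersphere center, $n$ is the number of training images, and $\lambda$ is the weight-decay coefficient.

$$
\min_{\mathcal{W}} \; \frac{1}{n} \sum_{i=1}^{n} \left\lVert \phi(x_i; \mathcal{W}) - c \right\rVert^2 + \frac{\lambda}{2} \sum_{\ell=1}^{L} \left\lVert W^{\ell} \right\rVert_F^2
$$

At inference time, an image $x$ is scored by its distance to the center, $s(x) = \lVert \phi(x; \mathcal{W}) - c \rVert$, and flagged as an anomaly when $s(x)$ exceeds a radius threshold.
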
### Key Features

- ✅ **Enhanced Deep SVDD Architecture** with channel attention mechanisms
- ✅ **Trained on 35,000 real images** from the CIFAR-10 dataset
- ✅ **L4 GPU Optimized** with mixed precision training (16-bit)
- ✅ **Advanced Augmentation**: Mixup, multi-scale, contrastive learning
- ✅ **Robust Evaluation**: 70/15/15 train/val/test split with unseen test data

## 🎯 Performance Metrics

| Metric | Value |
|--------|-------|
| **Test Loss** | 0.7637 |
| **Mean Distance** | 0.7637 |
| **Std Distance** | 0.0024 |
| **95th Percentile** | 0.7700 |
| **Radius Threshold** | 0.7747 |

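As a rough illustration of how distance statistics like these can be computed and used, the sketch below summarises a batch of distances and applies the radius threshold from the table. The distances are synthetic stand-ins drawn around the reported mean, and `flag_anomalies` is a hypothetical helper, not part of the released model. How the radius of 0.7747 was chosen is not stated in this card.

```python
import torch

torch.manual_seed(0)

# Synthetic stand-in for distances of held-out real images to the hypersphere
# center. In practice these would come from the trained model.
distances = 0.7637 + 0.0024 * torch.randn(7500)

stats = {
    "mean": distances.mean().item(),
    "std": distances.std().item(),
    "p95": torch.quantile(distances, 0.95).item(),
}

# Radius threshold as reported in the table (its derivation is not documented).
RADIUS_THRESHOLD = 0.7747


def flag_anomalies(d: torch.Tensor, radius: float = RADIUS_THRESHOLD) -> torch.Tensor:
    """Hypothetical helper: True where a distance exceeds the radius."""
    return d > radius


# Only the last distance lies outside the radius.
mask = flag_anomalies(torch.tensor([0.7630, 0.7700, 0.7800]))
print(stats)
print(mask)
```

Images whose distance falls inside the radius are treated as real; anything outside is reported as a likely AI-generated image.
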
## Quick Start

### Installation

```bash
pip install torch torchvision pytorch-lightning huggingface-hub pillow
```

### Basic Usage

```python
import torch
from huggingface_hub import hf_hub_download
from PIL import Image
import torchvision.transforms as transforms

# The checkpoint needs the model class definition; this assumes a local
# model.py that defines AdvancedDeepSVDD.
from model import AdvancedDeepSVDD

# Download the checkpoint from the Hugging Face Hub
model_path = hf_hub_download(
    repo_id="ash12321/ai-image-detector-deepsvdd",
    filename="model.ckpt"
)

# Load on CPU so the model and the input tensor share a device
model = AdvancedDeepSVDD.load_from_checkpoint(model_path, map_location="cpu")
model.eval()

# Prepare image: resize to 32x32 and normalize with CIFAR-10 statistics
transform = transforms.Compose([
    transforms.Resize((32, 32)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.4914, 0.4822, 0.4465],
        std=[0.2470, 0.2435, 0.2616]
    )
])

image = Image.open('test_image.jpg').convert('RGB')
image_tensor = transform(image).unsqueeze(0)  # add batch dimension

# Predict without tracking gradients
with torch.no_grad():
    is_fake, scores, distances = model.predict_anomaly(image_tensor)

print(f"AI-Generated: {is_fake[0].item()}")
print(f"Confidence: {scores[0].item()*100:.1f}%")
print(f"Anomaly Score: {scores[0].item():.4f}")
```

### Using with Gradio

```python
# Requires: pip install gradio
# Reuses `model` and `transform` from the Basic Usage example above.
import gradio as gr
import torch

def predict(image):
    # Gradio passes None when no image has been uploaded
    if image is None:
        return "Please upload an image."

    img_tensor = transform(image.convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        is_fake, scores, _ = model.predict_anomaly(img_tensor)

    result = "🚨 AI-Generated" if is_fake[0] else "✅ Real Image"
    confidence = f"{scores[0].item()*100:.1f}%"

    return f"**{result}** (Confidence: {confidence})"

demo = gr.Interface(
    fn=predict,
    inputs=gr.Image(type="pil"),
    outputs=gr.Markdown(),
    title="AI Image Detector"
)

demo.launch()
```

## 🏗️ Architecture Details

### Enhanced Deep SVDD Encoder

```
Input (3x32x32)
↓ Stem Conv (64 channels)
↓ Layer1 (64→128) + Channel Attention
↓ Layer2 (128→256) + Channel Attention
↓ Layer3 (256→512) + Channel Attention
↓ Dual Pooling (Avg + Max)
↓ Projection Head (1024→512→128)
↓ Output (128-dim latent space)
```

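The diagram above could be realised roughly as follows. This is a minimal sketch, not the released `AdvancedDeepSVDD` implementation: the squeeze-and-excitation style attention, the kernel sizes, the strides, and the names `ChannelAttention`, `conv_block`, and `EncoderSketch` are all assumptions chosen only to reproduce the channel widths and the 1024→512→128 projection head.

```python
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style gate (assumed design, not from the card)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # Rescale each channel by a learned weight in (0, 1)
        return x * self.gate(x)


def conv_block(c_in: int, c_out: int) -> nn.Sequential:
    # Strided 3x3 conv halves the spatial size (stride and kernel are assumptions)
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, stride=2, padding=1, bias=False),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
        ChannelAttention(c_out),
    )


class EncoderSketch(nn.Module):
    def __init__(self, latent_dim: int = 128):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1, bias=False),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
        )
        self.layers = nn.Sequential(
            conv_block(64, 128), conv_block(128, 256), conv_block(256, 512)
        )
        self.avg = nn.AdaptiveAvgPool2d(1)
        self.max = nn.AdaptiveMaxPool2d(1)
        self.head = nn.Sequential(
            nn.Linear(1024, 512), nn.ReLU(inplace=True), nn.Linear(512, latent_dim)
        )

    def forward(self, x):
        x = self.layers(self.stem(x))
        # Dual pooling: concatenate average- and max-pooled features (512 + 512)
        pooled = torch.cat([self.avg(x).flatten(1), self.max(x).flatten(1)], dim=1)
        return self.head(pooled)


z = EncoderSketch().eval()(torch.randn(2, 3, 32, 32))
print(z.shape)  # torch.Size([2, 128])
```

Concatenating the average- and max-pooled 512-channel features is what produces the 1024-dimensional input to the projection head.
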
### Training Optimizations

- **Optimizer**: AdamW (lr=1e-3, weight_decay=1e-3)
- **Scheduler**: OneCycleLR with cosine annealing
- **Batch Size**: 128 (L4 GPU optimized)
- **Augmentation**: Mixup (α=0.2), multi-scale, extensive transforms
- **Loss**: SVDD objective + contrastive diversity + L2 regularization

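The optimizer and scheduler settings listed above might be wired up as in the sketch below. The placeholder `nn.Linear` module stands in for the encoder, and `max_lr` is assumed to equal the listed learning rate; the card does not state the peak rate or any other OneCycleLR parameters. The step count uses the 35,000 training samples and 30 epochs reported in this card.

```python
import math

import torch
import torch.nn as nn

EPOCHS, BATCH_SIZE, TRAIN_SAMPLES = 30, 128, 35_000
steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)  # 274 batches per epoch

model = nn.Linear(8, 4)  # placeholder module standing in for the encoder

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-3)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer,
    max_lr=1e-3,  # assumed equal to the listed lr
    epochs=EPOCHS,
    steps_per_epoch=steps_per_epoch,
    anneal_strategy="cos",  # cosine annealing, as listed above
)

print(steps_per_epoch, scheduler.total_steps)  # 274 8220
```

Because OneCycleLR is defined per batch, in PyTorch Lightning the scheduler would typically be returned from `configure_optimizers` with `"interval": "step"` so that it advances after every optimizer step.
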
## Training Configuration

```
Model Parameters: 5.3M trainable
Epochs: 30
Training Samples: 35,000 (70%)
Validation Samples: 7,500 (15%)
Test Samples: 7,500 (15%)
Precision: 16-bit mixed precision
GPU: NVIDIA L4 with Tensor Cores
```

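The sample counts above are consistent with a 70/15/15 split of 50,000 images, which is the size of the CIFAR-10 training set. A minimal sketch of such a split is shown below; the placeholder index dataset and the seed are assumptions, and the card does not say which images or which splitting routine were actually used.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Placeholder dataset holding one index per image (50,000 = CIFAR-10 train size)
dataset = TensorDataset(torch.arange(50_000))

generator = torch.Generator().manual_seed(42)  # arbitrary seed for reproducibility
train_set, val_set, test_set = random_split(
    dataset, [35_000, 7_500, 7_500], generator=generator
)

print(len(train_set), len(val_set), len(test_set))  # 35000 7500 7500
```

Fixing the generator seed keeps the test subset disjoint from the training subset across runs, which is what makes the test data "unseen".
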
## 🎨 Data Augmentation Pipeline

**Training Augmentations:**
- Multi-scale resizing (32, 64, 96 pixels)
- Random resized crop (scale: 0.5-1.0)
- Random horizontal/vertical flips
- Random rotation (±20°)
- Color jitter (brightness, contrast, saturation, hue)
- Gaussian blur
- Random erasing
- Mixup augmentation

**Validation/Test:**
- Simple resize to 32x32
- Normalize with CIFAR-10 statistics

## 💡 Use Cases

- **Content Moderation**: Identify AI-generated images in uploads
- **Digital Forensics**: Verify authenticity of images
- **Research**: Study differences between real and synthetic images
- **Education**: Demonstrate one-class learning techniques

## ⚠️ Limitations

- **Training Domain**: Optimized for natural images similar to CIFAR-10
- **Image Size**: Trained on 32x32 images; larger inputs must be resized to 32x32 first
- **Generalization**: May require fine-tuning for specific domains
- **False Positives**: Unusual real images may be flagged as AI-generated
- **Not Foolproof**: Sophisticated AI images may evade detection

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{ai-image-detector-deepsvdd-2024,
  author = {ash12321},
  title = {AI Image Detector using Deep SVDD},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/ash12321/ai-image-detector-deepsvdd}},
}
```

## License

This model is released under the MIT License.

## 🤝 Contributing

Contributions, issues, and feature requests are welcome!

## 👤 Author

**ash12321**
- Hugging Face: [@ash12321](https://huggingface.co/ash12321)

## Acknowledgments

- CIFAR-10 dataset creators
- PyTorch Lightning team
- Deep SVDD paper authors
- Hugging Face for hosting infrastructure

---

<div align="center">

**[Try it on Hugging Face Spaces](https://huggingface.co/spaces/ash12321/ai-image-detector-demo)**

</div>