ash12321 committed on
Commit 75aaabf · verified · 1 Parent(s): e8d4049

Upload AI Image Detector model

Files changed (4)
  1. README.md +229 -0
  2. inference.py +55 -0
  3. model.ckpt +3 -0
  4. requirements.txt +6 -0
README.md ADDED
@@ -0,0 +1,229 @@
+ ---
+ license: mit
+ tags:
+ - computer-vision
+ - anomaly-detection
+ - deep-svdd
+ - ai-generated-images
+ - image-classification
+ - pytorch-lightning
+ datasets:
+ - cifar10
+ library_name: pytorch-lightning
+ pipeline_tag: image-classification
+ ---
+
+ # 🔍 AI Image Detector - Deep SVDD
+
+ <div align="center">
+
+ **One-Class Deep Learning Model for Detecting AI-Generated Images**
+
+ [![Model](https://img.shields.io/badge/Model-Deep%20SVDD-blue)](https://huggingface.co/ash12321/ai-image-detector-deepsvdd)
+ [![Framework](https://img.shields.io/badge/Framework-PyTorch%20Lightning-red)](https://lightning.ai/)
+ [![Dataset](https://img.shields.io/badge/Dataset-CIFAR--10-green)](https://www.cs.toronto.edu/~kriz/cifar.html)
+
+ </div>
+
+ ## 📖 Model Description
+
+ This model detects AI-generated images using **Deep Support Vector Data Description (SVDD)**, a one-class learning approach. It was trained exclusively on real images to learn what "real" looks like, allowing it to identify synthetic/AI-generated images as anomalies.
+
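To make the one-class idea concrete, here is a minimal sketch of how Deep SVDD typically scores a sample: an encoder maps the image to an embedding, and the anomaly score is the squared Euclidean distance of that embedding from a fixed center. The helper `svdd_score` and the 3-dimensional vectors are hypothetical illustrations (the README describes a 128-dim latent space), and the README does not state whether this model reports squared or unsquared distances.

```python
def svdd_score(embedding, center):
    """Squared Euclidean distance between an embedding and the hypersphere center."""
    if len(embedding) != len(center):
        raise ValueError("embedding and center must have the same dimension")
    return sum((e - c) ** 2 for e, c in zip(embedding, center))


# Made-up embeddings for illustration only (not produced by the real encoder).
center = [0.0, 0.0, 0.0]
typical = [0.1, -0.1, 0.0]   # close to the center: looks like the training data
outlier = [1.5, -2.0, 0.5]   # far from the center: treated as anomalous

print(svdd_score(typical, center))
print(svdd_score(outlier, center))
```

Samples that resemble the training data should land near the center, so a larger score means "less like the real images seen during training".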
+ ### Key Features
+
+ - ✅ **Enhanced Deep SVDD Architecture** with channel attention mechanisms
+ - ✅ **Trained on 35,000 real images** from CIFAR-10 dataset
+ - ✅ **L4 GPU Optimized** with mixed precision training (16-bit)
+ - ✅ **Advanced Augmentation**: Mixup, multi-scale, contrastive learning
+ - ✅ **Robust Evaluation**: 70/15/15 train/val/test split with unseen test data
+
+ ## 🎯 Performance Metrics
+
+ | Metric | Value |
+ |--------|-------|
+ | **Test Loss** | 0.7637 |
+ | **Mean Distance** | 0.7637 |
+ | **Std Distance** | 0.0024 |
+ | **95th Percentile** | 0.7700 |
+ | **Radius Threshold** | 0.7747 |
+
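The numbers in the table can be read as a decision rule. The sketch below assumes, without confirmation from the source, that an image is flagged when its distance exceeds the radius threshold; `is_flagged` and `z_score` are hypothetical helpers, while the constants are the values reported above.

```python
MEAN_DISTANCE = 0.7637      # reported mean distance on real test images
STD_DISTANCE = 0.0024       # reported standard deviation
RADIUS_THRESHOLD = 0.7747   # reported radius threshold


def is_flagged(distance, radius=RADIUS_THRESHOLD):
    """Assumed rule: distances beyond the radius count as anomalous."""
    return distance > radius


def z_score(distance, mean=MEAN_DISTANCE, std=STD_DISTANCE):
    """How many standard deviations a distance lies from the reported mean."""
    return (distance - mean) / std


print(is_flagged(MEAN_DISTANCE))           # a typical real image is not flagged
print(round(z_score(RADIUS_THRESHOLD), 2))
```

Under this reading the threshold sits roughly 4.6 standard deviations above the mean distance of real images, so the margin between "typical real" and "flagged" is about 0.011 in absolute terms.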
+ ## 🚀 Quick Start
+
+ ### Installation
+
+ ```bash
+ pip install torch torchvision pytorch-lightning huggingface-hub pillow
+ ```
+
+ ### Basic Usage
+
+ ```python
+ import torch
+ from huggingface_hub import hf_hub_download
+ from PIL import Image
+ import torchvision.transforms as transforms
+
+ # Download model
+ model_path = hf_hub_download(
+     repo_id="ash12321/ai-image-detector-deepsvdd",
+     filename="model.ckpt"
+ )
+
+ # Load model (you'll need the model class definition)
+ from model import AdvancedDeepSVDD
+
+ model = AdvancedDeepSVDD.load_from_checkpoint(model_path)
+ model.eval()
+
+ # Prepare image: resize to the 32x32 training resolution, normalize with CIFAR-10 statistics
+ transform = transforms.Compose([
+     transforms.Resize((32, 32)),
+     transforms.ToTensor(),
+     transforms.Normalize(
+         mean=[0.4914, 0.4822, 0.4465],
+         std=[0.2470, 0.2435, 0.2616]
+     )
+ ])
+
+ image = Image.open('test_image.jpg').convert('RGB')
+ image_tensor = transform(image).unsqueeze(0)  # add batch dimension
+
+ # Predict (no gradients needed at inference time)
+ with torch.no_grad():
+     is_fake, scores, distances = model.predict_anomaly(image_tensor)
+
+ print(f"AI-Generated: {is_fake[0].item()}")
+ print(f"Confidence: {scores[0].item()*100:.1f}%")
+ print(f"Anomaly Score: {scores[0].item():.4f}")
+ ```
+
+ ### Using with Gradio
+
+ ```python
+ import gradio as gr
+ import torch
+
+ # Reuses `model` and `transform` from the Basic Usage example
+ def predict(image):
+     img_tensor = transform(image).unsqueeze(0)
+     with torch.no_grad():
+         is_fake, scores, _ = model.predict_anomaly(img_tensor)
+
+     result = "🚨 AI-Generated" if is_fake[0] else "✅ Real Image"
+     confidence = f"{scores[0].item()*100:.1f}%"
+
+     return f"**{result}** (Confidence: {confidence})"
+
+ demo = gr.Interface(
+     fn=predict,
+     inputs=gr.Image(type="pil"),
+     outputs=gr.Markdown(),
+     title="AI Image Detector"
+ )
+
+ demo.launch()
+ ```
+
+ ## 🏗️ Architecture Details
+
+ ### Enhanced Deep SVDD Encoder
+
+ ```
+ Input (3x32x32)
+   → Stem Conv (64 channels)
+   → Layer1 (64→128) + Channel Attention
+   → Layer2 (128→256) + Channel Attention
+   → Layer3 (256→512) + Channel Attention
+   → Dual Pooling (Avg + Max)
+   → Projection Head (1024→512→128)
+   → Output (128-dim latent space)
+ ```
+
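The jump from 512 channels to a 1024-wide projection input is consistent with concatenating global average pooling and global max pooling, 512 values each. The pure-Python sketch below illustrates that reading on a tiny feature map; `dual_pool` is a hypothetical helper, and the concatenation order is an assumption rather than something the README states.

```python
def dual_pool(feature_map):
    """Global average + global max pooling over each channel.

    feature_map: list of C channels, each a list of H rows of W floats.
    Returns a flat list of length 2*C: all channel averages, then all channel maxima.
    """
    averages, maxima = [], []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        averages.append(sum(values) / len(values))
        maxima.append(max(values))
    return averages + maxima


# Two 2x2 channels, made up for illustration.
tiny = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 8.0]],
]
print(dual_pool(tiny))  # [2.5, 2.0, 4.0, 8.0]
```

With 512 channels in, the same function returns 1024 values, which matches the first dimension of the projection head listed above.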
+ ### Training Optimizations
+
+ - **Optimizer**: AdamW (lr=1e-3, weight_decay=1e-3)
+ - **Scheduler**: OneCycleLR with cosine annealing
+ - **Batch Size**: 128 (L4 GPU optimized)
+ - **Augmentation**: Mixup (α=0.2), multi-scale, extensive transforms
+ - **Loss**: SVDD objective + contrastive diversity + L2 regularization
+
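As an illustration of the Mixup setting listed above, the sketch below blends two inputs with a coefficient drawn from Beta(α, α) with α = 0.2. It mixes flat lists of pixel values only; how the training code applies the mixed samples in a one-class setup is not described in the README, and `mixup` is a hypothetical helper.

```python
import random


def mixup(x1, x2, alpha=0.2, rng=random):
    """Blend two equally sized inputs with lam ~ Beta(alpha, alpha)."""
    if len(x1) != len(x2):
        raise ValueError("inputs must have the same length")
    lam = rng.betavariate(alpha, alpha)
    mixed = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    return mixed, lam


rng = random.Random(0)            # seeded for reproducibility
dark = [0.0, 0.0, 0.0]            # made-up pixel values
bright = [1.0, 1.0, 1.0]
mixed, lam = mixup(dark, bright, rng=rng)
print(lam, mixed)
```

With a small α such as 0.2, the Beta distribution concentrates near 0 and 1, so most mixed samples stay close to one of the two originals.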
+ ## 📊 Training Configuration
+
+ ```
+ Model Parameters: 5.3M trainable
+ Epochs: 30
+ Training Samples: 35,000 (70%)
+ Validation Samples: 7,500 (15%)
+ Test Samples: 7,500 (15%)
+ Precision: 16-bit mixed precision
+ GPU: NVIDIA L4 with Tensor Cores
+ ```
+
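The sample counts above imply a pool of 50,000 images split 70/15/15, which is consistent with the size of the CIFAR-10 training set, although the README does not say which images were used. A small sketch of the arithmetic, with `split_sizes` as a hypothetical helper:

```python
def split_sizes(total, percents=(70, 15, 15)):
    """Split `total` items by integer percentages; any remainder goes to the first split."""
    if sum(percents) != 100:
        raise ValueError("percentages must sum to 100")
    sizes = [total * p // 100 for p in percents]
    sizes[0] += total - sum(sizes)
    return tuple(sizes)


print(split_sizes(50_000))  # (35000, 7500, 7500)
```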
+ ## 🎨 Data Augmentation Pipeline
+
+ **Training Augmentations:**
+ - Multi-scale resizing (32, 64, 96 pixels)
+ - Random resized crop (scale: 0.5-1.0)
+ - Random horizontal/vertical flips
+ - Random rotation (±20°)
+ - Color jitter (brightness, contrast, saturation, hue)
+ - Gaussian blur
+ - Random erasing
+ - Mixup augmentation
+
+ **Validation/Test:**
+ - Simple resize to 32x32
+ - Normalize with CIFAR-10 statistics
+
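For reference, normalizing with CIFAR-10 statistics means subtracting a per-channel mean and dividing by a per-channel standard deviation after pixels are scaled to [0, 1]. The sketch below applies that formula to a single RGB pixel using the mean/std values from the usage code; `normalize_pixel` is a hypothetical helper written without torchvision.

```python
CIFAR10_MEAN = (0.4914, 0.4822, 0.4465)
CIFAR10_STD = (0.2470, 0.2435, 0.2616)


def normalize_pixel(rgb, mean=CIFAR10_MEAN, std=CIFAR10_STD):
    """Apply (value - mean) / std per channel to an RGB pixel in [0, 1]."""
    return tuple((v - m) / s for v, m, s in zip(rgb, mean, std))


print(normalize_pixel(CIFAR10_MEAN))       # the mean color maps to (0.0, 0.0, 0.0)
print(normalize_pixel((1.0, 1.0, 1.0)))    # pure white maps to positive values
```

Inference inputs need the same statistics as training; feeding un-normalized pixels would shift every embedding and make the reported distance threshold meaningless.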
+ ## 💡 Use Cases
+
+ - **Content Moderation**: Identify AI-generated images in uploads
+ - **Digital Forensics**: Verify authenticity of images
+ - **Research**: Study differences between real and synthetic images
+ - **Education**: Demonstrate one-class learning techniques
+
+ ## ⚠️ Limitations
+
+ - **Training Domain**: Optimized for natural images similar to CIFAR-10
+ - **Image Size**: Trained on 32x32 images; larger images must be resized to 32x32 before inference
+ - **Generalization**: May require fine-tuning for specific domains
+ - **False Positives**: Unusual real images may be flagged as AI-generated
+ - **Not Foolproof**: Sophisticated AI images may evade detection
+
+ ## 📚 Citation
+
+ If you use this model in your research, please cite:
+
+ ```bibtex
+ @misc{ai-image-detector-deepsvdd-2024,
+   author = {ash12321},
+   title = {AI Image Detector using Deep SVDD},
+   year = {2024},
+   publisher = {Hugging Face},
+   howpublished = {\url{https://huggingface.co/ash12321/ai-image-detector-deepsvdd}},
+ }
+ ```
+
+ ## 📄 License
+
+ This model is released under the MIT License.
+
+ ## 🤝 Contributing
+
+ Contributions, issues, and feature requests are welcome!
+
+ ## 👤 Author
+
+ **ash12321**
+ - Hugging Face: [@ash12321](https://huggingface.co/ash12321)
+
+ ## 🙏 Acknowledgments
+
+ - CIFAR-10 dataset creators
+ - PyTorch Lightning team
+ - Deep SVDD paper authors
+ - Hugging Face for hosting infrastructure
+
+ ---
+
+ <div align="center">
+
+ **[Try it on Hugging Face Spaces](https://huggingface.co/spaces/ash12321/ai-image-detector-demo)**
+
+ </div>
inference.py ADDED
@@ -0,0 +1,55 @@
+ import torch
+ from huggingface_hub import hf_hub_download
+ from PIL import Image
+ import torchvision.transforms as transforms
+
+ # You'll need to include your model definition
+ # Copy the AdvancedDeepSVDD class and related code here
+ # or import from your training script
+
+ def load_model(repo_id="ash12321/ai-image-detector-deepsvdd"):
+     """Download and load model from HuggingFace"""
+
+     model_path = hf_hub_download(
+         repo_id=repo_id,
+         filename="model.ckpt"
+     )
+
+     # Load model (requires model definition)
+     from model import AdvancedDeepSVDD
+     model = AdvancedDeepSVDD.load_from_checkpoint(model_path)
+     model.eval()
+
+     return model
+
+ def predict_image(image_path, model):
+     """Predict if image is AI-generated"""
+
+     transform = transforms.Compose([
+         transforms.Resize((32, 32)),
+         transforms.ToTensor(),
+         transforms.Normalize(
+             mean=[0.4914, 0.4822, 0.4465],
+             std=[0.2470, 0.2435, 0.2616]
+         )
+     ])
+
+     image = Image.open(image_path).convert('RGB')
+     image_tensor = transform(image).unsqueeze(0)  # add batch dimension
+
+     with torch.no_grad():
+         is_fake, scores, distances = model.predict_anomaly(image_tensor)
+
+     # Note: 'confidence' and 'anomaly_score' both report the same score value
+     return {
+         'is_ai_generated': bool(is_fake[0].item()),
+         'confidence': float(scores[0].item()),
+         'anomaly_score': float(scores[0].item()),
+         'distance': float(distances[0].item())
+     }
+
+ # Example usage
+ if __name__ == "__main__":
+     model = load_model()
+     result = predict_image("test_image.jpg", model)
+     print(f"AI-Generated: {result['is_ai_generated']}")
+     print(f"Confidence: {result['confidence']*100:.1f}%")
model.ckpt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:613c522b1c13bb5621bf26b1e48e2b7c1f0fa8a4af5e013d8546b0cecaf2070f
+ size 64014947
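The three lines above are a Git LFS pointer: the repository stores this small text stub, while the actual checkpoint is fetched separately and identified by its SHA-256 digest and byte size. As a rough sketch, a downloaded `model.ckpt` could be checked against the pointer as follows; `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers, not part of this repository or of `huggingface_hub`.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest named in the pointer."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:613c522b1c13bb5621bf26b1e48e2b7c1f0fa8a4af5e013d8546b0cecaf2070f
size 64014947
"""

pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("model.ckpt", pointer)` on a local copy would then confirm that the download is complete and unmodified.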
requirements.txt ADDED
@@ -0,0 +1,6 @@
+ torch>=2.0.0
+ torchvision>=0.15.0
+ pytorch-lightning>=2.0.0
+ numpy>=1.24.0
+ Pillow>=9.5.0
+ huggingface-hub>=0.16.0