---
license: mit
datasets:
- Parveshiiii/AI-vs-Real
base_model:
- microsoft/swinv2-tiny-patch4-window16-256
pipeline_tag: image-classification
library_name: transformers
tags:
- safety
- XenArcAI
- SoTA
---
# XenArcAI
<p align="center">
<img
src="https://cdn-uploads.huggingface.co/production/uploads/677fcdf29b9a9863eba3f29f/m7ddTjuxxLUntdXVk0t5N.png"
alt="AIRealNet Banner"
width="90%"
style="border-radius:15px;"
/>
</p>
---
- [GitHub Repository](https://github.com/XenArcAI/AIRealNet)
- [Live Demo](https://huggingface.co/spaces/Parveshiiii/AIRealNet)
## Overview
In an era of rapidly advancing AI-generated imagery, deepfakes, and synthetic media, the need for reliable detection tools has never been greater. **AIRealNet** is a binary image classifier designed to distinguish **AI-generated images** from **real human photographs**. The model is optimized to detect conventional AI-generated content while adhering to strict privacy standards, avoiding personal or sensitive images.
* **Class 0:** AI-generated image
* **Class 1:** Real human image
By leveraging the robust **SwinV2 Tiny** architecture as its backbone, AIRealNet achieves a high degree of accuracy while remaining lightweight enough for practical deployment.
---
## Key Features
1. **High Accuracy on Public Datasets:**
Despite being fine-tuned on a **14k-image split** (part of the main fine-tuning dataset), AIRealNet demonstrates strong accuracy and robustness in detecting AI-generated images.
2. **Near-Balanced Training Split:**
The dataset contains comparable numbers of AI-generated and real images, limiting class-imbalance issues during training:
* **AI-generated:** 60%
* **Real human:** 40%
3. **Ethical Design:**
No personal photos were included, even if edited or AI-modified, respecting privacy and ethical AI principles.
4. **Fast and Scalable:**
Based on a transformer vision model, AIRealNet can be deployed efficiently in both research and production environments.
---
## Training Data
* **Dataset:** `Parveshiiii/AI-vs-Real` (an open-sourced subset of the main dataset)
* **Size:** 14k images (a near-even mix of AI-generated and real)
* **Split:** The train split was used for fine-tuning; validation was performed on a separate balanced subset.
* **Notes:** Images were sourced from public datasets and AI generation tools. Edited personal photos were intentionally excluded. A loading sketch follows below.
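To experiment with the same data, the subset can be loaded directly from the Hub. A minimal sketch, assuming the dataset exposes a standard `train` split (the split name and feature layout may differ):
```python
from datasets import load_dataset

# Load the open-sourced subset; "train" is an assumed split name
# based on standard Hugging Face conventions.
ds = load_dataset("Parveshiiii/AI-vs-Real", split="train")

print(ds)       # summary: features and number of rows
print(ds[0])    # first example, typically an image plus its label
```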
---
## Limitations
While AIRealNet performs exceptionally well on typical AI-generated images, users should note:
1. **Subtle Edits:** The model struggles with very subtle, precisely localized modifications, such as “Nano Banana”-style edits.
2. **AI-Edited Personal Images:** Images of real people that have been AI-modified are **not detected**; this is by design, in line with privacy and ethical guidelines.
3. **Domain Generalization:** Performance may vary on images from completely unseen AI generators or extremely unconventional content.
---
## Performance Metrics
> Metrics shown are from **Epoch 2**, chosen to illustrate stable performance after fine-tuning.
<p align="center">
<img
src="https://cdn-uploads.huggingface.co/production/uploads/677fcdf29b9a9863eba3f29f/3NVa0KLX0iAxTP2e6IlGH.png"
alt="AIRealNet Banner"
width="90%"
style="border-radius:15px;"
/>
</p>
**Note:** The extremely low loss and high accuracy reflect the controlled dataset environment; real-world performance may be lower depending on the image domain. In our testing, the model remains highly accurate on fully generated images, including edited versions of them, but it cannot detect “Nano Banana”-style edits of real photographs.
---
## Demo and Usage
1. **Installing dependencies**
```bash
pip install -U transformers
```
2. **Loading and running a demo**
```python
from transformers import pipeline

# Load the classifier from the Hugging Face Hub
pipe = pipeline("image-classification", model="XenArcAI/AIRealNet")

# Run inference on an example image (a URL or a local path works)
pipe("https://cdn-uploads.huggingface.co/production/uploads/677fcdf29b9a9863eba3f29f/eVkKUTdiInUl6pbIUghQC.png")
```
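The pipeline also accepts local file paths or `PIL` images. A short follow-up, reusing the `pipe` object from the snippet above; `photo.jpg` is a hypothetical local file:
```python
from PIL import Image

# Classify a local image instead of a URL
image = Image.open("photo.jpg")  # hypothetical path
print(pipe(image))
```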
## Demo
* **Given image:**
<p align="center">
<img
src="https://cdn-uploads.huggingface.co/production/uploads/677fcdf29b9a9863eba3f29f/eVkKUTdiInUl6pbIUghQC.png"
alt="AIRealNet Banner"
width="90%"
style="border-radius:15px;"
/>
</p>
* **Model Output**
```bash
[{'label': 'artificial', 'score': 0.9865425825119019},
{'label': 'real', 'score': 0.013457471504807472}]
```
**Note:** This prediction is correct, as the image was generated by a diffusion model.
---
## Intended Use
* Detect AI-generated imagery on social media, research publications, and digital media platforms.
* Assist content moderators, researchers, and fact-checkers in identifying synthetic media.
* **Not intended** for legal verification without human corroboration.
---
## Ethical Considerations
* **Privacy-first Approach:** Personal photos, even if AI-edited, were excluded.
* **Responsible Deployment:** Users should combine model predictions with human review to avoid acting on false positives or negatives; a sketch of one such workflow follows this list.
* **Transparency:** The model card openly communicates its limitations and dataset design to prevent misuse.
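As a concrete illustration of the human-in-the-loop recommendation above, here is a minimal triage sketch. The `0.9` threshold and the `photo.jpg` path are illustrative assumptions, not tuned or shipped values:
```python
from transformers import pipeline

pipe = pipeline("image-classification", model="XenArcAI/AIRealNet")

def triage(predictions, threshold=0.9):
    """Auto-label only high-confidence predictions; defer the rest to a human."""
    top = max(predictions, key=lambda p: p["score"])
    if top["score"] >= threshold:
        return f"auto-label: {top['label']} ({top['score']:.3f})"
    return "route to human review"

# "photo.jpg" is a hypothetical local image path
print(triage(pipe("photo.jpg")))
```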
---
## How It Works
1. Images are preprocessed and resized to `256x256`.
2. Features are extracted using the **SwinV2 Tiny** vision transformer backbone.
3. A binary classification head outputs probabilities for AI-generated vs real human images.
4. Predictions are interpreted as class 0 (AI-generated) or class 1 (real human); see the sketch below.
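A minimal lower-level sketch of these steps without the `pipeline` helper, assuming the checkpoint ships its own image processor (handling the `256x256` resize) and a two-class head; `photo.jpg` is a hypothetical local path:
```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

processor = AutoImageProcessor.from_pretrained("XenArcAI/AIRealNet")
model = AutoModelForImageClassification.from_pretrained("XenArcAI/AIRealNet")

# Preprocess: resize and normalize the image as the backbone expects
image = Image.open("photo.jpg").convert("RGB")  # hypothetical path
inputs = processor(images=image, return_tensors="pt")

# Forward pass through the SwinV2 backbone and classification head
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, 2)

# Convert logits to class probabilities and read off the prediction
probs = logits.softmax(dim=-1)[0]
idx = int(probs.argmax())
print(model.config.id2label[idx], float(probs[idx]))
```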
---
## Future Work
Future iterations aim to:
* Improve detection of subtle AI-generated edits and “nano banana” modifications.
* Expand training data with diverse AI generators to enhance generalization.
* Explore multi-modal detection capabilities (e.g., video, metadata, and image combined).
---
## Citation
```bibtex
@misc{xenarcai_airealnet_2025,
  title={AIRealNet: A Binary Classifier for Distinguishing AI-Generated Images from Real Photographs},
  author={Parvesh Rawal},
  publisher={Hugging Face},
  year={2025},
  url={https://huggingface.co/XenArcAI/AIRealNet}
}
```
## References
* Microsoft SwinV2 Tiny: [https://github.com/microsoft/Swin-Transformer](https://github.com/microsoft/Swin-Transformer)
* `Parveshiiii/AI-vs-Real` dataset (subset): [https://huggingface.co/datasets/Parveshiiii/AI-vs-Real](https://huggingface.co/datasets/Parveshiiii/AI-vs-Real), open-sourced by a team member
---