---
title: AIDetector
emoji: 📉
colorFrom: purple
colorTo: pink
sdk: gradio
sdk_version: 5.45.0
app_file: app.py
pinned: false
license: mit
---

# Advanced AI Text Detector 🔍

An AI text detection system that identifies AI-generated content, particularly text from ChatGPT and similar language models.

## Features

### 🤖 Dual Detection Methods

- **Transformer-based Detection**: Uses a fine-tuned RoBERTa model trained specifically to detect ChatGPT output
- **Statistical Analysis**: Applies multiple linguistic metrics for more robust detection

### 📊 Comprehensive Analysis Metrics

- **Burstiness Analysis**: Measures sentence length variation (human text is typically more "bursty")
- **Vocabulary Diversity**: Analyzes lexical richness and word variety
- **Repetition Detection**: Identifies repeated phrases and patterns
- **Perplexity Scoring**: Evaluates text predictability
- **Punctuation Patterns**: Analyzes punctuation consistency

### 🎯 High Accuracy Features

- Multi-method ensemble approach for improved accuracy
- Confidence scoring system
- Detailed explanations for each detection
- Visual probability distribution

## How It Works

1. **Input Processing**: The text is tokenized and prepared for analysis
2. **Transformer Analysis**: If the RoBERTa model is available, it provides an initial AI probability
3. **Statistical Analysis**: Multiple linguistic features are extracted and analyzed
4. **Score Combination**: Results are weighted and combined into a final prediction
5. **Result Generation**: A detailed report is produced with classification, confidence, and explanations

## Detection Categories

- **AI-Generated**: >80% AI probability (High confidence)
- **Likely AI-Generated**: 60-80% AI probability (Medium confidence)
- **Uncertain**: 40-60% AI probability (Low confidence)
- **Likely Human-Written**: 20-40% AI probability (Medium confidence)
- **Human-Written**: <20% AI probability (High confidence)

## Usage Tips

- Provide at least 100 words for optimal accuracy
- Longer texts generally yield more reliable results
- The detector works best with English text
- Results are probabilistic; use them as guidance, not absolute truth

## Technical Stack

- **Gradio**: Interactive web interface
- **Transformers**: Hugging Face transformer models
- **PyTorch**: Deep learning backend
- **SciPy/NumPy**: Statistical analysis

## Limitations

- Best performance with English text
- Requires sufficient text length (minimum 50 characters, optimal 100+ words)
- Detection accuracy may vary with highly technical or specialized content
- Should be used as a tool for guidance, not definitive judgment

## Deployment

This app is designed to run on Hugging Face Spaces. Upload the files to your Space and it will deploy automatically.

## Model Credit

This detector uses the `Hello-SimpleAI/chatgpt-detector-roberta` model from Hugging Face, combined with custom statistical analysis methods.

---

**Note**: AI detection is a rapidly evolving field. No detector is 100% accurate, and results should be interpreted with appropriate context and judgment.
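
To make the statistical metrics, the score combination step, and the detection categories more concrete, here is a minimal sketch in plain Python. It is not the code in `app.py`: the function names, the use of the coefficient of variation for burstiness, the type-token ratio for vocabulary diversity, and the `transformer_weight=0.7` default are all assumptions chosen for illustration. Only the category thresholds come from the Detection Categories list, and how exact boundary values (for example, exactly 60%) are bucketed is also an assumption.

```python
import re
import statistics
from typing import Optional, Tuple


def burstiness(text: str) -> float:
    """Sentence length variation, here as the coefficient of variation
    (population std / mean) of sentence lengths in words. Assumed metric."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0  # not enough sentences to measure variation
    mean = statistics.mean(lengths)
    return statistics.pstdev(lengths) / mean if mean else 0.0


def vocabulary_diversity(text: str) -> float:
    """Type-token ratio: unique words divided by total words. Assumed metric."""
    words = re.findall(r"[a-z']+", text.lower())
    return len(set(words)) / len(words) if words else 0.0


def combine_scores(
    transformer_prob: Optional[float],
    statistical_prob: float,
    transformer_weight: float = 0.7,  # hypothetical weight, not from the app
) -> float:
    """Weighted average of the two AI probabilities. Falls back to the
    statistical score when the transformer model is unavailable."""
    if transformer_prob is None:
        return statistical_prob
    return (
        transformer_weight * transformer_prob
        + (1.0 - transformer_weight) * statistical_prob
    )


def classify(ai_prob: float) -> Tuple[str, str]:
    """Map an AI probability in [0, 1] to (category, confidence),
    using the thresholds from the Detection Categories list."""
    if ai_prob > 0.8:
        return ("AI-Generated", "High")
    if ai_prob >= 0.6:
        return ("Likely AI-Generated", "Medium")
    if ai_prob >= 0.4:
        return ("Uncertain", "Low")
    if ai_prob >= 0.2:
        return ("Likely Human-Written", "Medium")
    return ("Human-Written", "High")


if __name__ == "__main__":
    sample = "Short one. This sentence is quite a bit longer than the first. Tiny."
    print("burstiness:", round(burstiness(sample), 3))
    print("vocabulary diversity:", round(vocabulary_diversity(sample), 3))
    final = combine_scores(transformer_prob=0.9, statistical_prob=0.5)
    print("final:", round(final, 2), classify(final))
```

With the assumed weight, a transformer probability of 0.9 and a statistical probability of 0.5 combine to 0.78, which falls in the Likely AI-Generated band. A fixed weighted average is only one way to build an ensemble; the actual app may weight or calibrate the two signals differently.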