
Evaluation Summary for IndoHoaxDetector Space

Metrics Overview

  • Model Architecture: Logistic Regression trained on Indonesian news labeled as HOAX vs FAKTA.
  • Vectorizer: TF-IDF features produced by the fitted vectorizer stored in tfidf_vectorizer.pkl, applied after Indonesian-specific preprocessing.
  • Accuracy: ~97.83% on the held-out validation split used during training (metadata stored in model_metadata.txt).
  • Precision & Recall: Balanced across the two style-based labels. Precision is the share of texts flagged as HOAX that actually carry the HOAX label; recall is the share of HOAX-labeled texts that the model flags.
  • Confidence Scores: The Gradio app exposes probability values for both labels. Use the HOAX probability as a stylistic warning, not a verdict.
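To make the metric definitions above concrete, here is a minimal sketch that computes accuracy, precision, and recall from two lists of labels, treating HOAX as the flagged class. The function name `binary_metrics` and the toy labels are illustrative assumptions; they are not taken from app.py or from the real evaluation data.

```python
def binary_metrics(y_true, y_pred, positive="HOAX"):
    """Accuracy, precision, and recall, treating `positive` as the flagged class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    # True positives: labeled HOAX and flagged HOAX.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: flagged HOAX but labeled otherwise.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: labeled HOAX but not flagged.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs) if pairs else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy labels for illustration only; these are not results from the real model.
y_true = ["HOAX", "HOAX", "FAKTA", "FAKTA", "HOAX"]
y_pred = ["HOAX", "FAKTA", "FAKTA", "HOAX", "HOAX"]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

With these toy lists there are two true positives, one false positive, and one false negative, so precision and recall both come out to 2/3 while accuracy is 3/5.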

Testing Guidelines

  1. Prepare a set of Indonesian news snippets (title + body) with known labels.
  2. Run the preprocessing steps defined in app.py (lowercasing, URL stripping, non-letter removal, stop word removal, Sastrawi stemming, TF-IDF transform).
  3. Use the loaded model to infer probabilities via predict_news.
  4. Compare predictions with the known labels and compute metrics with an evaluation script (e.g., the repository-level evaluate.py, copied into this folder).
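The steps above can be sketched roughly as follows. This is a standard-library approximation, not the code in app.py: the stop word list, the `preprocess` and `evaluate` helpers, and the `toy_predict` stand-in are all hypothetical. In a real run, the stemmer would be Sastrawi's and the prediction would come from predict_news, whose exact signature and whether it preprocesses its own input are not documented here.

```python
import re

# Placeholder stop word list (assumption); the list used by app.py is not shown here.
STOPWORDS = {"yang", "dan", "di", "ke", "dari", "ini", "itu"}


def preprocess(text, stem=lambda token: token):
    """Lowercase, strip URLs, drop non-letters, remove stop words, then stem.

    `stem` defaults to a no-op; a Sastrawi stemmer's `stem` method could be
    passed in instead if that package is installed.
    """
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # URL stripping
    text = re.sub(r"[^a-z\s]", " ", text)               # non-letter removal
    tokens = [t for t in text.split() if t not in STOPWORDS]
    return " ".join(stem(t) for t in tokens)


def evaluate(examples, predict_label):
    """Return (accuracy, misclassified examples) for (text, label) pairs."""
    errors = []
    for text, label in examples:
        predicted = predict_label(preprocess(text))
        if predicted != label:
            errors.append((text, label, predicted))
    accuracy = 1 - len(errors) / len(examples) if examples else 0.0
    return accuracy, errors


def toy_predict(clean_text):
    """Stand-in classifier for this sketch only; it is not the trained model."""
    return "HOAX" if "viral" in clean_text.split() else "FAKTA"


# Made-up snippets with known labels, for illustration only.
examples = [
    ("VIRAL!!! Sebarkan berita ini https://contoh.example/abc", "HOAX"),
    ("Pemerintah mengumumkan anggaran baru di Jakarta", "FAKTA"),
]
accuracy, errors = evaluate(examples, toy_predict)
print(accuracy, errors)
```

Keeping the misclassified examples alongside the accuracy makes it easier to see which writing styles the model handles poorly, which a single number hides.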

Reporting

  • Document any changes to the dataset or vectorizer.
  • If you retrain the model, update this file with the new accuracy, precision, recall, and dataset description to keep the Space trustworthy.

Caveats

  • Metrics refer only to stylistic label consistency, not factual verification.
  • The evaluation set may not include every possible writing style; monitor drift over time and retrain as needed.
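One inexpensive drift signal, offered here as a sketch rather than as something the Space already does, is the share of tokens in recent inputs that the vectorizer never saw during training. The helper `oov_rate` and the sample vocabulary are hypothetical. If tfidf_vectorizer.pkl holds a scikit-learn TfidfVectorizer (an assumption), the keys of its `vocabulary_` attribute could serve as the vocabulary set.

```python
def oov_rate(texts, vocabulary):
    """Fraction of whitespace-separated tokens in `texts` absent from `vocabulary`."""
    tokens = [tok for text in texts for tok in text.split()]
    if not tokens:
        return 0.0
    unseen = sum(1 for tok in tokens if tok not in vocabulary)
    return unseen / len(tokens)


# Toy vocabulary and already-preprocessed inputs, for illustration only.
vocabulary = {"viral", "sebar", "berita", "perintah", "anggaran"}
recent = ["viral sebar berita", "berita kripto anggaran"]
rate = oov_rate(recent, vocabulary)
print(f"out-of-vocabulary rate: {rate:.2%}")
```

A rate that climbs across successive batches suggests the incoming text is moving away from the training data. It does not measure accuracy, so it complements a labeled re-evaluation and does not replace one.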