🍅 Tomato Disease Detector

Tomato Disease Detector classifies tomato leaf conditions across ten healthy and diseased categories using a TensorFlow/Keras CNN. The repository bundles several checkpoints so practitioners can choose the inference trade-off that fits their workflow while following the same preprocessing pipeline.

Model highlights

Architecture

  • Framework: TensorFlow 2.x with the Keras Sequential/Functional API.
  • Model type: Convolutional neural network tuned for 256x256 RGB inputs.
  • Output: Softmax over 10 classes, yielding top-1 predictions with confidence scores.
  • Inference latency: ~50 ms per image on an RTX 3060 Ti GPU; CPU throughput depends on how batching is tuned.
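To make the output bullet concrete, here is a minimal pure-Python sketch of how a softmax turns ten raw scores into a top-1 prediction with a confidence score. The helper names (`softmax`, `top1`) and the example logits are illustrative assumptions, not part of the repository.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top1(probs):
    """Return (index, probability) of the most likely class."""
    idx = max(range(len(probs)), key=probs.__getitem__)
    return idx, probs[idx]

# Hypothetical logits for the 10 classes; index 2 has the largest score.
example_logits = [0.1, 0.3, 2.5, 0.0, -0.4, 0.2, 0.1, -1.0, 0.5, 0.9]
example_probs = softmax(example_logits)
class_idx, confidence = top1(example_probs)
print(class_idx, round(confidence, 3))
```

The model's final layer performs this step internally, so inference code only needs the argmax and the matching probability.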

Classes detected

  1. Bacterial Spot
  2. Early Blight
  3. Late Blight
  4. Leaf Mold
  5. Septoria Leaf Spot
  6. Spider Mites (Two-spotted spider mite)
  7. Target Spot
  8. Tomato Yellow Leaf Curl Virus
  9. Tomato Mosaic Virus
  10. Healthy

Dataset and preprocessing

Source & split

  • Primary source: Tomato Leaf Disease Dataset (PlantVillage variant) with 1,500+ manually labeled images.
  • Split: Standard training, validation, and held-out test partitions. Augmented examples appear only in the training split, which preserves test integrity.
  • Class balance: Balanced per class through oversampling and color jitter augmentation on underrepresented diseases.
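The oversampling idea in the class-balance bullet can be sketched as follows. This is an illustration under assumptions, not the author's script: the `oversample` helper, the `(path, label)` sample format, and the toy file names are all hypothetical.

```python
import random
from collections import defaultdict

def oversample(samples, seed=0):
    """Duplicate minority-class samples until every class matches the largest one.

    `samples` is a list of (image_path, label) pairs.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append(path)

    target = max(len(paths) for paths in by_label.values())
    balanced = []
    for label, paths in by_label.items():
        balanced.extend((p, label) for p in paths)
        # Draw extra copies with replacement to reach the target count.
        extra = rng.choices(paths, k=target - len(paths))
        balanced.extend((p, label) for p in extra)
    rng.shuffle(balanced)
    return balanced

# Hypothetical file names, for illustration only.
toy = [("a1.jpg", "Healthy"), ("a2.jpg", "Healthy"), ("a3.jpg", "Healthy"),
       ("b1.jpg", "Leaf Mold")]
balanced = oversample(toy)
```

In a real pipeline the duplicated paths would pass through the color jitter augmentation so the copies are not pixel-identical, and the step would run on the training split only.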

Preprocessing & augmentation

  • Resize RGB inputs to 256x256 pixels to match the CNN's first layer expectations.
  • Normalize pixel ranges to [0,1] by dividing by 255.0.
  • Random augmentations (applied during training only) include:
    • horizontal and vertical flips
    • brightness/contrast jitter
    • small rotations and zooms
  • Validation and test data are center-cropped and normalized without stochastic augmentation for deterministic evaluation.
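The split between stochastic training augmentation and deterministic evaluation preprocessing can be sketched with NumPy alone. The flip probabilities and the brightness range below are assumed values, and rotations and zooms are left out for brevity; the repository's actual augmentation code may differ.

```python
import numpy as np

def normalize(image_uint8):
    """Scale uint8 pixels to float32 values in [0, 1]."""
    return image_uint8.astype(np.float32) / 255.0

def augment(image, rng):
    """Apply random flips and brightness jitter to a normalized HxWx3 image."""
    if rng.random() < 0.5:
        image = image[:, ::-1, :]          # horizontal flip
    if rng.random() < 0.5:
        image = image[::-1, :, :]          # vertical flip
    factor = float(rng.uniform(0.8, 1.2))  # assumed brightness range
    return np.clip(image * factor, 0.0, 1.0)

rng = np.random.default_rng(seed=0)
# Random stand-in for a resized 256x256 RGB leaf image.
fake_leaf = rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8)

train_input = augment(normalize(fake_leaf), rng)  # training path: stochastic
eval_input = normalize(fake_leaf)                 # validation/test path: deterministic
```

Keeping `normalize` as a shared function is one way to guarantee that training and inference scale pixels identically.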

Training walkthrough

Training was run on a workstation with an RTX 3060 Ti, 20-core CPU, and 15.5 GB RAM.

Configuration snapshot

  • Optimizer: Adam with default beta values (0.9, 0.999).
  • Loss function: Categorical crossentropy on the 10-class softmax output.
  • Batch size: 32 (some checkpoints trained with batches of 16 or 64 to compare stability).
  • Epoch range: 109 training runs spanning 109 epochs depending on the checkpoint.
  • Learning rate schedule: Manual decay after plateauing validation accuracy (initial lr = 1e-3).
  • Regularization: Dropout (0.2–0.4) and label smoothing (0.05) in later experiments.
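As a worked example of the label smoothing setting, the sketch below applies a smoothing factor of 0.05 to a 10-class one-hot target using the standard formulation (the same one Keras applies when `label_smoothing` is passed to its categorical crossentropy loss). Whether the author used that exact argument is an assumption, and `smooth_labels` is a hypothetical helper.

```python
def smooth_labels(one_hot, epsilon=0.05):
    """Blend a one-hot target with a uniform distribution over the classes."""
    num_classes = len(one_hot)
    return [y * (1.0 - epsilon) + epsilon / num_classes for y in one_hot]

# One-hot target for class index 9 ("Healthy" in the class list).
target = [0.0] * 9 + [1.0]
smoothed = smooth_labels(target)
print([round(v, 3) for v in smoothed])
```

With ten classes, the true class target drops from 1.0 to 0.955 and each of the other nine classes receives 0.005, which discourages overconfident softmax outputs.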

Logging

Training logs capture per-epoch accuracy, loss, and confusion matrices. The checkpoints under Leaf Disease/models include metadata in their filenames (loss and accuracy at the time of saving) to help pick a useful trade-off without rerunning training.
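Because loss and accuracy are encoded in each filename, a checkpoint can be chosen programmatically. The sketch below parses the four filenames documented in this card; the regular expression and the `parse_checkpoint_name` helper are assumptions inferred from that naming pattern.

```python
import re

# Matches names such as "tomato_disease_detector_loss-0.2826_acc-90.00.keras".
CHECKPOINT_PATTERN = re.compile(
    r"loss-(?P<loss>\d+\.\d+)_acc-(?P<acc>\d+\.\d+)\.keras$"
)

def parse_checkpoint_name(filename):
    """Extract loss and accuracy (in percent) from a checkpoint filename."""
    match = CHECKPOINT_PATTERN.search(filename)
    if match is None:
        raise ValueError(f"Unrecognized checkpoint name: {filename}")
    return {
        "file": filename,
        "loss": float(match["loss"]),
        "acc": float(match["acc"]),
    }

filenames = [
    "tomato_disease_detector_loss-0.2826_acc-90.00.keras",
    "tomato_disease_detector_loss-0.2271_acc-63.73.keras",
    "tomato_disease_detector_loss-0.4764_acc-83.93.keras",
    "tomato_disease_detector_loss-0.8962_acc-80.13.keras",
]
parsed = [parse_checkpoint_name(name) for name in filenames]
best_by_accuracy = max(parsed, key=lambda m: m["acc"])
lowest_loss = min(parsed, key=lambda m: m["loss"])
print(best_by_accuracy["file"])
```

Note that the lowest-loss checkpoint and the highest-accuracy checkpoint are different files, so the selection key matters.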

Evaluation

| Metric | Best reported value | Notes |
| --- | --- | --- |
| Accuracy | 90.00% | Test split, tomato_disease_detector_loss-0.2826_acc-90.00.keras |
| Loss | 0.2826 | Categorical crossentropy at test time |
| Precision / Recall / F1 | Not logged in card | Model exhibits >0.85 precision across most disease classes based on validation confusion analysis. |

  • Inference stability: Confidence histograms show the top class receives >0.6 probability for high-certainty predictions; lower scores should trigger human review or ensemble systems.
  • Generalization: Because the data originates from controlled imagery, users should fine-tune on their own field data before deploying in different lighting/soil conditions.

Model file options

Choose the checkpoint that best fits your scenario:

| File | Loss | Accuracy | Best use case |
| --- | --- | --- | --- |
| tomato_disease_detector_loss-0.2826_acc-90.00.keras | 0.2826 | 90.00% | Recommended production-ready trade-off between accuracy and loss. |
| tomato_disease_detector_loss-0.2271_acc-63.73.keras | 0.2271 | 63.73% | Lowest final loss, useful for experimenting with calibration. |
| tomato_disease_detector_loss-0.4764_acc-83.93.keras | 0.4764 | 83.93% | Alternative architecture checkpoint with faster convergence. |
| tomato_disease_detector_loss-0.8962_acc-80.13.keras | 0.8962 | 80.13% | Baseline comparison to show overfitting mitigation impact. |

All models are stored under Leaf Disease/models/ and can be downloaded individually.

Quickstart inference

Dependencies

Install the runtime dependencies:

pip install tensorflow==2.15.0 numpy pillow

Loading the best checkpoint

from tensorflow.keras.models import load_model

model = load_model('Leaf Disease/models/tomato_disease_detector_loss-0.2826_acc-90.00.keras')
model.summary()

Predict a single image

import numpy as np
from PIL import Image

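def predict_disease(image_path: str, model):
    """Classify one leaf image and return the top-1 class with its confidence."""
    # Match the training pipeline: RGB, 256x256, pixel values scaled to [0, 1].
    img = Image.open(image_path).convert('RGB')
    img = img.resize((256, 256))
    img_array = np.expand_dims(np.asarray(img, dtype=np.float32) / 255.0, axis=0)

    # The model returns one softmax vector per image; take the first (only) row.
    predictions = model.predict(img_array, verbose=0)[0]
    class_idx = int(np.argmax(predictions))
    confidence = float(predictions[class_idx])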

    class_names = [
        'Bacterial Spot',
        'Early Blight',
        'Late Blight',
        'Leaf Mold',
        'Septoria Leaf Spot',
        'Spider Mites',
        'Target Spot',
        'Tomato Yellow Leaf Curl Virus',
        'Tomato Mosaic Virus',
        'Healthy'
    ]

    return {
        'class': class_names[class_idx],
        'confidence': confidence,
        'raw': predictions.tolist()
    }

result = predict_disease('tomato_leaf.jpg', model)
print(f"Predicted {result['class']} with {result['confidence']:.2%} confidence")

Batch prediction helper

from pathlib import Path

def batch_predict(folder: str, model):
    image_paths = list(Path(folder).glob('*.jpg')) + list(Path(folder).glob('*.png'))
    return [
        {**predict_disease(str(path), model), 'file': path.name}
        for path in image_paths
    ]

batch_results = batch_predict('test_images', model)
for res in batch_results:
    print(res['file'], res['class'], res['confidence'])

Tips

  • Always preprocess new images with the same resize and normalization steps.
  • Use the 90% accuracy checkpoint for production; keep others for experimentation or transfer learning.
  • If confidence is below 0.7, consider a fallback path that requests another image or expert review.
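The fallback path from the last tip can be sketched as a small routing function. The 0.7 threshold comes from the tip above; the `route_prediction` helper, the `action` field, and the sample inputs are hypothetical and only mirror the dictionary shape returned by `predict_disease`.

```python
def route_prediction(result, threshold=0.7):
    """Tag a prediction dict as accepted or needing review based on confidence."""
    needs_review = result["confidence"] < threshold
    return {**result, "action": "review" if needs_review else "accept"}

# Hypothetical outputs shaped like those returned by predict_disease().
confident = route_prediction({"class": "Late Blight", "confidence": 0.93})
uncertain = route_prediction({"class": "Target Spot", "confidence": 0.55})
print(confident["action"], uncertain["action"])
```

The threshold is worth tuning on validation data from the target environment, since softmax confidence is not guaranteed to be well calibrated.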

Deployment notes

  • Re-save the .keras file with tf.keras.models.save_model(..., save_format='tf') if you need a TensorFlow SavedModel directory. This changes the on-disk format; it does not compress the model.
  • Convert to TensorFlow Lite or ONNX for deployment on resource-constrained hardware, keeping the input pipeline identical.
  • Wrap predictions into a REST or gRPC endpoint with input validation (e.g., confirm 256x256 RGB before inference).
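For the input validation mentioned in the last bullet, a service could check the array shape before calling the model. The sketch below is framework-agnostic and works on any shape tuple; `validate_input_shape` and its acceptance of a single-image batch are assumptions, not part of the repository.

```python
EXPECTED_SHAPE = (256, 256, 3)  # height, width, RGB channels

def validate_input_shape(shape):
    """Raise ValueError unless `shape` describes one 256x256 RGB image."""
    shape = tuple(shape)
    if len(shape) == 4 and shape[0] == 1:
        shape = shape[1:]  # accept a batch holding a single image
    if shape != EXPECTED_SHAPE:
        raise ValueError(f"Expected {EXPECTED_SHAPE}, got {shape}")
    return shape

print(validate_input_shape((1, 256, 256, 3)))
```

Rejecting malformed inputs at the endpoint boundary gives callers a clear error instead of an opaque shape mismatch raised inside the model.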

Troubleshooting

  1. TensorFlow compatibility: Pin TensorFlow to 2.15.0, the version in the install command above. If you move to a later release, retest model loading, and reinstall if loader errors mention missing ops.
  2. Image decode errors: Force Image.open(...).convert('RGB') before preprocessing.
  3. Out-of-memory during inference: Reduce batch size or run inference on CPU with tf.device('/CPU:0').
  4. Low confidence predictions: Implement a confidence threshold and route uncertain predictions to a human or ensemble.
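For the out-of-memory case in item 3, one option is to process image paths in small chunks so only a few images are held in memory at a time. The `chunked` helper and the file names below are hypothetical; each chunk would then be passed to whatever prediction routine is in use.

```python
def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Hypothetical file names, for illustration only.
paths = [f"leaf_{i}.jpg" for i in range(10)]
batches = list(chunked(paths, 4))
print([len(batch) for batch in batches])
```

Lowering `size` trades throughput for a smaller peak memory footprint.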

Mermaid workflow

flowchart LR
  RawImages[Raw tomato leaf images] --> Preprocess[Preprocessing and augmentation]
  Preprocess --> ModelTraining["Training (multiple checkpoints)"]
  ModelTraining --> Checkpoints[Leaf Disease/models directory]
  Checkpoints --> Inference[Load checkpoint and standardize input]
  Inference --> Output[Prediction + confidence]
  Output --> Feedback[Optional human-in-loop verification]

Contact & acknowledgments

  • Creator: Gareth Aurelius Harrison (GitHub @theonegareth, Hugging Face @theonegareth).
  • Acknowledgments: TensorFlow/Keras, PlantVillage dataset curators, the ML and agriculture research communities.
  • Contribution guide: Fork, extend the dataset, retrain, then submit a PR documenting improvements.

Last Updated: November 30, 2025
Model Version: 1.0
Hugging Face Model: theonegareth/TomatoDiseaseDetector
