Tomato Disease Detector
Tomato Disease Detector classifies tomato leaf conditions across ten healthy and diseased categories using a TensorFlow/Keras CNN. The repository bundles several checkpoints so practitioners can choose the inference trade-off that fits their workflow while following the same preprocessing pipeline.
Table of contents
- Model highlights
- Dataset and preprocessing
- Training walkthrough
- Evaluation
- Model file options
- Quickstart inference
- Deployment notes
- Troubleshooting
- Mermaid workflow
Model highlights
Architecture
- Framework: TensorFlow 2.x with the Keras Sequential/Functional API.
- Model type: Convolutional neural network tuned for 256x256 RGB inputs.
- Output: Softmax over 10 classes, yielding top-1 predictions with confidence scores.
- Inference latency: ~50 ms per image on an RTX 3060 Ti GPU; CPU throughput improves when batching is tuned.
Classes detected
- Bacterial Spot
- Early Blight
- Late Blight
- Leaf Mold
- Septoria Leaf Spot
- Spider Mites (Two-spotted spider mite)
- Target Spot
- Tomato Yellow Leaf Curl Virus
- Tomato Mosaic Virus
- Healthy
Dataset and preprocessing
Source & split
- Primary source: Tomato Leaf Disease Dataset (PlantVillage variant) with 1,500+ manually labeled images.
- Split: Standard training, validation, and held-out test partitions. Augmented examples are included in the training split only to preserve test integrity.
- Class balance: Balanced per class through oversampling and color jitter augmentation on underrepresented diseases.
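The oversampling mentioned above can be pictured with a small sketch. This is an illustration under assumptions, not the author's pipeline: `oversample_to_balance` and the `(image_path, label)` sample format are hypothetical, and the color jitter would be applied separately at training time.

```python
import random
from collections import defaultdict


def oversample_to_balance(samples, seed=0):
    """Duplicate minority-class samples until every class matches the largest one.

    `samples` is a list of (image_path, label) pairs (hypothetical format).
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append((path, label))

    target = max(len(items) for items in by_label.values())
    balanced = []
    for label, items in by_label.items():
        balanced.extend(items)
        # Draw extra samples with replacement to reach the target count.
        balanced.extend(rng.choices(items, k=target - len(items)))
    rng.shuffle(balanced)
    return balanced


# Made-up file names, for illustration only.
train = [("a.jpg", "Healthy")] * 5 + [("b.jpg", "Leaf Mold")] * 2
balanced = oversample_to_balance(train)
```

Oversampling like this belongs on the training split only, which is consistent with keeping augmented examples out of the test partition.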
Preprocessing & augmentation
- Resize RGB inputs to 256x256 pixels to match the CNN's first layer expectations.
- Normalize pixel ranges to [0,1] by dividing by 255.0.
- Random augmentations (applied during training only) include:
- horizontal and vertical flips
- brightness/contrast jitter
- small rotations and zooms
- Validation and test data are center-cropped and normalized without stochastic augmentation for deterministic evaluation.
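The deterministic evaluation path (center crop, then normalization) can be sketched as follows. The helper names `center_crop_box` and `scale_pixel` are hypothetical, and cropping to the largest centered square before resizing is an assumption; the card does not state the exact crop size.

```python
def center_crop_box(width, height):
    """Return the (left, upper, right, lower) box of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    upper = (height - side) // 2
    return (left, upper, left + side, upper + side)


def scale_pixel(value):
    """Map an 8-bit channel value (0-255) to the [0, 1] range."""
    return value / 255.0


# A 640x480 image keeps its middle 480x480 region.
box = center_crop_box(640, 480)
```

With Pillow, the box can be passed straight to `img.crop(center_crop_box(*img.size))` before calling `img.resize((256, 256))`.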
Training walkthrough
Training was run on a workstation with an RTX 3060 Ti, 20-core CPU, and 15.5 GB RAM.
Configuration snapshot
- Optimizer: Adam with default beta values (0.9, 0.999).
- Loss function: Categorical crossentropy on the 10-class softmax output.
- Batch size: 32 (some checkpoints trained with batches of 16 or 64 to compare stability).
- Epoch range: 109 training runs spanning 109 epochs depending on the checkpoint.
- Learning rate schedule: Manual decay after plateauing validation accuracy (initial lr = 1e-3).
- Regularization: Dropout (0.2 to 0.4) and label smoothing (0.05) in later experiments.
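To make the label smoothing value concrete, the sketch below applies uniform smoothing with a factor of 0.05 to a 10-class one-hot target. The function `smooth_labels` is a hypothetical helper written for illustration; in Keras the same effect is normally obtained through the `label_smoothing` argument of `tf.keras.losses.CategoricalCrossentropy`.

```python
def smooth_labels(one_hot, smoothing=0.05):
    """Apply uniform label smoothing: y * (1 - eps) + eps / num_classes."""
    num_classes = len(one_hot)
    return [y * (1.0 - smoothing) + smoothing / num_classes for y in one_hot]


# One-hot target for class index 2 out of the 10 classes.
target = [0.0] * 10
target[2] = 1.0
smoothed = smooth_labels(target)
# The true class drops to about 0.955 and each other class rises to about 0.005.
```

The smoothed vector still sums to 1, so it remains a valid target for categorical crossentropy.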
Logging
Training logs capture per-epoch accuracy, loss, and confusion matrices. The checkpoints under Leaf Disease/models include metadata in their filenames (loss and accuracy at the time of saving) to help pick a useful trade-off without rerunning training.
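Because the filenames carry loss and accuracy, checkpoints can be compared without loading them. The sketch below parses the names listed in this card; the regular expression is inferred from those four filenames and `parse_checkpoint_name` is a hypothetical helper, so other naming schemes would need a different pattern.

```python
import re

# Pattern inferred from the filenames listed in this card.
CHECKPOINT_RE = re.compile(r"loss-(?P<loss>\d+\.\d+)_acc-(?P<acc>\d+\.\d+)\.keras$")


def parse_checkpoint_name(filename):
    """Extract loss and accuracy (percent) from a checkpoint filename."""
    match = CHECKPOINT_RE.search(filename)
    if match is None:
        raise ValueError(f"Unrecognized checkpoint name: {filename}")
    return {
        "file": filename,
        "loss": float(match["loss"]),
        "accuracy": float(match["acc"]),
    }


names = [
    "tomato_disease_detector_loss-0.2826_acc-90.00.keras",
    "tomato_disease_detector_loss-0.2271_acc-63.73.keras",
    "tomato_disease_detector_loss-0.4764_acc-83.93.keras",
    "tomato_disease_detector_loss-0.8962_acc-80.13.keras",
]
# Pick the checkpoint with the highest recorded accuracy.
best = max((parse_checkpoint_name(n) for n in names), key=lambda c: c["accuracy"])
```

Swapping the `key` to `lambda c: c["loss"]` with `min` would select the lowest-loss checkpoint instead.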
Evaluation
| Metric | Best reported value | Notes |
|---|---|---|
| Accuracy | 90.00% | Test split, tomato_disease_detector_loss-0.2826_acc-90.00.keras |
| Loss | 0.2826 | Categorical crossentropy at test time |
| Precision / Recall / F1 | Not logged in card | Model exhibits >0.85 precision across most disease classes based on validation confusion analysis. |
- Inference stability: Confidence histograms show the top class receives >0.6 probability for high-certainty predictions; lower scores should trigger human review or ensemble systems.
- Generalization: Because the data originates from controlled imagery, users should fine-tune on their own field data before deploying in different lighting/soil conditions.
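Since precision, recall, and F1 are not logged in the card, they can be derived from a confusion matrix such as the ones captured in the training logs. The following is a minimal sketch: `per_class_metrics` is a hypothetical helper, the row/column convention (rows are true classes, columns are predictions) is an assumption, and the toy counts are made up purely to show the arithmetic.

```python
def per_class_metrics(confusion, class_names):
    """Compute precision, recall, and F1 per class.

    confusion[i][j] counts samples whose true class is i and predicted class is j.
    """
    metrics = {}
    for k, name in enumerate(class_names):
        tp = confusion[k][k]
        predicted = sum(row[k] for row in confusion)  # column sum
        actual = sum(confusion[k])                    # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[name] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Toy 3-class matrix with made-up counts, for illustration only.
toy = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]
report = per_class_metrics(toy, ["Early Blight", "Late Blight", "Healthy"])
```

The same function works unchanged on a 10x10 matrix covering all classes in this card.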
Model file options
Choose the checkpoint that best fits your scenario:
| File | Loss | Accuracy | Best use case |
|---|---|---|---|
| tomato_disease_detector_loss-0.2826_acc-90.00.keras | 0.2826 | 90.00% | Recommended production-ready trade-off between accuracy and loss. |
| tomato_disease_detector_loss-0.2271_acc-63.73.keras | 0.2271 | 63.73% | Lowest final loss, useful for experimenting with calibration. |
| tomato_disease_detector_loss-0.4764_acc-83.93.keras | 0.4764 | 83.93% | Alternative architecture checkpoint with faster convergence. |
| tomato_disease_detector_loss-0.8962_acc-80.13.keras | 0.8962 | 80.13% | Baseline comparison to show overfitting mitigation impact. |
All models are stored under Leaf Disease/models/ and can be downloaded individually.
Quickstart inference
Dependencies
Install the runtime dependencies:
pip install tensorflow==2.15.0 numpy pillow
Loading the best checkpoint
from tensorflow.keras.models import load_model
model = load_model('Leaf Disease/models/tomato_disease_detector_loss-0.2826_acc-90.00.keras')
model.summary()
Predict a single image
import numpy as np
from PIL import Image

def predict_disease(image_path: str, model):
    # Load as RGB and resize to the 256x256 input the model expects.
    img = Image.open(image_path).convert('RGB')
    img = img.resize((256, 256))
    # Scale pixels to [0, 1] and add a batch dimension: (1, 256, 256, 3).
    img_array = np.expand_dims(np.array(img) / 255.0, axis=0)
    predictions = model.predict(img_array, verbose=0)[0]
    class_idx = int(np.argmax(predictions))
    confidence = float(predictions[class_idx])
    # Index order must match the label order used during training.
    class_names = [
        'Bacterial Spot',
        'Early Blight',
        'Late Blight',
        'Leaf Mold',
        'Septoria Leaf Spot',
        'Spider Mites',
        'Target Spot',
        'Tomato Yellow Leaf Curl Virus',
        'Tomato Mosaic Virus',
        'Healthy'
    ]
    return {
        'class': class_names[class_idx],
        'confidence': confidence,
        'raw': predictions.tolist()
    }

result = predict_disease('tomato_leaf.jpg', model)
print(f"Predicted {result['class']} with {result['confidence']:.2%} confidence")
Batch prediction helper
from pathlib import Path

def batch_predict(folder: str, model):
    image_paths = list(Path(folder).glob('*.jpg')) + list(Path(folder).glob('*.png'))
    return [
        {**predict_disease(str(path), model), 'file': path.name}
        for path in image_paths
    ]

batch_results = batch_predict('test_images', model)
for res in batch_results:
    print(res['file'], res['class'], res['confidence'])
Tips
- Always preprocess new images with the same resize and normalization steps.
- Use the 90% accuracy checkpoint for production; keep others for experimentation or transfer learning.
- If confidence is below 0.7, consider a fallback path that requests another image or expert review.
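The fallback path from the last tip can be sketched as a small routing step applied to the dictionary returned by `predict_disease`. The names `route_prediction` and `REVIEW_THRESHOLD`, and the `'action'` key, are hypothetical; the example prediction is made up for illustration.

```python
REVIEW_THRESHOLD = 0.7  # value suggested in the tips above


def route_prediction(result, threshold=REVIEW_THRESHOLD):
    """Label a prediction dict as 'accept' or 'review' based on its confidence."""
    action = "accept" if result["confidence"] >= threshold else "review"
    return {**result, "action": action}


# Made-up prediction in the same shape predict_disease returns.
example = {"class": "Late Blight", "confidence": 0.62, "raw": []}
routed = route_prediction(example)
```

Predictions tagged `review` could then trigger a request for another image or be queued for an expert, depending on the application.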
Deployment notes
- Re-export the .keras file with tf.keras.models.save_model(..., save_format='tf') if you need TensorFlow SavedModel directories.
- Convert to TensorFlow Lite or ONNX for deployment on resource-constrained hardware, keeping the input pipeline identical.
- Wrap predictions into a REST or gRPC endpoint with input validation (e.g., confirm 256x256 RGB before inference).
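The input validation suggested for an endpoint could be as simple as a shape check before the array reaches the model. This is a sketch under assumptions: `validate_image_shape` and `EXPECTED_SHAPE` are hypothetical names, and the check presumes the decoded image is available as a height x width x channels array.

```python
EXPECTED_SHAPE = (256, 256, 3)  # height, width, RGB channels


def validate_image_shape(shape):
    """Raise ValueError unless `shape` is a 256x256 RGB image shape."""
    if tuple(shape) != EXPECTED_SHAPE:
        raise ValueError(
            f"Expected image shape {EXPECTED_SHAPE}, got {tuple(shape)}"
        )
    return True
```

In a request handler this would typically be called as `validate_image_shape(np.array(img).shape)`, returning an HTTP 4xx response when the `ValueError` is raised.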
Troubleshooting
- TensorFlow compatibility: Use TensorFlow 2.15.0 (the version pinned in the install command) or later; reinstall if loader errors mention missing ops.
- Image decode errors: Force Image.open(...).convert('RGB') before preprocessing.
- Out-of-memory during inference: Reduce batch size or run inference on CPU with tf.device('/CPU:0').
- Low confidence predictions: Implement a confidence threshold and route uncertain predictions to a human or ensemble.
Mermaid workflow
flowchart LR
RawImages[Raw tomato leaf images] --> Preprocess[Preprocessing and augmentation]
Preprocess --> ModelTraining["Training (multiple checkpoints)"]
ModelTraining --> Checkpoints[Leaf Disease/models directory]
Checkpoints --> Inference[Load checkpoint and standardize input]
Inference --> Output[Prediction + confidence]
Output --> Feedback[Optional human-in-loop verification]
Contact & acknowledgments
- Creator: Gareth Aurelius Harrison (GitHub @theonegareth, Hugging Face @theonegareth).
- Acknowledgments: TensorFlow/Keras, PlantVillage dataset curators, the ML and agriculture research communities.
- Contribution guide: Fork, extend the dataset, retrain, then submit a PR documenting improvements.
Last Updated: November 30, 2025
Model Version: 1.0
Hugging Face Model: theonegareth/TomatoDiseaseDetector