OcuNet v4 - Multi-Label Retinal Disease Classification
OcuNet v4 is a multi-label deep learning model for ophthalmic disease screening from retinal fundus images. Built on the EfficientNet-B3 architecture, it classifies images into 30 distinct categories: 28 specific diseases, a general “Disease Risk” class, and a “Normal” class. The model is intended as a clinical decision support tool capable of detecting multiple co-occurring pathologies in a single image.
Model Details
- Model Type: Multi-Label Image Classifier
- Architecture: EfficientNet-B3 (Pre-trained on ImageNet)
- Input Resolution: 384x384 RGB
- Loss Function: Asymmetric Loss (Optimized for heavily imbalanced multi-label datasets)
- Framework: PyTorch
- Version: OcuNet Phase 2 (v4.2.0)
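The Asymmetric Loss listed above down-weights easy negatives, which dominate a 30-label problem where most labels are absent for any given image. Below is a minimal NumPy sketch of the idea (after Ridnik et al.); the focusing parameters `gamma_pos`, `gamma_neg`, and the probability-shift margin `clip` are illustrative assumptions, not the values used to train OcuNet:

```python
import numpy as np

def asymmetric_loss(logits, targets, gamma_pos=0.0, gamma_neg=4.0, clip=0.05):
    """Per-label asymmetric loss (sketch). logits, targets: (batch, n_labels),
    targets in {0, 1}. Negatives are focused harder (gamma_neg > gamma_pos)
    and probability-shifted so very easy negatives contribute ~zero loss."""
    p = 1.0 / (1.0 + np.exp(-logits))          # sigmoid probabilities
    p_neg = np.clip(p - clip, 0.0, 1.0)        # probability shifting for negatives
    eps = 1e-8
    loss_pos = targets * (1 - p) ** gamma_pos * np.log(p + eps)
    loss_neg = (1 - targets) * p_neg ** gamma_neg * np.log(1 - p_neg + eps)
    return -(loss_pos + loss_neg).mean()
```

With `gamma_neg=4`, a confidently rejected negative (logit −2) contributes far less loss than under plain binary cross-entropy, which is the property that helps on heavily imbalanced label sets.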
Intended Use
- Primary Use Case: Automated screening and diagnostic support for ophthalmic conditions from retinal fundus imagery.
- Target Audience: Ophthalmologists, medical practitioners, and researchers in medical imaging.
- Out of Scope: Standalone diagnosis. The model is intended for clinical decision support only and should not replace professional medical judgment.
Dataset
The model was trained on a comprehensive compilation of datasets comprising 23,659 images in total:
- Phase 1 Dataset: ODIR-5K (Ocular Disease Intelligent Recognition)
- Phase 2 Dataset: RFMiD (Retinal Fundus Multi-Disease Image Dataset)
- Phase 3 Dataset: Proprietary Augmented Dataset addressing class imbalances
Data Splits:
- Train: 16,240 images
- Validation: 3,709 images
- Test: 3,710 images
Class Distribution (30 Labels)
The model predicts the following conditions:
Disease_Risk, DR (Diabetic Retinopathy), ARMD (Age-related Macular Degeneration), MH (Media Haze), DN (Drusen), MYA (Myopia), BRVO (Branch Retinal Vein Occlusion), TSLN (Tessellation), ERM (Epiretinal Membrane), LS (Laser Scars), MS (Macular Scar), CSR (Central Serous Retinopathy), ODC (Optic Disc Cupping), CRVO (Central Retinal Vein Occlusion), AH (Asteroid Hyalosis), ODP (Optic Disc Pallor), ODE (Optic Disc Edema), AION (Anterior Ischemic Optic Neuropathy), PT (Parafoveal Telangiectasia), RT (Retinal Traction), RS (Retinitis), CRS (Chorioretinitis), EDN (Exudation), RPEC (Retinal Pigment Epithelium Changes), MHL (Macular Hole), CATARACT, GLAUCOMA, NORMAL, RD (Retinal Detachment), RP (Retinitis Pigmentosa)
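For training and evaluation, each image's findings map to a 30-dimensional multi-hot target vector. A small sketch of that encoding (the `CLASS_NAMES` ordering mirrors the list above; the order actually baked into the model's checkpoint and config may differ):

```python
# Label vocabulary, in the order listed in this card (an assumption for the
# checkpoint's true ordering).
CLASS_NAMES = [
    "Disease_Risk", "DR", "ARMD", "MH", "DN", "MYA", "BRVO", "TSLN", "ERM",
    "LS", "MS", "CSR", "ODC", "CRVO", "AH", "ODP", "ODE", "AION", "PT",
    "RT", "RS", "CRS", "EDN", "RPEC", "MHL", "CATARACT", "GLAUCOMA",
    "NORMAL", "RD", "RP",
]

def to_multi_hot(labels):
    """Encode a set of label abbreviations as a 30-dim multi-hot target."""
    vec = [0.0] * len(CLASS_NAMES)
    for name in labels:
        vec[CLASS_NAMES.index(name)] = 1.0
    return vec
```

An image showing both diabetic retinopathy and glaucoma would thus set two of the 30 positions to 1.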
Training Configuration
- Batch Size: 16 (Gradient Accumulation Steps = 1)
- Epochs: 200 (Early stopping triggered at Epoch 104)
- Learning Rate: 1.00e-07 to 3.00e-04 (peak)
- Warmup: 5 epochs
- Hardware Profile: NVIDIA GeForce RTX 4050 Laptop GPU (6 GB VRAM)
- Training Techniques: EMA (Exponential Moving Average) of model weights and class-specific threshold tuning (calibration mapping)
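The learning-rate schedule can be sketched as linear warmup to the peak followed by a decay back toward the floor; the cosine decay shape below is an assumption, since the card states only the LR range and the 5-epoch warmup:

```python
import math

BASE_LR, PEAK_LR = 1e-7, 3e-4       # LR range from the training configuration
WARMUP_EPOCHS, TOTAL_EPOCHS = 5, 200

def lr_at(epoch):
    """Linear warmup from BASE_LR to PEAK_LR, then cosine decay back to
    BASE_LR over the remaining epochs (decay shape is an assumption)."""
    if epoch < WARMUP_EPOCHS:
        return BASE_LR + (PEAK_LR - BASE_LR) * epoch / WARMUP_EPOCHS
    t = (epoch - WARMUP_EPOCHS) / (TOTAL_EPOCHS - WARMUP_EPOCHS)
    return BASE_LR + 0.5 * (PEAK_LR - BASE_LR) * (1 + math.cos(math.pi * t))
```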
Preprocessing & Augmentation
- Preprocessing: Fundus ROI Crop (removes black borders) and CLAHE (Contrast Limited Adaptive Histogram Equalization) applied to the green channel.
- Augmentation: RandAugment, Random Erasing, Color Jittering, and Geometric transformations (tuned specifically to avoid unrealistic medical artifacts).
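The fundus ROI crop can be approximated by thresholding away the near-black border around the circular fundus. A minimal NumPy sketch (the intensity threshold is an assumption; the actual pipeline may also pad the crop and applies CLAHE afterwards):

```python
import numpy as np

def crop_fundus_roi(img, thresh=10):
    """Crop near-black borders around the fundus. img: (H, W, 3) array.
    Assumes at least one pixel exceeds the brightness threshold."""
    gray = img.mean(axis=2)                    # rough luminance
    mask = gray > thresh                       # foreground (fundus) pixels
    rows, cols = np.any(mask, axis=1), np.any(mask, axis=0)
    r0, r1 = np.argmax(rows), len(rows) - np.argmax(rows[::-1])
    c0, c1 = np.argmax(cols), len(cols) - np.argmax(cols[::-1])
    return img[r0:r1, c0:c1]
```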
Evaluation Results (Validation Set)
The model's optimal performance was achieved at Epoch 74:
- Best Validation mean Average Precision (mAP): 0.4914
- Best Validation F1-Score: 0.2517 (Macro)
(Note: multi-label classification over 30 classes with extremely rare and co-occurring pathologies typically yields lower raw F1/mAP scores than binary classification. Per-class metrics show higher reliability on prevalent diseases such as DR, Glaucoma, and Myopia.)
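The class-specific threshold tuning mentioned in the training configuration can be sketched as a per-class grid search that maximizes F1 on the validation set; the grid and search strategy here are illustrative assumptions:

```python
import numpy as np

def tune_thresholds(probs, y_true, grid=np.linspace(0.05, 0.95, 19)):
    """Pick one decision threshold per class maximizing validation F1.
    probs: (n_samples, n_classes) sigmoid outputs; y_true: same shape, {0,1}."""
    n_classes = probs.shape[1]
    best = np.full(n_classes, 0.5)
    for c in range(n_classes):
        f1s = []
        for t in grid:
            pred = probs[:, c] >= t
            tp = np.sum(pred & (y_true[:, c] == 1))
            fp = np.sum(pred & (y_true[:, c] == 0))
            fn = np.sum(~pred & (y_true[:, c] == 1))
            f1s.append(2 * tp / max(2 * tp + fp + fn, 1))
        best[c] = grid[int(np.argmax(f1s))]
    return best
```

Calibrated per-class thresholds matter here because rare classes rarely reach high absolute probabilities, so a single global 0.5 cutoff would suppress them.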
How to run inference
You can use the model with the customized prediction pipeline included in the OcuNet repository:
from predict import ImprovedMultiLabelClassifier

# Initialize the model with confidence thresholds
classifier = ImprovedMultiLabelClassifier(
    checkpoint_path="models/ocunetv4.pth",
    config_path="config/config.yaml",
)

# Run inference
result = classifier.predict("path/to/retinal_image.jpg")

# Print detected diseases and per-class probabilities
print(f"Detected: {result['detected_diseases']}")
for disease, prob in result['probabilities'].items():
    print(f"{disease}: {prob:.2%}")
Limitations and Bias
- Class Imbalance: Despite Asymmetric Loss and targeted augmentation, extremely rare anomalies (e.g., MHL, ERM, RT) remain underrepresented in training, which can reduce sensitivity for those classes even after per-class threshold calibration.
- Image Quality Reliance: Performance may degrade significantly if input images exhibit uncorrected poor illumination or lack proper fundus anatomical visibility.
- Generalization: Model is trained on adult fundus images; efficacy on pediatric patient populations is untested.
Disclaimer
This model is developed for research and educational purposes. It must undergo thorough clinical validation and obtain appropriate regulatory approval before deployment in real-world clinical environments.