OcuNet v4 - Multi-Label Retinal Disease Classification
OcuNet v4 is a multi-label deep learning model for ophthalmic disease screening from retinal fundus images. Built on the EfficientNet-B3 architecture, it classifies images into 30 distinct categories: 28 specific diseases, a general “Disease Risk” class, and a “Normal” class. The model is intended as a clinical decision support tool capable of detecting multiple co-occurring pathologies in a single image.
Model Details
- Model Type: Multi-Label Image Classifier
- Architecture: EfficientNet-B3 (Pre-trained on ImageNet)
- Input Resolution: 384x384 RGB
- Loss Function: Asymmetric Loss (Optimized for heavily imbalanced multi-label datasets)
- Framework: PyTorch
- Version: OcuNet Phase 2 (v4.2.0)
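The Asymmetric Loss listed above down-weights easy negatives, which dominate a 30-label problem where most labels are absent for any given image. Below is a minimal NumPy sketch of the idea (after Ridnik et al.); the focusing parameters `gamma_pos`, `gamma_neg`, and the probability-shift margin `clip` are illustrative assumptions, not the values used to train OcuNet:

```python
import numpy as np

def asymmetric_loss(logits, targets, gamma_pos=0.0, gamma_neg=4.0, clip=0.05):
    """Per-label asymmetric loss (sketch). logits, targets: (batch, n_labels),
    targets in {0, 1}. Negatives are focused harder (gamma_neg > gamma_pos)
    and probability-shifted so very easy negatives contribute ~zero loss."""
    p = 1.0 / (1.0 + np.exp(-logits))          # sigmoid probabilities
    p_neg = np.clip(p - clip, 0.0, 1.0)        # probability shifting for negatives
    eps = 1e-8
    loss_pos = targets * (1 - p) ** gamma_pos * np.log(p + eps)
    loss_neg = (1 - targets) * p_neg ** gamma_neg * np.log(1 - p_neg + eps)
    return -(loss_pos + loss_neg).mean()
```

With `gamma_neg=4`, a confidently rejected negative (logit −2) contributes far less loss than under plain binary cross-entropy, which is the property that helps on heavily imbalanced label sets.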
Intended Use
- Primary Use Case: Automated screening and diagnostic support for ophthalmic conditions from retinal fundus imagery.
- Target Audience: Ophthalmologists, medical practitioners, and researchers in medical imaging.
- Out of Scope: Standalone diagnosis. The model is intended for clinical decision support only and should not replace professional medical judgment.
Dataset
The model was trained on a comprehensive compilation of datasets comprising 23,659 images in total:
- Phase 1 Dataset: ODIR-5K (Ocular Disease Intelligent Recognition)
- Phase 2 Dataset: RFMiD (Retinal Fundus Multi-Disease Image Dataset)
- Phase 3 Dataset: Proprietary Augmented Dataset addressing class imbalances
Data Splits:
- Train: 16,240 images
- Validation: 3,709 images
- Test: 3,710 images
Class Distribution (30 Labels)
The model predicts the following conditions:
Disease_Risk, DR (Diabetic Retinopathy), ARMD (Age-related Macular Degeneration), MH (Media Haze), DN (Drusen), MYA (Myopia), BRVO (Branch Retinal Vein Occlusion), TSLN (Tessellation), ERM (Epiretinal Membrane), LS (Laser Scars), MS (Macular Scar), CSR (Central Serous Retinopathy), ODC (Optic Disc Cupping), CRVO (Central Retinal Vein Occlusion), AH (Asteroid Hyalosis), ODP (Optic Disc Pallor), ODE (Optic Disc Edema), AION (Anterior Ischemic Optic Neuropathy), PT (Parafoveal Telangiectasia), RT (Retinal Traction), RS (Retinitis), CRS (Chorioretinitis), EDN (Exudation), RPEC (Retinal Pigment Epithelium Changes), MHL (Macular Hole), CATARACT, GLAUCOMA, NORMAL, RD (Retinal Detachment), RP (Retinitis Pigmentosa)
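For training and evaluation, each image's findings map to a 30-dimensional multi-hot target vector. A small sketch of that encoding (the `CLASS_NAMES` ordering mirrors the list above; the order actually baked into the model's checkpoint and config may differ):

```python
# Label vocabulary, in the order listed in this card (an assumption for the
# checkpoint's true ordering).
CLASS_NAMES = [
    "Disease_Risk", "DR", "ARMD", "MH", "DN", "MYA", "BRVO", "TSLN", "ERM",
    "LS", "MS", "CSR", "ODC", "CRVO", "AH", "ODP", "ODE", "AION", "PT",
    "RT", "RS", "CRS", "EDN", "RPEC", "MHL", "CATARACT", "GLAUCOMA",
    "NORMAL", "RD", "RP",
]

def to_multi_hot(labels):
    """Encode a set of label abbreviations as a 30-dim multi-hot target."""
    vec = [0.0] * len(CLASS_NAMES)
    for name in labels:
        vec[CLASS_NAMES.index(name)] = 1.0
    return vec
```

An image showing both diabetic retinopathy and glaucoma would thus set two of the 30 positions to 1.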
Training Configuration
- Batch Size: 16 (Gradient Accumulation Steps = 1)
- Epochs: 200 (Early stopping triggered at Epoch 104)
- Learning Rate: 1.00e-07 to 3.00e-04 (peak)
- Warmup: 5 epochs
- Hardware Profile: NVIDIA GeForce RTX 4050 Laptop GPU (6 GB VRAM)
- Training Techniques: EMA (Exponential Moving Average) of model weights and class-specific threshold tuning (calibration mapping)
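The learning-rate schedule can be sketched as linear warmup to the peak followed by a decay back toward the floor; the cosine decay shape below is an assumption, since the card states only the LR range and the 5-epoch warmup:

```python
import math

BASE_LR, PEAK_LR = 1e-7, 3e-4       # LR range from the training configuration
WARMUP_EPOCHS, TOTAL_EPOCHS = 5, 200

def lr_at(epoch):
    """Linear warmup from BASE_LR to PEAK_LR, then cosine decay back to
    BASE_LR over the remaining epochs (decay shape is an assumption)."""
    if epoch < WARMUP_EPOCHS:
        return BASE_LR + (PEAK_LR - BASE_LR) * epoch / WARMUP_EPOCHS
    t = (epoch - WARMUP_EPOCHS) / (TOTAL_EPOCHS - WARMUP_EPOCHS)
    return BASE_LR + 0.5 * (PEAK_LR - BASE_LR) * (1 + math.cos(math.pi * t))
```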
Preprocessing & Augmentation
- Preprocessing: Fundus ROI Crop (removes black borders) and CLAHE (Contrast Limited Adaptive Histogram Equalization) applied to the green channel.
- Augmentation: RandAugment, Random Erasing, Color Jittering, and Geometric transformations (tuned specifically to avoid unrealistic medical artifacts).
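The fundus ROI crop can be approximated by thresholding away the near-black border around the circular fundus. A minimal NumPy sketch (the intensity threshold is an assumption; the actual pipeline may also pad the crop and applies CLAHE afterwards):

```python
import numpy as np

def crop_fundus_roi(img, thresh=10):
    """Crop near-black borders around the fundus. img: (H, W, 3) array.
    Assumes at least one pixel exceeds the brightness threshold."""
    gray = img.mean(axis=2)                    # rough luminance
    mask = gray > thresh                       # foreground (fundus) pixels
    rows, cols = np.any(mask, axis=1), np.any(mask, axis=0)
    r0, r1 = np.argmax(rows), len(rows) - np.argmax(rows[::-1])
    c0, c1 = np.argmax(cols), len(cols) - np.argmax(cols[::-1])
    return img[r0:r1, c0:c1]
```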
Evaluation Results (Validation Set)
The model's optimal performance was achieved at Epoch 74:
- Best Validation mean Average Precision (mAP): 0.4914
- Best Validation F1-Score: 0.2517 (Macro)
(Note: multi-label classification over 30 classes with extremely rare and co-occurring pathologies typically yields lower raw F1/mAP scores than binary classification. Per-class metrics show higher reliability on prevalent diseases such as DR, Glaucoma, and Myopia.)
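The class-specific threshold tuning mentioned in the training configuration can be sketched as a per-class grid search that maximizes F1 on the validation set; the grid and search strategy here are illustrative assumptions:

```python
import numpy as np

def tune_thresholds(probs, y_true, grid=np.linspace(0.05, 0.95, 19)):
    """Pick one decision threshold per class maximizing validation F1.
    probs: (n_samples, n_classes) sigmoid outputs; y_true: same shape, {0,1}."""
    n_classes = probs.shape[1]
    best = np.full(n_classes, 0.5)
    for c in range(n_classes):
        f1s = []
        for t in grid:
            pred = probs[:, c] >= t
            tp = np.sum(pred & (y_true[:, c] == 1))
            fp = np.sum(pred & (y_true[:, c] == 0))
            fn = np.sum(~pred & (y_true[:, c] == 1))
            f1s.append(2 * tp / max(2 * tp + fp + fn, 1))
        best[c] = grid[int(np.argmax(f1s))]
    return best
```

Calibrated per-class thresholds matter here because rare classes rarely reach high absolute probabilities, so a single global 0.5 cutoff would suppress them.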
How to run inference
You can use the model with the customized prediction pipeline included in the OcuNet repository:
from predict import ImprovedMultiLabelClassifier

# Initialize the model with confidence thresholds
classifier = ImprovedMultiLabelClassifier(
    checkpoint_path="models/ocunetv4.pth",
    config_path="config/config.yaml",
)

# Run inference
result = classifier.predict("path/to/retinal_image.jpg")

# Print detected diseases and per-class probabilities
print(f"Detected: {result['detected_diseases']}")
for disease, prob in result['probabilities'].items():
    print(f"{disease}: {prob:.2%}")
Limitations and Bias
- Class Imbalance: Despite Asymmetric Loss and targeted augmentation, extremely rare anomalies (e.g., MHL, ERM, RT) remain underrepresented in training, which can reduce sensitivity for those classes even after per-class threshold calibration.
- Image Quality Reliance: Performance may degrade significantly if input images exhibit uncorrected poor illumination or lack proper fundus anatomical visibility.
- Generalization: Model is trained on adult fundus images; efficacy on pediatric patient populations is untested.
Disclaimer
This model is developed for research and educational purposes. It must undergo thorough clinical validation and obtain appropriate regulatory approval before deployment in real-world clinical environments.