Fashion Category Classifier

DistilBERT fine-tuned to classify Indian fashion product titles into 11 categories.

Credits

Built on DistilBERT by Victor Sanh, Lysandre Debut, Julien Chaumond, and Thomas Wolf (Hugging Face), licensed under Apache 2.0.

Labels

Tops, Bottoms, Dresses, Outerwear, Footwear, Bags, Jewellery, Ethnicwear, Activewear, Innerwear, Headwear

Usage

from transformers import pipeline
# Load the fine-tuned classifier from the Hugging Face Hub
clf = pipeline("text-classification", model="roaringguts/DistilBERT")
# Single title
clf("Nike Dry Fit Running Tshirt")
# [{'label': 'Activewear', 'score': 0.99}]
# Batch of titles: returns one prediction dict per input
clf(["Lavie Women Tote Bag", "Malabar Gold Plated Necklace", "Clarks Oxford Shoes"])
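
When classifying a catalogue in bulk, it can help to pair each title with its prediction and flag low-confidence results for manual review. The sketch below is one possible way to do that, not part of the released model. The helper name `classify_titles` and the `min_score` cutoff of 0.80 are assumptions made for illustration, and `clf` can be any callable that returns one `{'label', 'score'}` dict per title, such as the pipeline above.

```python
from typing import Callable, Dict, List


def classify_titles(
    titles: List[str],
    clf: Callable[[List[str]], List[Dict]],
    min_score: float = 0.80,  # hypothetical cutoff, not a value from this model card
) -> List[Dict]:
    """Pair each title with its top prediction and flag low-confidence ones."""
    predictions = clf(titles)  # expected: one {'label': ..., 'score': ...} per title
    results = []
    for title, pred in zip(titles, predictions):
        results.append(
            {
                "title": title,
                "label": pred["label"],
                "score": pred["score"],
                # True when the score falls below the chosen cutoff
                "needs_review": pred["score"] < min_score,
            }
        )
    return results
```

With the pipeline loaded as shown above, `classify_titles(["Lavie Women Tote Bag"], clf)` would return a single record. The cutoff should be tuned on your own data.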

Training

  • Base model: distilbert-base-uncased
  • Dataset: ~63k real product titles from Myntra + synthetic samples generated for underrepresented categories
  • Split: 80/10/10 train/val/test, stratified
  • Epochs: 5 (early stopping, patience 2)
  • Batch size: 64
  • Learning rate: 3e-5
  • Precision: fp16
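
The training script itself is not included here. As a rough illustration of two items from the list above, the sketch below shows a stratified 80/10/10 split and the patience-based early stopping rule in plain Python. The function names, the random seed, and the use of validation loss as the monitored quantity are all assumptions, since the card does not state them.

```python
import random
from collections import defaultdict


def stratified_split(labels, ratios=(0.8, 0.1, 0.1), seed=42):
    """Return (train, val, test) index lists, splitting each label separately."""
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    rng = random.Random(seed)  # seed is an arbitrary choice for reproducibility
    train, val, test = [], [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        n_train = int(len(indices) * ratios[0])
        n_val = int(len(indices) * ratios[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


def early_stop_epoch(val_losses, patience=2):
    """Return the 1-based epoch at which training would stop.

    Assumes validation loss is monitored; training stops once the loss has
    failed to improve for `patience` consecutive epochs.
    """
    best = float("inf")
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, bad_epochs = loss, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch
    return len(val_losses)  # never triggered: all epochs were run
```

Splitting per label keeps small categories such as Bags and Headwear represented in all three subsets, which a purely random split does not guarantee.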

Evaluation

Evaluated on a held-out stratified test set.

Metric      Score
Accuracy    99.27%
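
Only overall accuracy is reported here. Because a single figure can hide weaker categories, a per-label breakdown may be worth computing when evaluating on your own data. The sketch below assumes two parallel lists of true and predicted label strings. The function names are illustrative and do not come from the original evaluation code.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_label_accuracy(y_true, y_pred):
    """For each true label, the fraction of its samples predicted correctly."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / total[label] for label in total}
```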

Limitations

  • Trained on Indian e-commerce titles, so it may underperform on titles that follow Western brand naming conventions
  • Activewear and Innerwear have some overlap (e.g. sports bras vs regular bras)
  • Bags and Headwear had few samples in the original data; the gap was only partially filled with synthetic titles

License

This model is released under CC BY 4.0. The base DistilBERT weights it derives from are licensed under Apache 2.0.
