Fashion Category Classifier
DistilBERT fine-tuned to classify Indian fashion product titles into 11 categories.
Credits
Built on DistilBERT by Victor Sanh, Lysandre Debut, Julien Chaumond, and Thomas Wolf (Hugging Face), licensed under Apache 2.0.
Labels
Tops, Bottoms, Dresses, Outerwear, Footwear, Bags, Jewellery, Ethnicwear, Activewear, Innerwear, Headwear
Usage
from transformers import pipeline
clf = pipeline("text-classification", model="roaringguts/DistilBERT")
clf("Nike Dry Fit Running Tshirt")
# [{'label': 'Activewear', 'score': 0.99}]
# Batch inference: pass a list of titles to get one prediction per title
clf(["Lavie Women Tote Bag", "Malabar Gold Plated Necklace", "Clarks Oxford Shoes"])
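Titles that fit none of the 11 categories still receive a label, so it can help to reject low-confidence predictions. The following is a minimal sketch, assuming the pipeline returns one `{'label', 'score'}` dict per title as in the example above. The helper `pick_label`, the 0.80 cutoff, and the sample scores are hypothetical and not part of the model.

```python
from typing import Optional

# Hypothetical cutoff; tune it on validation data for your use case.
MIN_SCORE = 0.80


def pick_label(prediction: dict, min_score: float = MIN_SCORE) -> Optional[str]:
    """Return the predicted label, or None if the score is below min_score."""
    if prediction["score"] >= min_score:
        return prediction["label"]
    return None


# Stand-in for the output of clf([...]) so the sketch runs without
# downloading the model. The scores are illustrative, not real outputs.
sample_output = [
    {"label": "Bags", "score": 0.98},
    {"label": "Tops", "score": 0.41},
]

labels = [pick_label(p) for p in sample_output]
print(labels)  # ['Bags', None]
```

In practice, `sample_output` would be replaced by the list returned from `clf([...])`, and titles mapped to `None` could be routed to manual review.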
Training
- Base model: distilbert-base-uncased
- Dataset: ~63k real product titles from Myntra + synthetic samples generated for underrepresented categories
- Split: 80/10/10 train/val/test, stratified
- Epochs: 5 (early stopping, patience 2)
- Batch size: 64
- Learning rate: 3e-5
- Precision: fp16
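The stratified 80/10/10 split listed above can be illustrated with a small standard-library sketch. The card does not say which tool produced the actual split, so the function `stratified_split`, the seed, and the toy rows below are assumptions for illustration only.

```python
import random
from collections import defaultdict


def stratified_split(rows, seed=42, train_frac=0.8, val_frac=0.1):
    """Split (title, label) rows into train/val/test, keeping label proportions."""
    # Group rows by label so each class is split separately.
    by_label = defaultdict(list)
    for title, label in rows:
        by_label[label].append((title, label))

    rng = random.Random(seed)
    train, val, test = [], [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        n_train = int(len(group) * train_frac)
        n_val = int(len(group) * val_frac)
        train += group[:n_train]
        val += group[n_train:n_train + n_val]
        test += group[n_train + n_val:]  # the remainder goes to test
    return train, val, test


# Toy data: 20 placeholder titles per label, not the real dataset.
rows = [(f"title {i}", "Bags") for i in range(20)]
rows += [(f"title {i}", "Tops") for i in range(20)]

train, val, test = stratified_split(rows)
print(len(train), len(val), len(test))  # 32 4 4
```

Splitting per label keeps rare classes such as Bags and Headwear represented in the validation and test sets in the same proportion as in training.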
Evaluation
Evaluated on a held-out stratified test set.
| Metric | Score |
|---|---|
| Accuracy | 99.27% |
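Overall accuracy can hide weak classes, so a per-class breakdown is a useful companion to the figure above. The sketch below shows how both could be computed from true and predicted labels. The function names and the toy labels are hypothetical, and the printed values are not results from this model.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """For each true label, the fraction of its examples predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Toy labels for illustration only; these are not the model's test results.
y_true = ["Bags", "Bags", "Tops", "Innerwear", "Innerwear"]
y_pred = ["Bags", "Tops", "Tops", "Innerwear", "Activewear"]

print(accuracy(y_true, y_pred))          # 0.6
print(per_class_recall(y_true, y_pred))  # {'Bags': 0.5, 'Tops': 1.0, 'Innerwear': 0.5}
```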
Limitations
- Trained on Indian e-commerce titles, so it may underperform on titles that follow Western brand naming conventions
- Activewear and Innerwear have some overlap (e.g. sports bras vs regular bras)
- Low sample count for Bags and Headwear in original data, partially filled with synthetic titles
License
This model is released under CC BY 4.0. The base DistilBERT weights it derives from are licensed under Apache 2.0.