---
language: en
license: mit
tags: ["image-regression", "tensorflow", "mobilenetv2", "utkface", "age-estimation"]
datasets: ["UTKFace"]
metrics: ["mean_absolute_error"]
---
# UTKFace Age Regression: Model Card
This repository contains code to train a TensorFlow / Keras regression model that estimates a person's age from a face image using the UTKFace dataset. The model uses a MobileNetV2 backbone and a small regression head on top.
## Summary
- **Model type**: Image regression (single-output continuous)
- **Backbone**: MobileNetV2 (ImageNet pre-trained)
- **Task**: Age estimation (years)
- **Dataset**: UTKFace (public dataset; filenames encode age)
- **Reported metric**: Mean Absolute Error (MAE); see the Evaluation section for how to compute and report MAE for your runs
## Model details
- **Input**: RGB face image (recommended size: 224×224)
- **Output**: Single scalar value (predicted age in years)
- **Preprocessing**: MobileNetV2 preprocessing (scales inputs to [-1, 1])
- **Loss**: Mean Squared Error (MSE) used during training
- **Metric for reporting**: Mean Absolute Error (MAE)
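The preprocessing step listed above can be illustrated without TensorFlow. The sketch below assumes pixel values in the range [0, 255] and reproduces the scaling that MobileNetV2's `preprocess_input` applies (divide by 127.5, subtract 1). The function name `scale_to_unit_range` is hypothetical and used only for illustration.

```python
import numpy as np


def scale_to_unit_range(pixels):
    """Map pixel values in [0, 255] to floats in [-1, 1]."""
    pixels = np.asarray(pixels, dtype=np.float32)
    return pixels / 127.5 - 1.0


# Darkest, mid-grey, and brightest pixel values
example = np.array([0.0, 127.5, 255.0], dtype=np.float32)
scaled = scale_to_unit_range(example)
print(scaled)
```

Images fed to the model at inference time should go through the same scaling that was used during training.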
## Intended uses
- Research and educational purposes for learning about image regression and age estimation
- Prototyping demo applications that predict approximate age ranges from face crops
## Out-of-scope / Limitations
- This model provides an estimate of age; it's not a substitute for official identification
- Models trained on UTKFace carry dataset biases (race, gender, age distribution). They may underperform on underrepresented groups.
- Do not use this model for high-stakes decision making (employment, legal, medical, etc.)
## Dataset
**UTKFace**
- **Source**: https://susanqq.github.io/UTKFace/
- **Format**: Filenames encode metadata as `<age>_<gender>_<race>_<date&time>.jpg`.
- **Usage**: The training scripts in this repo extract the age from the filename (the integer before the first underscore).
- **Note**: Respect the dataset's license and authors when redistributing or publishing results.
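As a minimal sketch of the filename convention described above, the helper below reads the integer before the first underscore. The name `age_from_filename` and the sample filename are illustrative assumptions; the parsing code in `train.py` may differ.

```python
from pathlib import Path
from typing import Optional


def age_from_filename(path: str) -> Optional[int]:
    """Return the age encoded before the first underscore, or None if absent."""
    name = Path(path).name
    age_text, separator, _ = name.partition("_")
    # Reject names with no underscore or a non-numeric first field
    if not separator or not age_text.isdecimal():
        return None
    return int(age_text)


# Illustrative filename following <age>_<gender>_<race>_<date&time>.jpg
print(age_from_filename("data/UTKFace/25_0_2_20170116174525125.jpg"))
```

Returning `None` for files that do not match the pattern lets a data pipeline skip them instead of failing partway through a run.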
## Training details
- **Framework**: TensorFlow / Keras
- **Backbone**: MobileNetV2 pretrained on ImageNet
- **Head**: GlobalAveragePooling2D -> Dense(128, relu) -> Dense(1, linear)
- **Recommended input size**: 224×224 (configurable via command-line args in `train.py`)
- **Batch size**: configurable (default set in `train.py`)
- **Optimizer**: Adam (default), learning rate and scheduler configurable in `train.py`
- **Loss**: Mean Squared Error (MSE)
- **Metric**: Mean Absolute Error (MAE) reported on validation/test sets
- **Augmentations**: Basic augmentations recommended (flip, random crop/brightness) for better robustness
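The architecture listed above could be assembled roughly as follows. This is a sketch under assumptions, not the code from `train.py`: the function name `build_age_regressor` is hypothetical, `weights=None` is used here so the example runs without downloading anything (pass `"imagenet"` for the pre-trained backbone), and details such as the learning rate or whether the backbone is frozen are not specified in this card.

```python
import tensorflow as tf


def build_age_regressor(img_size: int = 224, weights=None) -> tf.keras.Model:
    """MobileNetV2 backbone -> GlobalAveragePooling2D -> Dense(128) -> Dense(1)."""
    backbone = tf.keras.applications.MobileNetV2(
        input_shape=(img_size, img_size, 3),
        include_top=False,  # drop the ImageNet classifier head
        weights=weights,    # use "imagenet" to load pre-trained weights
    )
    x = tf.keras.layers.GlobalAveragePooling2D()(backbone.output)
    x = tf.keras.layers.Dense(128, activation="relu")(x)
    age = tf.keras.layers.Dense(1, activation="linear")(x)  # age in years
    model = tf.keras.Model(inputs=backbone.input, outputs=age)
    # MSE is the training loss; MAE is tracked as the reporting metric
    model.compile(optimizer=tf.keras.optimizers.Adam(), loss="mse", metrics=["mae"])
    return model


model = build_age_regressor()
print(model.output_shape)
```

The model expects inputs that have already been passed through MobileNetV2 preprocessing, which matches the usage examples in this card.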
## Reproducibility / Example training command
1. **Prepare UTKFace dataset**
- Download and extract UTKFace images into `data/UTKFace/` or pass `--dataset_dir` to the training script.
2. **Install dependencies**
- `python -m pip install -r requirements.txt`
3. **Train**
- `python train.py --dataset_dir data/UTKFace --epochs 30 --batch_size 32 --img_size 224 --output_dir saved_model`
The `train.py` script builds a tf.data pipeline, extracts ages from filenames, constructs a MobileNetV2-based model, and saves the trained model to the `--output_dir`.
## Evaluation and metrics (MAE)
Mean Absolute Error (MAE) gives an intuitive measure of average error in predicted age (in years):
```
MAE = mean(|y_true - y_pred|)
```
Compute MAE in Python (example):
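```python
import numpy as np
# y_true and y_pred: 1-D arrays of true and predicted ages (years), same length
mae = float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))
```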
Example: the training script prints per-epoch validation MAE. To reproduce test MAE after training, run the provided evaluation routine or:
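```python
from tensorflow import keras
import numpy as np
model = keras.models.load_model('saved_model')
# prepare test_images (N, 224, 224, 3; already preprocessed) and test_labels (N,)
# flatten both to shape (N,) so the subtraction cannot broadcast to (N, N)
preds = model.predict(test_images).reshape(-1)
labels = np.asarray(test_labels, dtype=np.float32).reshape(-1)
mae = float(np.mean(np.abs(labels - preds)))
print('Test MAE (years):', mae)
```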
Note: Exact MAE depends on preprocessing, train/validation split, augmentations, and hyperparameters. Report MAE alongside the exact training configuration for reproducibility.
## Usage: Quick examples
**Python (local SavedModel)**
```python
import tensorflow as tf
import numpy as np
from PIL import Image
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input
model = tf.keras.models.load_model('saved_model') # path to a SavedModel directory
img = Image.open('path/to/face.jpg').convert('RGB').resize((224, 224))
arr = np.array(img, dtype=np.float32)
arr = preprocess_input(arr)
pred = model.predict(np.expand_dims(arr, 0))[0, 0]
print('Predicted age (years):', float(pred))
```
**Command-line (using predict.py)**
```
python predict.py --model_dir saved_model --image path/to/face.jpg
```
**Loading from Hugging Face Hub**
If you upload your saved model to the Hugging Face Hub, consumers can download it using the `huggingface_hub` package. For example, in a Space, set the environment variable `HF_MODEL_ID` to the model repository (e.g. `username/my-age-model`), and the Gradio app supplied in this repo will attempt to download and use it.
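One way such a lookup might work is sketched below. It assumes the whole model repository is fetched with `huggingface_hub.snapshot_download` and falls back to a local directory when `HF_MODEL_ID` is unset. The helper name `resolve_model_dir` is hypothetical, and the logic in `app.py` may differ.

```python
import os


def resolve_model_dir(default_dir: str = "saved_model") -> str:
    """Return a local model directory, downloading from the Hub if HF_MODEL_ID is set."""
    repo_id = os.environ.get("HF_MODEL_ID")
    if not repo_id:
        # No Hub repository configured: use the local directory
        return default_dir
    # Imported lazily so the local path works without huggingface_hub installed
    from huggingface_hub import snapshot_download

    # Downloads (or reuses a cached copy of) the repository and returns its path
    return snapshot_download(repo_id=repo_id)
```

The returned path can then be passed to the same loading code used for a local model.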
**Gradio demo / Hugging Face Space**
A simple Gradio app is provided in `app.py` that:
- accepts an input face image
- preprocesses it (224×224 + MobileNetV2 preprocess)
- returns the predicted age (years) and the model's raw output
**How to host as a Space**
1. Create a new Space on Hugging Face and select "Gradio" as the SDK.
2. Push this repository to the Space (include `app.py`, your `saved_model/` directory or set `HF_MODEL_ID` to your model on the Hub).
3. Make sure `requirements.txt` includes `gradio` and `huggingface_hub` (the repository `requirements.txt` in this project may be extended with these packages for the Space).
## Files in this repository
- `train.py`: training script
- `predict.py`: single-image prediction helper
- `convert_model.py`: conversion helpers
- `inference_log.py`, `inference_log.txt`, `load_predict_log.txt`: logging and CLI helpers for inference (dev)
- `app.py`: (added) Gradio demo app for live predictions
- `requirements.txt`: Python dependencies (extend for Spaces with `gradio` and `huggingface_hub`)
## Security, biases and ethical considerations
- Age estimation models can reflect and amplify biases in the training data (race and gender imbalance, age distribution). Evaluate fairness across demographic slices before using widely.
- Avoid using the model in high-risk contexts where inaccurate age estimates could cause harm.
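A simple starting point for the per-slice evaluation suggested above is to compute MAE separately for each group. The sketch below assumes you already have arrays of true ages, predicted ages, and a group label per sample (for UTKFace, the gender or race field from the filename could serve as the label). The function name `mae_by_group` and the sample values are illustrative, not results from this model.

```python
import numpy as np


def mae_by_group(y_true, y_pred, groups):
    """Return {group label: MAE in years} for each distinct group."""
    y_true = np.asarray(y_true, dtype=np.float64).reshape(-1)
    y_pred = np.asarray(y_pred, dtype=np.float64).reshape(-1)
    groups = np.asarray(groups).reshape(-1)
    abs_err = np.abs(y_true - y_pred)
    return {str(g): float(abs_err[groups == g].mean()) for g in np.unique(groups)}


# Made-up values purely to show the call shape
per_group = mae_by_group(
    y_true=[20, 30, 40, 50],
    y_pred=[22, 27, 40, 60],
    groups=["a", "a", "b", "b"],
)
print(per_group)
```

Large gaps between groups are a signal to rebalance the training data or to restrict where the model is used.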
## How to cite / license
- UTKFace authors and dataset should be cited if you publish results.
- This repository is provided under the MIT license (see LICENSE file if present).
## Contact and credits
**Maintainer**: Stealth Labs Ltd.
**Acknowledgements**
Thanks to the UTKFace dataset authors for the publicly available images used in training and experimentation.