Evgueni Poloukarov committed c6bf910 (1 parent: 10c4205)
docs: update activity log with HF Space deployment milestone
doc/activity.md: +176 -0
@@ -5124,3 +5124,179 @@ load_forecast_cols[timestamp > d1_cutoff] = np.nan
**Next Session**: Deploy to HF Space, run time-travel validation tests

---

## Session 7: HuggingFace Space Deployment (Nov 14, 2025)

### Objectives
1. Extend dataset to Oct 14, 2025 for multivariate forecasting
2. Create production-ready Jupyter notebooks for HF Space
3. Deploy to HuggingFace Space with Docker and GPU support
4. Enable zero-shot inference testing on A10G GPU

### 1. Dataset Extension (Oct 2025 Data Processing)

**Problem**: The dataset ended Sept 30, 2025, but dynamic forecasting with `run_date=Sept 30, 23:00` requires future covariates for Oct 1-14 (336 hours) to produce a 14-day forecast.

**Solution**: Process October raw data and extend the unified dataset.
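
The 336-hour requirement above is plain calendar arithmetic. The sketch below works it out with the standard library only. It assumes hourly resolution and a forecast that starts one hour after the run date. The helper name `forecast_window` is hypothetical and is not part of the project code.

```python
from datetime import datetime, timedelta


def forecast_window(run_date: datetime, horizon_days: int = 14):
    """Return (first_hour, last_hour, n_hours) for an hourly forecast.

    Assumption: the first forecast hour is the hour right after run_date.
    """
    n_hours = horizon_days * 24
    first_hour = run_date + timedelta(hours=1)
    last_hour = run_date + timedelta(hours=n_hours)
    return first_hour, last_hour, n_hours


# Run date from the log: Sept 30, 2025 23:00
run_date = datetime(2025, 9, 30, 23, 0)
first_hour, last_hour, n_hours = forecast_window(run_date)
print(first_hour, last_hour, n_hours)
```

Under these assumptions the window runs from Oct 1 00:00 to Oct 14 23:00, which is why the dataset needs 336 additional hourly rows of covariates.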

#### Scripts Created
- **`process_october_features.py`** (341 lines)
  - Processes weather and ENTSO-E raw data for Oct 1-14
  - Applies existing feature engineering modules
  - Output: 336 rows × 840 features (weather + ENTSO-E)

- **`extend_dataset.py`** (195 lines)
  - Merges October features with 24-month baseline
  - Handles dtype mismatches (176 columns fixed: Float64 → Int64)
  - Adds missing JAO features (1,736 columns) filled with nulls
  - Output: 17,880 rows × 2,553 features (Oct 2023 - Oct 14, 2025)

- **`upload_to_hf.py`** (146 lines)
  - Uploads extended dataset to HuggingFace
  - Replaces 24-month dataset with 24.5-month version
  - Dataset: `evgueni-p/fbmc-features-24month`

**Results**:
- Dataset extended: 17,544 → 17,880 rows (+336 hours)
- Date range: Oct 1, 2023 - Oct 14, 2025 (24.5 months)
- Upload verified: 17,880 rows × 2,553 columns on HuggingFace
- No data leakage: October data is used only as future covariates

### 2. Production Jupyter Notebooks

Created 3 notebooks for HuggingFace Space:

#### **`inference_smoke_test.ipynb`**
- **Purpose**: Quick validation (1 border × 7 days, ~1 min)
- **Configuration**:
  - Run date: Sept 30, 2025 23:00
  - Forecast: Oct 1-7 (168 hours)
  - Context: 512 hours
  - Single border test
- **Features**:
  - Environment setup with GPU detection
  - Dataset loading from HuggingFace
  - Dynamic forecast system integration
  - Chronos-2 model loading on GPU
  - Zero-shot inference with visualization

#### **`inference_full_14day.ipynb`**
- **Purpose**: Production run (38 borders × 14 days, ~5 min)
- **Configuration**:
  - Run date: Sept 30, 2025 23:00
  - Forecast: Oct 1-14 (336 hours)
  - Context: 512 hours
  - All 38 borders
- **Features**:
  - Batch processing with progress tracking
  - Per-border inference timing
  - Forecast export to parquet
  - Sample visualizations (4 borders)
  - Performance summary statistics
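
The batch loop with progress tracking and per-border timing could look roughly like the sketch below. It is illustrative only: `run_batch`, `dummy_predict` and the border names are hypothetical, and the stub stands in for the real Chronos-2 call, which the log does not show.

```python
import time
from typing import Callable, Dict, List, Sequence, Tuple


def run_batch(
    borders: Sequence[str],
    predict: Callable[[str], List[float]],
) -> Tuple[Dict[str, List[float]], Dict[str, float]]:
    """Run `predict` once per border and record wall-clock seconds for each."""
    forecasts: Dict[str, List[float]] = {}
    timings: Dict[str, float] = {}
    for i, border in enumerate(borders, start=1):
        start = time.perf_counter()
        forecasts[border] = predict(border)
        timings[border] = time.perf_counter() - start
        # Simple progress line; a notebook might use a progress bar instead
        print(f"[{i}/{len(borders)}] {border}: {timings[border]:.3f}s")
    return forecasts, timings


def dummy_predict(border: str) -> List[float]:
    """Placeholder model: returns a flat 336-hour forecast (hypothetical)."""
    return [0.0] * 336


# Hypothetical border identifiers, for illustration only
forecasts, timings = run_batch(["AT_DE", "DE_NL"], dummy_predict)
total_seconds = sum(timings.values())
```

Summing `timings` gives the figure to compare against the ~5 minute target for all 38 borders.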

#### **`evaluation.ipynb`**
- **Purpose**: Performance analysis vs Oct 1-14 actuals
- **Metrics**:
  - D+1 MAE (first 24 hours) - Target: <150 MW
  - 14-day MAE (full horizon)
  - RMSE, MAPE across all borders
  - Best/worst border identification
- **Outputs**:
  - Performance distribution histogram
  - Forecast vs actual comparison charts
  - CSV export of results
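
For reference, the metrics listed above can be written in a few lines of plain Python. This is a sketch of the standard definitions, not the notebook's actual code. The function names and the toy series are made up for illustration, and skipping zero actuals in MAPE is an assumption.

```python
import math
from typing import Sequence


def mae(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute error, in the units of the series (MW here)."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error in percent.

    Assumption: hours with a zero actual are skipped to avoid division by zero.
    """
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    return 100.0 * sum(abs((a - f) / a) for a, f in pairs) / len(pairs)


def d1_mae(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """D+1 MAE: error over the first 24 forecast hours only."""
    return mae(actual[:24], forecast[:24])


# Toy 336-hour series (not real results): 100 MW off on day 1, 200 MW after
actual = [1000.0] * 336
forecast = [1100.0] * 24 + [1200.0] * 312

day1_error = d1_mae(actual, forecast)
full_error = mae(actual, forecast)
meets_target = day1_error < 150.0  # D+1 target from the log
```

On the toy data the D+1 error is 100 MW, which would meet the <150 MW target, while the 14-day MAE is higher because the later hours are further off.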

### 3. HuggingFace Space Configuration

#### **Dockerfile** (Docker SDK for GPU)
```dockerfile
# CUDA-enabled PyTorch base image
FROM pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime
WORKDIR /app
# Install dependencies first so this layer is cached across code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY src/ ./src/
COPY inference_smoke_test.ipynb .
COPY inference_full_14day.ipynb .
COPY evaluation.ipynb .
# JupyterLab listens on port 7860
EXPOSE 7860
# Start JupyterLab with token and password disabled
CMD ["jupyter", "lab", "--ip=0.0.0.0", "--port=7860", "--no-browser", \
     "--allow-root", "--NotebookApp.token=''", "--NotebookApp.password=''"]
```

#### **README.md** (Space Metadata)
- SDK: `docker` (changed from `jupyterlab` - not supported)
- Hardware: `a10g-small` (NVIDIA A10G, 24GB VRAM)
- License: MIT
- Features: 2,553 engineered features, 38 borders
- Model: Amazon Chronos-2 Large (710M params)

#### **requirements.txt** (GPU Dependencies)
- Core ML: torch>=2.0.0, transformers>=4.35.0, chronos-forecasting>=1.2.0
- Data: polars>=0.19.0, datasets>=2.14.0, pyarrow>=13.0.0
- Viz: altair>=5.0.0
- Jupyter: ipykernel, jupyter, jupyterlab

### 4. Deployment

**Git Operations**:
```bash
git add README.md requirements.txt Dockerfile inference_smoke_test.ipynb \
    inference_full_14day.ipynb evaluation.ipynb src/forecasting/
git commit -m "feat: add HF Space deployment with Docker and Jupyter notebooks"
git push origin master         # GitHub repo
git push hf-space master:main  # HuggingFace Space
```

**Results**:
- HuggingFace Space: https://huggingface.co/spaces/evgueni-p/fbmc-chronos2-forecast
- GitHub Repo: https://github.com/evgspacdmy/fbmc_chronos2
- Dataset: https://huggingface.co/datasets/evgueni-p/fbmc-features-24month

### Files Created
- `process_october_features.py` (341 lines)
- `extend_dataset.py` (195 lines)
- `upload_to_hf.py` (146 lines)
- `Dockerfile` (17 lines)
- `inference_smoke_test.ipynb` (16 cells)
- `inference_full_14day.ipynb` (8 cells)
- `evaluation.ipynb` (8 cells)

### Files Modified
- `README.md` - Changed SDK from jupyterlab to docker
- `requirements.txt` - Renamed from requirements_hf_space.txt

### Key Decisions
1. **Docker SDK**: Required for Jupyter deployment on HF Spaces (jupyterlab SDK not supported)
2. **No Gradio**: User confirmed Jupyter notebooks only (previous Gradio app archived)
3. **October extension**: Essential for multivariate forecasting with Sept 30 run date
4. **JAO features**: Filled with nulls for October (no API data available)
5. **Dataset naming**: Kept `fbmc-features-24month` (backwards compatible)

### Technical Challenges Resolved
1. **Datetime precision mismatch**: Resolved μs vs ns time-unit and timezone mismatches between Polars datetime columns
2. **Dtype mismatches**: Cast 176 Float64 columns to Int64 to match schema
3. **HF SDK error**: Changed from unsupported `jupyterlab` to `docker`
4. **Missing October JAO**: Filled 1,736 columns with nulls (expected behavior)
5. **Forward-fill**: Oct 14 ENTSO-E data missing, forward-filled from Oct 13
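
To make the forward-fill in item 5 concrete, here is a minimal pure-Python sketch of the technique. It assumes "forward-filled from Oct 13" means carrying the last observed value forward. The project scripts use Polars and may implement this differently. The helper name and sample values are hypothetical.

```python
from typing import List, Optional, Sequence


def forward_fill(values: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Replace each None with the most recent non-None value before it.

    Leading Nones stay None because there is nothing to carry forward.
    """
    filled: List[Optional[float]] = []
    last_seen: Optional[float] = None
    for value in values:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled


# Made-up hourly values: the tail is missing and gets the last known value
series = [410.0, 395.0, None, None]
filled_series = forward_fill(series)
print(filled_series)  # [410.0, 395.0, 395.0, 395.0]
```

Forward-filling keeps the covariate matrix complete for the final day, at the cost of treating stale values as current, which is worth remembering when interpreting errors on Oct 14.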

### Testing Status
- Dataset upload: ✅ Verified 17,880 rows on HuggingFace
- Git deployment: ✅ Pushed to both GitHub and HF Space
- Docker build: ⏳ Pending (Space is building)
- GPU inference: ⏳ Pending (awaiting Space startup)
- MAE validation: ⏳ Pending (requires running evaluation notebook)

### Next Steps
1. **Configure HF Space**: Set `HF_TOKEN` secret for private dataset access
2. **Run smoke test**: Run on A10G GPU, verify 1-border inference works
3. **Run full inference**: Run all 38 borders, validate 5-minute target
4. **Run evaluation**: Compare vs Oct 1-14 actuals, document MAE
5. **Update activity.md**: Final results and handover documentation
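
For step 1, Space secrets are exposed to the running container as environment variables, so a notebook can read `HF_TOKEN` as sketched below. The helper `get_hf_token` is hypothetical. How the token is then passed to the dataset loader depends on the library version in use and is not shown in the log.

```python
import os
from typing import Mapping, Optional


def get_hf_token(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the HF_TOKEN value with whitespace stripped, or None if unset or empty."""
    token = env.get("HF_TOKEN", "").strip()
    return token or None


token = get_hf_token()
if token is None:
    # Without a token, only public datasets would be reachable
    print("HF_TOKEN is not set; private dataset access will fail")
```

Accepting the environment as a parameter keeps the helper easy to check without touching real secrets.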

---

**Status**: [IN PROGRESS] HF Space deployed, awaiting build completion
**Timestamp**: 2025-11-14 12:30 UTC
**Next Session**: Configure Space secrets, test notebooks on GPU, evaluate MAE

---