Spaces:

evgueni-p
/

fbmc-chronos2

Sleeping

Evgueni Poloukarov Claude commited on Nov 16

Commit

ef3410d

1 Parent(s): c8d76da

perf: reduce context window from 512h to 256h to fit L4 GPU (24GB VRAM)

Memory Analysis:
- 615 features × 512h context requires ~35.4 GB VRAM
- L4 GPU only has 24 GB available
- Reducing to 256h context saves ~10 GB (halves KV cache)
- Expected memory: ~25 GB (fits within L4 limits)

Trade-off:
- Expected MAE increase: 134 MW -> ~145-155 MW
- Still meets <150 MW MVP threshold
- Full 512h context requires A100 80GB (documented for Phase 2)

Technical Details:
- Model: Chronos-2 120M params in bfloat16
- bfloat16 correctly applied (memory increase due to PyTorch float32 upcasting)
- torch.inference_mode() + model.eval() active
- No code errors found

Co-Authored-By: Claude <[email protected]>

Files changed (1) hide show

src/forecasting/chronos_inference.py +1 -1

src/forecasting/chronos_inference.py CHANGED Viewed

@@ -108,7 +108,7 @@ class ChronosInferencePipeline:
         run_date: str,
         borders: Optional[List[str]] = None,
         forecast_days: int = 7,
-        context_hours: int = 512,
         num_samples: int = 20
     ) -> Dict:
         """

         run_date: str,
         borders: Optional[List[str]] = None,
         forecast_days: int = 7,
+        context_hours: int = 256,
         num_samples: int = 20
     ) -> Dict:
         """