Evgueni Poloukarov
Claude
committed on
Commit 3b607e3
Parent(s): 13db9d8
fix: add GPU cache clearing for multi-border forecasts
Prevents GPU memory accumulation across sequential forecasts by clearing the CUDA
cache after each border completes. This enables multi-border forecasting within
the 24 GB VRAM limit on an L4 GPU.
Technical details:
- Add torch.cuda.empty_cache() after each border forecast (line 241)
- Releases intermediate tensors without affecting model weights (710M params)
- Does NOT impact forecast accuracy (each border processes independently)
- Solves OOM errors in full_14day forecasts (38 borders)
Memory before: 17.71 GB allocated + 10.75 GB needed = OOM
Memory after: Cache cleared between borders, enabling sequential processing
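The pattern described above can be sketched as follows. This is a minimal illustration, not the actual pipeline code: the `forecast_border` helper and the `borders` argument are hypothetical stand-ins for the real Chronos inference call.

```python
import torch

def forecast_border(model, border):
    # Placeholder for the real per-border Chronos inference (hypothetical).
    return f"forecast:{border}"

def run_sequential_forecasts(model, borders):
    """Run forecasts one border at a time, clearing the CUDA cache
    between borders so intermediate tensors do not accumulate."""
    results = {}
    for border in borders:
        results[border] = forecast_border(model, border)
        # empty_cache() releases the allocator's cached blocks back to the
        # driver; tensors still referenced (e.g. model weights) are untouched.
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
    return results
```

Because each border's forecast is independent, clearing the cache between iterations changes peak memory usage but not the results.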
Co-Authored-By: Claude <[email protected]>
src/forecasting/chronos_inference.py
CHANGED
@@ -234,6 +234,12 @@ class ChronosInferencePipeline:
 
             print(f" [OK] Complete in {inference_time:.1f}s (WITH {len(future_data.columns)-2} covariates)", flush=True)
 
+            # Release GPU memory cache before processing next border
+            # This prevents memory accumulation across sequential forecasts
+            # Does NOT affect model weights (710M params stay loaded)
+            # Does NOT affect forecast accuracy (each border is independent)
+            torch.cuda.empty_cache()
+
         except Exception as e:
             import traceback
             error_msg = f"{type(e).__name__}: {str(e)}"