atodorov284 committed
Commit · 5dbbc0c
1 Parent(s): c285c60
Update README with all info from the development stage.
README.md
CHANGED
@@ -10,7 +10,17 @@ In the Netherlands, cities like Utrecht experience challenges concerning air qua

 ## How To Run This Code

-Currently, this repository

 The notebooks in this project were used as scratch for analysis and data merge and do not reflect our thorough methodology (source is under air-quality-forecast). Some extra scripts for the generation of our plots in the report can be found under extra_scripts.

@@ -49,6 +59,8 @@ The notebooks in this project were used as scratch for analysis and data merge a
 │
 ├── configs <- Configuration folder for the hyperparameter search space (for now)
 │
 ├── extra_scripts <- Some extra scripts in R and .tex to generate figures
 │
 ├── air-quality-forecast <- Source code for use in this project.

@@ -57,11 +69,13 @@ The notebooks in this project were used as scratch for analysis and data merge a
 │
 ├── data_pipeline.py <- Loads, extracts, and preprocesses the data. Final result is the train-test under data/processed
 │
-├── model_development.py <- Trains the three models using k-fold CV and Bayesian hyperparameter tuning
 │
 ├── utils.py <- Utility functions, e.g. validation
 │
-├── main.py <- To execute and start the project

 --------


 ## How To Run This Code

+Currently, this repository has completed the model development stage.
+
+To run the data pipeline, run `data_pipeline.py` under `air-quality-forecast`, the folder that contains the source code of this project. The processed and split datasets can be found under data/processed, namely x_train, x_val, x_test, y_train, y_val, y_test.
+
+To see the MLflow dashboard, which is used to track experiments, run `model_development.py`. It will automatically start a server on localhost port 5000. If this does not work, please run
+`mlflow ui --port 5000`
+in your console. You might need to give admin permissions to this process. The MLflow dashboard contains all information about the experiments run, including the hyperparameters selected for each model. The selected models can be found under the Models menu.
+
+To run the prediction, run `main.py`. It will display the MSE and RMSE of the train and test data for all three models.
+
+## DISCLAIMER

 The notebooks in this project were used as scratch for analysis and data merge and do not reflect our thorough methodology (source is under air-quality-forecast). Some extra scripts for the generation of our plots in the report can be found under extra_scripts.

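The added README text says that `main.py` displays the MSE and RMSE of the train and test data for each model. As a rough illustration of how those two metrics relate, here is a minimal sketch in plain Python. The function names `mse` and `rmse` and the sample values are hypothetical and are not taken from the repository, whose own implementation may differ (for example by using a library routine).

```python
import math
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error: the average of the squared residuals."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must not be empty")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


if __name__ == "__main__":
    # Hypothetical observed and predicted values, for illustration only.
    y_true = [10.0, 20.0, 30.0]
    y_pred = [12.0, 18.0, 33.0]
    print(f"MSE:  {mse(y_true, y_pred):.3f}")
    print(f"RMSE: {rmse(y_true, y_pred):.3f}")
```

Because RMSE is the square root of MSE, it is expressed in the same units as the target variable, which usually makes it the easier of the two to interpret.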
 │
 ├── configs <- Configuration folder for the hyperparameter search space (for now)
 │
+├── saved_models <- Folder with the saved models in `.pkl` and `.xgb`.
+│
 ├── extra_scripts <- Some extra scripts in R and .tex to generate figures
 │
 ├── air-quality-forecast <- Source code for use in this project.

 │
 ├── data_pipeline.py <- Loads, extracts, and preprocesses the data. Final result is the train-test under data/processed
 │
+├── model_development.py <- Trains the three models using k-fold CV and Bayesian hyperparameter tuning, displays the MLflow dashboard if executed
+│
+├── prediction.py <- Loads the models and makes an example prediction
 │
 ├── utils.py <- Utility functions, e.g. validation
 │
+├── main.py <- To execute and start the project. Currently used to make predictions.

 --------

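The tree entry for `model_development.py` mentions that the models are trained using k-fold CV. For readers unfamiliar with the idea, the following is a minimal sketch of how k-fold index splitting works, written in plain Python. The helper `k_fold_indices` is hypothetical and is not the repository's code, which may well rely on a library implementation instead; the Bayesian hyperparameter tuning is not shown.

```python
from typing import Iterator, List, Tuple


def k_fold_indices(n_samples: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, validation_indices) for each of k contiguous folds.

    Hypothetical helper for illustration: no shuffling is applied, and the
    first (n_samples % k) folds receive one extra sample each.
    """
    if k < 2 or k > n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        stop = start + size
        val_idx = list(range(start, stop))
        # Everything outside the validation window is used for training.
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, val_idx
        start = stop


if __name__ == "__main__":
    for train_idx, val_idx in k_fold_indices(n_samples=7, k=3):
        print("train:", train_idx, "validation:", val_idx)
```

Each sample lands in exactly one validation fold, so averaging a metric over the k folds uses every sample for validation once.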