{ "cells": [ { "cell_type": "markdown", "id": "0", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "# Hydrological modelling - Raven (distributed)\n", "\n", "`xHydro` provides a collection of functions designed to facilitate hydrological modelling, focusing on two key models: [HYDROTEL](https://github.com/INRS-Modelisation-hydrologique/hydrotel) and a suite of models emulated by the [Raven Hydrological Framework](https://raven.uwaterloo.ca/). It is important to note that Raven already possesses an extensive Python library, [RavenPy](https://github.com/CSHS-CWRA/RavenPy), which enables users to build, calibrate, and execute models. `xHydro` wraps some of these functions to support multi-model assessments with HYDROTEL, though users seeking advanced functionalities may prefer to use `RavenPy` directly. \n", "\n", "The primary contribution of `xHydro` to hydrological modelling is thus its support for HYDROTEL, a model that previously lacked a dedicated Python library. This Notebook covers `RavenPy` models, but a similar notebook for `HYDROTEL` is available [here](../hydrological_modelling_hydrotel.ipynb).\n", "\n", "## Basic information" ] }, { "cell_type": "code", "execution_count": null, "id": "1", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "from IPython.display import clear_output\n", "\n", "import xhydro as xh\n", "import xhydro.modelling as xhm\n", "\n", "clear_output(wait=False)" ] }, { "cell_type": "code", "execution_count": null, "id": "2", "metadata": { "editable": true, "nbsphinx": "hidden", "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "import logging\n", "\n", "logger = logging.getLogger()\n", "logger.setLevel(logging.CRITICAL)" ] }, { "cell_type": "markdown", "id": "3", "metadata": {}, "source": [ "The `xHydro` modelling framework is based on a `model_config` dictionary, which is meant to contain all necessary information to execute a given hydrological model. For example, depending on the model, it can store meteorological datasets directly, paths to datasets (netCDF files or other), csv configuration files, parameters, and basically anything that is required to configure and execute an hydrological model.\n", "\n", "The list of required inputs for the dictionary can be obtained one of two ways. The first is to look at the hydrological model's class, such as `xhydro.modelling.RavenpyModel`. The second is to use the `xh.modelling.get_hydrological_model_inputs` function to get a list of the required keys for a given model, as well as the documentation." ] }, { "cell_type": "code", "execution_count": null, "id": "4", "metadata": {}, "outputs": [], "source": [ "help(xhm.get_hydrological_model_inputs)" ] }, { "cell_type": "code", "execution_count": null, "id": "5", "metadata": {}, "outputs": [], "source": [ "# This function can be called to get a list of the keys for a given model, as well as its documentation.\n", "inputs, docs = xhm.get_hydrological_model_inputs(\"HBVEC\", required_only=False)\n", "inputs" ] }, { "cell_type": "code", "execution_count": null, "id": "6", "metadata": {}, "outputs": [], "source": [ "print(docs)" ] }, { "cell_type": "markdown", "id": "7", "metadata": {}, "source": [ "HYDROTEL and Raven vary in terms of required inputs and available functions, but an effort will be made to standardize the outputs as much as possible. Currently, all models include the following three functions:\n", "\n", "- `.run()`: Executes the model, reformats the outputs to be compatible with analysis tools in `xHydro`, and returns the simulated streamflow as a `xarray.Dataset`.\n", " - The streamflow variable will be named `q` and will have units of `m3 s-1`.\n", " - For 1D data (such as hydrometric stations), the corresponding dimension in the dataset will be identified by the `cf_role: timeseries_id` attribute.\n", " \n", "- `.get_inputs()`: Retrieves the meteorological inputs used by the model.\n", "\n", "- `.get_outputs()`: Retrieves the simulated outputs from the model.\n", " - Use `.get_outputs(\"q\")` to obtain the simulated streamflow as a `xarray.Dataset`.\n", "\n", "- `.standardize_outputs()`: Standardizes the outputs to ensure consistency across different models, facilitating comparison and analysis. This function is used by default in the `.run()` method, but can also be called separately if needed." ] }, { "cell_type": "markdown", "id": "8", "metadata": {}, "source": [ "## Initializing and running a calibrated model\n", "Raven requires several `.rv*` files to control various aspects such as meteorological inputs, watershed characteristics, and more. If the project directory already exists and contains data, `xHydro` will prepare the model for execution without overwriting existing `.rv*` files—unless the `overwrite` argument is explicitly set to `True`. To force overwriting of these files, you can thus either:\n", "\n", "- Set `overwrite=True` in the `model_config` when instantiating the model\n", "- Use the `.create_rv(overwrite=True)` method on the instantiated model.\n", "\n", "This Notebook will focus on distributed RavenPy models. For lumped models, refer to the [Raven lumped modelling notebook](../hydrological_modelling_raven.ipynb).\n", "\n", "### Formatting HRU Data for distributed models\n", "\n", "Raven relies on Hydrological Response Units (HRUs) for its hydrological simulations, which are typically generated using the [BasinMaker](https://hydrology.uwaterloo.ca/basinmaker/) tool, and specifically its `.Generate_HRUs` function. Thus, `xHydro`'s distributed modelling is currently based on the BasinMaker HRU structure and nomenclature, and uses the `BasinMakerExtractor` class of `RavenPy` to extract the necessary HRU attributes from a shapefile.\n", "\n", "Additionally, while BasinMaker will produce attributes such as `Landuse_ID`, these will not be passed on to the `RavenPy` model. Instead, the HRU should contain relevant land use attributes that can be directly mapped to the hydrological model's arguments. For example, for the HBV-EC model, which is currently the only distributed model available in `Raven`, the following attributes are used instead: `LAND_USE_C`, `VEG_C`, and `SOIL_PROF`, which represent land use, vegetation, and soil profile respectively.\n", "\n", "
INFO\n", "\n", "By default, HBV-EC as defined in `RavenPy` only understands a unique `LAND_USE_C` (`LU_ALL`), `VEG_C` (`VEG_ALL`), and `SOIL_PROF` (`DEFAULT_P`). If you want to use different classes, you will need to modify the `model_config` dictionary to include the relevant keys. There is currently no good documentation on how to do this, but you can refer to the class definition of the `HBVEC` model in `ravenpy.config.emulators.hbvec.py`.\n", "\n", "As an example, new vegetation classes can be added by modifying the `VegetationClasses` and `VegetationParameterList` keys with new entries detailing all the vegetation classes and their parameters. The same applies to land use and soil profile classes.\n", "\n", "
" ] }, { "cell_type": "code", "execution_count": null, "id": "9", "metadata": {}, "outputs": [], "source": [ "from pathlib import Path\n", "\n", "import geopandas as gpd\n", "import matplotlib.pyplot as plt\n", "import pooch\n", "\n", "from xhydro.testing.helpers import ( # In-house function to get data from the xhydro-testdata repo\n", " deveraux,\n", ")\n", "\n", "df = gpd.read_file(\n", " Path(\n", " deveraux().fetch(\n", " \"ravenpy/hru_subset.zip\",\n", " pooch.Unzip(),\n", " )[0]\n", " ).parents[0]\n", ")\n", "\n", "# Plot the subbasins and land use\n", "f, ax = plt.subplots(1, 2, figsize=(10, 10))\n", "df.plot(column=\"SubId\", ax=ax[0])\n", "ax[0].set_title(\"Subbasins\")\n", "df.plot(\n", " column=\"LAND_USE_C\",\n", " ax=ax[1],\n", " legend=True,\n", " legend_kwds={\"bbox_to_anchor\": (1.05, 1), \"loc\": \"upper left\"},\n", ")\n", "ax[1].set_title(\"Land Use\")\n", "plt.tight_layout()" ] }, { "cell_type": "code", "execution_count": null, "id": "10", "metadata": {}, "outputs": [], "source": [ "# To keep this example simple, and until cleaner methods are incorporated in xHydro, we will revert to the default HBVEC model configuration.\n", "# This is not recommended for real applications, as you will likely want to modify the model configuration to suit your needs.\n", "df.loc[:, \"LAND_USE_C\"] = \"LU_ALL\"\n", "df.loc[:, \"VEG_C\"] = \"VEG_ALL\"\n", "df.loc[:, \"SOIL_PROF\"] = \"DEFAULT_P\"" ] }, { "cell_type": "markdown", "id": "11", "metadata": {}, "source": [ "### Formatting Meteorological Data\n", "\n", "
INFO\n", "\n", "If using multiple meteorological stations, it is recommended to add the `Interpolation` argument to `model_config` or the `RavenpyModel` call to control the interpolation algorithm. Raven uses the nearest neighbour method by default, but other options are available:\n", "\n", "- `INTERP_NEAREST_NEIGHBOR` (default) — Nearest neighbor (Voronoi) method \n", "- `INTERP_INVERSE_DISTANCE` — Inverse distance weighting \n", "- `INTERP_INVERSE_DISTANCE_ELEVATION` — Inverse distance weighting with consideration of elevation \n", "- `INTERP_AVERAGE_ALL` — Averages all specified gauge readings \n", "- `INTERP_FROM_FILE [filename]` — Weights for each gauge at each HRU are specified in an external file. This method should work via `xHydro`, but it has not been fully tested.\n", "\n", "
\n", "\n", "
INFO\n", "\n", "When using gridded meteorological data, `xHydro` uses functions from `RavenPy` to compute weights for each grid cell based on the HRU's geometry. \n", "Ensure that the domain of the grid completely covers the watershed.\n", "\n", "
\n", "\n", "The acquisition of raw meteorological data is covered in the [GIS notebook](../gis.ipynb) and [Use Case Example](../use_case.ipynb) notebooks. Therefore, this notebook will use a test dataset." ] }, { "cell_type": "code", "execution_count": null, "id": "12", "metadata": {}, "outputs": [], "source": [ "import xarray as xr\n", "\n", "ds = xr.open_zarr(\n", " Path(\n", " deveraux().fetch(\n", " \"pmp/CMIP.CCCma.CanESM5.historical.r1i1p1f1.day.gn.zarr.zip\",\n", " pooch.Unzip(),\n", " )[0]\n", " ).parents[0]\n", ")\n", "ds_fx = xr.open_zarr(\n", " Path(\n", " deveraux().fetch(\n", " \"pmp/CMIP.CCCma.CanESM5.historical.r1i1p1f1.fx.gn.zarr.zip\",\n", " pooch.Unzip(),\n", " )[0]\n", " ).parents[0]\n", ")\n", "\n", "ds[\"orog\"] = ds_fx[\"orog\"]\n", "ds = ds.drop_vars([\"height\"])\n", "ds[\"pr\"].attrs = {\"units\": \"mm\", \"long_name\": \"precipitation\"}\n", "ds = ds[[\"pr\", \"tas\", \"orog\"]]\n", "ds" ] }, { "cell_type": "markdown", "id": "13", "metadata": {}, "source": [ "Every hydrological model has different requirements when it comes to their input data. In this example, the data above has multiple issues that would be not compatible with the requirements for Raven. For reference on default units expected by Raven, consult [this link](https://ravenpy.readthedocs.io/en/latest/_modules/ravenpy/config/defaults.html#).\n", "\n", "The function `xh.modelling.format_input` can be used to reformat CF-compliant datasets for use in hydrological models." ] }, { "cell_type": "code", "execution_count": null, "id": "14", "metadata": {}, "outputs": [], "source": [ "help(xh.modelling.format_input)" ] }, { "cell_type": "code", "execution_count": null, "id": "15", "metadata": {}, "outputs": [], "source": [ "from pathlib import Path\n", "import tempfile\n", "notebook_folder = Path(tempfile.TemporaryDirectory().name)\n", "\n", "# You can also use the 'save_as' argument to save the new file(s) in your project folder.\n", "ds_reformatted, config = xh.modelling.format_input(\n", " ds,\n", " \"HBVEC\",\n", " save_as=notebook_folder / \"_data\" / \"meteo_hmr_distributed.nc\",\n", ")\n", "ds_reformatted" ] }, { "cell_type": "markdown", "id": "16", "metadata": {}, "source": [ "While RavenPy does not require a configuration file to accompany the meteorological file, many information must be given to `model_config` to properly instantiate the model. The second output of `format_input` will return the \"meteo_file\", \"data_type\", \"alt_names_meteo\", and \"meteo_station_properties\" entries based on the provided file.\n" ] }, { "cell_type": "code", "execution_count": null, "id": "17", "metadata": {}, "outputs": [], "source": [ "config" ] }, { "cell_type": "markdown", "id": "18", "metadata": {}, "source": [ "### Initializing the Model\n", "\n", "The model can now be initialized using the information acquired so far. \n", "Additional entries can be provided to the `model_config` dictionary, as long as they are supported by the emulated Raven model. In particular:\n", "\n", "- The `output_subbasins` key can be used to specify which subbasins to output.\n", "- The `global_parameter` key must have a value for `AVG_ANNUAL_RUNOFF`, which is the average annual runoff in mm/year (with a range of 0-1000 according the Raven's documentation). This value is required for distributed models.\n", "\n", "Refer to the [Raven documentation](https://raven.uwaterloo.ca/Downloads.html) for the most up-to-date information. \n", "Model templates are currently listed in Appendix F, while the available options are described in various chapters.\n" ] }, { "cell_type": "code", "execution_count": null, "id": "19", "metadata": {}, "outputs": [], "source": [ "# The HBVEC model has 21 parameters\n", "parameters = [\n", " -0.15,\n", " 3.5,\n", " 3.0,\n", " 0.07,\n", " 0.4,\n", " 0.8,\n", " 1,\n", " 4.0,\n", " 0.5,\n", " 0.1,\n", " 1,\n", " 5.0,\n", " 4.8,\n", " 0.1,\n", " 1.0,\n", " 22.0,\n", " 0.5,\n", " 0.1,\n", " 0.0,\n", " 1.0,\n", " 1.0,\n", "]\n", "\n", "model_config = {\n", " \"model_name\": \"HBVEC\",\n", " \"workdir\": notebook_folder / \"hbvec_distributed_simulation\",\n", " \"parameters\": parameters,\n", " \"global_parameter\": {\n", " \"AVG_ANNUAL_RUNOFF\": 597\n", " }, # Distributed models require an average annual runoff value at each HRU\n", " \"hru\": df,\n", " \"output_subbasins\": \"all\", # Use \"all\" to output all subbasins\n", " \"start_date\": \"2010-01-02\",\n", " \"end_date\": \"2010-12-31\",\n", " \"Evaporation\": \"PET_HARGREAVES\",\n", " **config,\n", "}" ] }, { "cell_type": "markdown", "id": "20", "metadata": {}, "source": [ "With `model_config` on hand, an instance of the hydrological model can be initialized using `xhydro.modelling.hydrological_model` or the `xhydro.modelling.RavenpyModel` class directly." ] }, { "cell_type": "code", "execution_count": null, "id": "21", "metadata": {}, "outputs": [], "source": [ "hm = xhm.hydrological_model(model_config)\n", "hm" ] }, { "cell_type": "markdown", "id": "22", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "### Validating the Meteorological Data\n", "\n", "Before executing hydrological models, a few basic checks will be performed automatically. However, users may want to conduct more advanced health checks on the meteorological inputs (e.g., identifying unrealistic values). This can be done using `xhydro.utils.health_checks`. For the full list of available checks, refer to [the 'xscen' documentation](https://xscen.readthedocs.io/en/latest/notebooks/3_diagnostics.html#Health-checks).\n", "\n", "We can use `.get_inputs()` to automatically retrieve the meteorological data. In this example, we'll ensure there are no abnormal meteorological values or sequences of values.\n" ] }, { "cell_type": "code", "execution_count": null, "id": "23", "metadata": {}, "outputs": [], "source": [ "health_checks = {\n", " \"raise_on\": [], # If an entry is not here, it will warn the user instead of raising an exception.\n", " \"flags\": {\n", " \"pr\": { # You can have specific flags per variable.\n", " \"negative_accumulation_values\": {},\n", " \"very_large_precipitation_events\": {},\n", " \"outside_n_standard_deviations_of_climatology\": {\"n\": 5},\n", " \"values_repeating_for_n_or_more_days\": {\"n\": 5},\n", " },\n", " \"tas\": {\n", " \"temperature_extremely_low\": {},\n", " \"temperature_extremely_high\": {},\n", " \"outside_n_standard_deviations_of_climatology\": {\"n\": 5},\n", " \"values_repeating_for_n_or_more_days\": {\"n\": 5},\n", " },\n", " },\n", "}" ] }, { "cell_type": "code", "execution_count": null, "id": "24", "metadata": {}, "outputs": [], "source": [ "from xclim.core.units import amount2rate\n", "\n", "with hm.get_inputs() as ds_in:\n", " ds_in[\"pr\"] = amount2rate(ds_in[\"pr\"]) # Precipitation in xclim needs to be a flux.\n", "\n", " xh.utils.health_checks(ds_in, **health_checks)" ] }, { "cell_type": "markdown", "id": "25", "metadata": {}, "source": [ "### Executing the Model\n", "\n", "A few basic checks are performed when the `.run()` function is called, before executing the model itself. However, since both RavenPy and Raven will perform a series of checkups themselves, they are kept at a minimum in `xHydro`. If required, a `RavenpyModel.executable` class attribute can be used to point to your own Raven executable instead of the one provided by the `raven-hydro` library in the active Python environment.\n", "\n", "Once the model is executed, `xHydro` will automatically reformat the NetCDF file to bring it closer to CF conventions, ensuring compatibility with other `xHydro` modules. Note that, at this time, only the streamflow variable is reformatted, as the modularity of Raven allows for a wide variety of outputs, and it is not yet clear how to standardize all of them. However, dimensions and coordinates will be standardized across all files." ] }, { "cell_type": "code", "execution_count": null, "id": "26", "metadata": {}, "outputs": [], "source": [ "ds_out = hm.run()\n", "ds_out" ] }, { "cell_type": "code", "execution_count": null, "id": "27", "metadata": {}, "outputs": [], "source": [ "ds_out.isel(subbasin_id=0)[\"q\"].plot()" ] }, { "cell_type": "markdown", "id": "28", "metadata": {}, "source": [ "## Updating the rv* files\n", "\n", "Currently, `RavenPy` provides no straightforward way to open and modify the Raven `.rv*` files. For instance, changing simulation dates or meteorological data directly through the files is not yet supported. Until this feature is added, some basic functions have been integrated into `xHydro`, but should be used with care.\n", "\n", "The basic information, such as `start_date`, `end_date`, and `parameters`, are stored directly in the `RavenpyModel` class and can be manually updated. Similarly, if additional arguments had been given to the model during initialization, they are stored within a dictionary under `RavenpyModel.kwargs`, which can be accessed and modified as needed.\n", "\n", "The observed streamflow, HRU characteristics and meteorological data are stored under the `.qobs`, `.hru` and `.meteo` attributes respectively, but can be much trickier to update, since the associated `RavenPy` commands must be reconstructed again. Therefore, it is strongly recommended to use the `.update_data` method to update these. This function calls upon a subset of the same arguments used when initializing a Raven model:\n" ] }, { "cell_type": "code", "execution_count": null, "id": "29", "metadata": {}, "outputs": [], "source": [ "help(hm.update_data)" ] }, { "cell_type": "markdown", "id": "30", "metadata": {}, "source": [ "That function will only update the `RavenpyModel` class itself, not the files. If possible, it is strongly recommended to use the `create_rv` function to overwrite the existing `.rv*` files with the updated information.\n", "\n", "If this is not possible, some aspects of the model can still be updated using the `.update_config` method:" ] }, { "cell_type": "code", "execution_count": null, "id": "31", "metadata": {}, "outputs": [], "source": [ "help(hm.update_config)" ] }, { "cell_type": "markdown", "id": "32", "metadata": {}, "source": [ "Be very aware that not all updates will be reflected in the `.rv*` files. The last two options especially should be used with caution, as HRU characteristics, such as the subbasin IDs, will *not* be updated. If the HRU within the model has changed, there is currently no way to modify existing files. They should be deleted and recreated using the `.create_rv()` method." ] }, { "cell_type": "markdown", "id": "33", "metadata": {}, "source": [ "## Retrieving additional outputs\n", "\n", "By default, Raven produces multiple output files in addition to the streamflow file, which contain various state variables. However, a major limitation is that these files only cover the watershed as a whole, and not the individual HRUs or river segments. The `:CustomOutput` command can be used to specify additional outputs and their spatial resolution. The easiest way to generate these is to use the `update_config` method.\n", "\n", "Information on how to use the `:CustomOutput` command can be found in the [Raven documentation](https://raven.uwaterloo.ca/Downloads.html)." ] }, { "cell_type": "code", "execution_count": null, "id": "34", "metadata": {}, "outputs": [], "source": [ "hm.update_config(\n", " rvi_commands=[\":CustomOutput DAILY CUMULSUM SNOW BY_HRU\"]\n", ")\n", "hm.run(overwrite=True, return_streamflow=False)" ] }, { "cell_type": "markdown", "id": "35", "metadata": {}, "source": [ "The `.get_outputs()` function can be used to retrieve any of these variables as a `xarray.Dataset`." ] }, { "cell_type": "code", "execution_count": null, "id": "36", "metadata": {}, "outputs": [], "source": [ "help(hm.get_outputs)" ] }, { "cell_type": "code", "execution_count": null, "id": "37", "metadata": {}, "outputs": [], "source": [ "files = hm.get_outputs(\"*\", return_paths=True)\n", "files" ] }, { "cell_type": "code", "execution_count": null, "id": "38", "metadata": {}, "outputs": [], "source": [ "snow = hm.get_outputs(\"SNOW\")\n", "snow" ] }, { "cell_type": "code", "execution_count": null, "id": "39", "metadata": {}, "outputs": [], "source": [ "snow[\"CumulSum_SNOW\"].isel(unit_id=0).plot()" ] }, { "cell_type": "markdown", "id": "40", "metadata": {}, "source": [ "A few important notes regarding the use of `:CustomOutput` in `xHydro`:\n", "- Due to the very high modularity of these outputs, there is currently no standardization of the variable names or their units.\n", "- Furthermore, some of these variables are cumulative, while others are not, and there is currently no way to automatically identify them. It is thus the user's responsibility to ensure that the variables are properly interpreted.\n", "- In an effort to standardize the outputs across different models, the following aggregation levels have been defined. These are noted in a `aggregation_level` attribute in the variable's metadata, and can be used to identify the spatial resolution of the output:\n", " - `ComputationalUnit`: In Raven, this corresponds to the Hydrological Response Units (HRUs).\n", " - `Subbasin`: Following the Raven convention, this corresponds to the immediate drainage area of a river segment, excluding the upstream drainage area.\n", " - `DrainageArea`: This corresponds to the cumulative drainage area of a river segment.\n", "\n", "The `CustomOutput` command allows a control on the spatial resolution of the output (`BY_HRU`, `BY_SUBBASIN`, `BY_DRAINAGE_AREA`), but a `aggregate_outputs` function has additionally been implemented in `xHydro` to allow the aggregation of outputs in post-processing, if needed. Note that this function relies on multiple watershed properties which are deduced from the provided HRU shapefile and will fail if the HRU characteristics do not follow the BasinMaker structure and nomenclature." ] }, { "cell_type": "code", "execution_count": null, "id": "41", "metadata": {}, "outputs": [], "source": [ "help(hm.aggregate_outputs)" ] }, { "cell_type": "code", "execution_count": null, "id": "42", "metadata": {}, "outputs": [], "source": [ "hm.aggregate_outputs(by=\"hru\", to=\"drainage_area\")" ] }, { "cell_type": "code", "execution_count": null, "id": "43", "metadata": {}, "outputs": [], "source": [ "snow_agg = hm.get_outputs(\"SNOW*Drainage\")\n", "snow_agg" ] }, { "cell_type": "markdown", "id": "44", "metadata": {}, "source": [ "## Model Calibration\n", "\n", "Calibrating distributed models is not yet supported by `xHydro` if multiple hydrometric stations are used. Users are encouraged to use the `RavenPy` library directly for this purpose. For single-station calibration, `xHydro` can be used. Refer to the [lumped RavenPy documentation](../hydrological_modelling_raven.ipynb) for more details." ] } ], "metadata": { "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.14.3" } }, "nbformat": 4, "nbformat_minor": 5 }