{ "cells": [ { "cell_type": "markdown", "id": "0", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "# Hydrological modelling - Raven (lumped)\n", "\n", "`xHydro` provides a collection of functions designed to facilitate hydrological modelling, focusing on two key models: [HYDROTEL](https://github.com/INRS-Modelisation-hydrologique/hydrotel) and a suite of models emulated by the [Raven Hydrological Framework](https://raven.uwaterloo.ca/). It is important to note that Raven already possesses an extensive Python library, [RavenPy](https://github.com/CSHS-CWRA/RavenPy), which enables users to build, calibrate, and execute models. `xHydro` wraps some of these functions to support multi-model assessments with HYDROTEL, though users seeking advanced functionalities may prefer to use `RavenPy` directly. \n", "\n", "The primary contribution of `xHydro` to hydrological modelling is thus its support for HYDROTEL, a model that previously lacked a dedicated Python library. 
This Notebook covers `RavenPy` models, but a similar notebook for `HYDROTEL` is available [here](hydrological_modelling_hydrotel.ipynb).\n", "\n", "## Basic information" ] }, { "cell_type": "code", "execution_count": 1, "id": "1", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "from IPython.display import clear_output\n", "\n", "import xhydro as xh\n", "import xhydro.modelling as xhm\n", "\n", "clear_output(wait=False)" ] }, { "cell_type": "code", "execution_count": 2, "id": "2", "metadata": { "editable": true, "nbsphinx": "hidden", "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "import logging\n", "\n", "logger = logging.getLogger()\n", "logger.setLevel(logging.CRITICAL)" ] }, { "cell_type": "markdown", "id": "3", "metadata": {}, "source": [ "The `xHydro` modelling framework is based on a `model_config` dictionary, which is meant to contain all the information needed to execute a given hydrological model. Depending on the model, it can store meteorological datasets directly, paths to datasets (netCDF files or other), CSV configuration files, parameters, and anything else required to configure and execute a hydrological model.\n", "\n", "The list of required inputs for the dictionary can be obtained in one of two ways. The first is to look at the hydrological model's class, such as `xhydro.modelling.RavenpyModel`. The second is to use the `xh.modelling.get_hydrological_model_inputs` function, which returns the required keys for a given model along with its documentation."
] }, { "cell_type": "code", "execution_count": 3, "id": "4", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Help on function get_hydrological_model_inputs in module xhydro.modelling.hydrological_modelling:\n", "\n", "get_hydrological_model_inputs(model_name: str, required_only: bool = False) -> tuple[dict, str]\n", " Get the required inputs for a given hydrological model.\n", "\n", " Parameters\n", " ----------\n", " model_name : str\n", " The name of the hydrological model to use.\n", " Currently supported models are [\"HYDROTEL\", \"Blended\", \"GR4JCN\", \"HBVEC\", \"HMETS\", \"HYPR\", \"Mohyse\", \"SACSMA\"].\n", " required_only : bool\n", " If True, only the required inputs will be returned.\n", "\n", " Returns\n", " -------\n", " dict\n", " A dictionary containing the required configuration for the hydrological model.\n", " str\n", " The documentation for the hydrological model.\n", "\n" ] } ], "source": [ "help(xhm.get_hydrological_model_inputs)" ] }, { "cell_type": "code", "execution_count": 4, "id": "5", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'model_name': typing.Literal['Blended', 'GR4JCN', 'HBVEC', 'HMETS', 'HYPR', 'Mohyse', 'SACSMA'] | None,\n", " 'overwrite': bool,\n", " 'workdir': str | os.PathLike | None,\n", " 'executable': str | os.PathLike | None,\n", " 'run_name': str | None,\n", " 'start_date': datetime.datetime | str | None,\n", " 'end_date': datetime.datetime | str | None,\n", " 'parameters': numpy.ndarray | list[float] | None,\n", " 'qobs_file': os.PathLike | str | None,\n", " 'alt_name_flow': str | None,\n", " 'hru': geopandas.geodataframe.GeoDataFrame | dict | os.PathLike | str | None,\n", " 'output_subbasins': typing.Literal['all', 'qobs'] | list[int] | None,\n", " 'minimum_reservoir_area': str | None,\n", " 'meteo_file': os.PathLike | str | None,\n", " 'data_type': list[str] | None,\n", " 'alt_names_meteo': dict | None,\n", " 'meteo_station_properties': dict | None,\n", " 
'gridweights': str | os.PathLike | None}" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# This function can be called to get a list of the keys for a given model, as well as its documentation.\n", "inputs, docs = xhm.get_hydrological_model_inputs(\"GR4JCN\", required_only=False)\n", "inputs" ] }, { "cell_type": "code", "execution_count": 5, "id": "6", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Initialize the RavenPy model class.\n", "\n", "Parameters\n", "----------\n", "overwrite : bool\n", " If True, overwrite the existing project files. Default is False.\n", "workdir : str | Path | None\n", " Path to save the .rv files and model outputs. Default is None, which creates a temporary directory.\n", "executable : str | os.PathLike | None, optional\n", " Path to the Raven executable, bypassing RavenPy.\n", " If None (default), the Raven executable from your current Python environment ('raven-hydro') will be used.\n", "run_name : str, optional\n", " Name of the run, which will be used to name the project files. Defaults to \"raven\" if not provided.\n", "model_name : {\"Blended\", \"GR4JCN\", \"HBVEC\", \"HMETS\", \"HYPR\", \"Mohyse\", \"SACSMA\"}, optional\n", " The name of the RavenPy model to run. Only optional if the project files already exist.\n", "start_date : dt.datetime | str, optional\n", " The first date of the simulation. Only optional if the project files already exist.\n", "end_date : dt.datetime | str, optional\n", " The last date of the simulation. Only optional if the project files already exist.\n", "parameters : np.ndarray | list[float], optional\n", " The model parameters for simulation or calibration. 
Only optional if the project files already exist.\n", "qobs_file : str | Path, optional\n", " Path to the file containing the observed streamflow data.\n", " If there are multiple stations, the file should contain a 'basin_id' variable that identifies the subbasin for each time series.\n", " If a 'station_id' variable is present, it will be used to identify the station.\n", "alt_name_flow : str, optional\n", " Name of the streamflow variable in the observed data file. If not provided, it will be assumed to be \"q\".\n", "hru : gpd.GeoDataFrame | dict | os.PathLike, optional\n", " A GeoDataFrame, or dictionary containing the HRU properties. Only optional if the project files already exist.\n", " For distributed models, it should be readable by ravenpy.extractors.BasinMakerExtractor.\n", " For lumped models, should contain the following variables:\n", " - area: The watershed drainage area, in km².\n", " - elevation: The elevation of the watershed, in meters.\n", " - latitude: The latitude of the watershed centroid.\n", " - longitude: The longitude of the watershed centroid.\n", " - HRU_ID: The ID of the HRU (required for gridded data, optional for station data).\n", " If the meteorological data is gridded, the HRU dataset must also contain a SubId, DowSubId, valid geometry and crs.\n", " If the input is modified, a new shapefile will be created in the workdir/weights subdirectory.\n", "output_subbasins : {\"all\", \"qobs\"} | list[int] | None, optional\n", " If \"all\", all subbasins will be outputted. If \"qobs\", only the subbasins with observed flow will be outputted.\n", " Leave as None to use the value as defined in the HRU file ('Has_Gauge' column). Only applicable for distributed HBVEC models.\n", "minimum_reservoir_area : str, optional\n", " Quantified string (e.g. 
\"20 km2\") representing the minimum lake area to consider the lake explicitly as a reservoir.\n", " If not provided, all lakes with the 'HRU_IsLake' column set to 1 in the HRU file will be considered as reservoirs.\n", " Note that 'reservoirs' in Raven can also refer to natural lakes with weir-like outflows.\n", " Only applicable for distributed HBVEC models.\n", "meteo_file : str | Path, optional\n", " Path to the file containing the observed meteorological data. Only optional if the project files already exist.\n", " The meteorological data can be either station or gridded data. Use the 'xhydro.modelling.format_input' function to ensure the data\n", " is in the correct format. Unless the input is a single station accompanied by 'meteo_station_properties', the file should contain\n", " the following coordinates:\n", " - elevation: The elevation of the station / grid cell, in meters.\n", " - latitude: The latitude of the station / grid cell centroid.\n", " - longitude: The longitude of the station / grid cell centroid.\n", "data_type : list[str], optional\n", " The list of types of data provided to Raven in the meteorological file. Only optional if the project files already exist.\n", " See https://github.com/CSHS-CWRA/RavenPy/blob/master/src/ravenpy/config/conventions.py for the list of available types.\n", "alt_names_meteo : dict, optional\n", " A dictionary that allows users to link the names of meteorological variables in their dataset to Raven-compliant names.\n", " The keys should be the Raven names as listed in the data_type parameter.\n", "meteo_station_properties : dict, optional\n", " Additional properties of the weather stations providing the meteorological data. 
Only required if absent from the 'meteo_file'.\n", " For single stations, the format is {\"ALL\": {\"elevation\": elevation, \"latitude\": latitude, \"longitude\": longitude}}.\n", " This has not been tested for multiple stations or gridded data.\n", "gridweights : str | Path | None\n", " If using gridded meteorological data, path to a text file containing the weights linking the grid cells to the HRUs.\n", " If None, the weights will be computed using ravenpy.extractors.GridWeightExtractor and saved in a 'weights' subdirectory\n", " of the project folder, using a \"{meteo_file}_vs_{hru_file}_weights.txt\" pattern.\n", "\\*\\*kwargs : dict, optional\n", " Additional parameters to pass to the RavenPy emulator, to modify the default modules used by a given hydrological model.\n", " Typical entries include RainSnowFraction, Evaporation, GlobalParameters, etc.\n", " See https://raven.uwaterloo.ca/Downloads.html for the latest Raven documentation. Currently, model templates are listed in Appendix F.\n", "\n" ] } ], "source": [ "print(docs)" ] }, { "cell_type": "markdown", "id": "7", "metadata": {}, "source": [ "HYDROTEL and Raven vary in terms of required inputs and available functions, but an effort will be made to standardize the outputs as much as possible. 
Currently, all models include the following four functions:\n", "\n", "- `.run()`: Executes the model, reformats the outputs to be compatible with analysis tools in `xHydro`, and returns the simulated streamflow as an `xarray.Dataset`.\n", "    - The streamflow variable will be named `q` and will have units of `m3 s-1`.\n", "    - For 1D data (such as hydrometric stations), the corresponding dimension in the dataset will be identified by the `cf_role: timeseries_id` attribute.\n", " \n", "- `.get_inputs()`: Retrieves the meteorological inputs used by the model.\n", "\n", "- `.get_outputs()`: Retrieves the simulated outputs from the model.\n", "    - Use `.get_outputs(\"q\")` to obtain the simulated streamflow as an `xarray.Dataset`.\n", "\n", "- `.standardize_outputs()`: Standardizes the outputs to ensure consistency across different models, facilitating comparison and analysis. This function is used by default in the `.run()` method, but can also be called separately if needed." ] }, { "cell_type": "markdown", "id": "8", "metadata": {}, "source": [ "## Initializing and running a calibrated model\n", "Raven requires several `.rv*` files to control various aspects such as meteorological inputs, watershed characteristics, and more. If the project directory already exists and contains data, `xHydro` will prepare the model for execution without overwriting existing `.rv*` files, unless the `overwrite` argument is explicitly set to `True`. To force overwriting of these files, you can either:\n", "\n", "- Set `overwrite=True` in the `model_config` when instantiating the model.\n", "- Use the `.create_rv(overwrite=True)` method on the instantiated model.\n", "\n", "This Notebook will focus on lumped RavenPy models.
For distributed models, refer to the [Raven distributed modelling notebook](pavics_notebooks/hydrological_modelling_raven_distributed.ipynb).\n", "\n", "### Acquiring HRU Data\n", "\n", "Raven relies on Hydrological Response Units (HRUs) for its hydrological simulations. For lumped models, only one HRU can be used at a time.\n", "\n", "If using station-based meteorological data, the required HRU attributes are minimal:\n", "\n", "- `area`: Watershed drainage area (km²) \n", "- `elevation`: Watershed elevation (m) \n", "- `latitude`: Latitude of the watershed centroid \n", "- `longitude`: Longitude of the watershed centroid \n", "\n", "If using gridded meteorological data, additional attributes are required, but `xHydro` will use default values for those that are not provided (except for the geometry):\n", "\n", "- `HRU_ID`: Unique identifier for the HRU (set to `1` for lumped models) \n", "- `SubId`: Subbasin ID (set to `1` for lumped models) \n", "- `DowSubId`: Downstream Subbasin ID (set to `-1` for lumped models) \n", "- A valid geometry and coordinate reference system (`crs`) \n", "\n", "HRUs can be represented as either a `geopandas.GeoDataFrame` or a Python `dict`. 
To assist with HRU creation, you can use the `xhydro.gis.watershed_to_raven_hru` function, which will extract the necessary information from functions described in the [GIS notebook](gis.ipynb).\n" ] }, { "cell_type": "code", "execution_count": 6, "id": "9", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Help on function watershed_to_raven_hru in module xhydro.gis:\n", "\n", "watershed_to_raven_hru(\n", " watershed: gpd.GeoDataFrame | tuple | str | os.PathLike,\n", " *,\n", " unique_id: str | None = None,\n", " projected_crs: int | str | None = 'NAD83',\n", " **kwargs\n", ") -> gpd.GeoDataFrame\n", " Extract the necessary properties for Raven hydrological models.\n", "\n", " Parameters\n", " ----------\n", " watershed : gpd.GeoDataFrame | tuple | str | Path\n", " The input, which is either:\n", " - A gpd.GeoDataFrame containing watershed polygons with a defined .crs attribute.\n", " - The path to such a gpd.GeoDataFrame.\n", " - Coordinates (longitude, latitude) for the location from where watershed delineation will be conducted.\n", " unique_id : str, optional\n", " The column name in the GeoDataFrame that serves as a unique identifier.\n", " Ignored if the input is a coordinate tuple.\n", " projected_crs : int | str\n", " The projected coordinate reference system (crs) to utilize for calculations, such as determining watershed area.\n", " If a string is provided, it should be a valid Geodetic CRS for the `gpd.estimate_utm_crs()` method.\n", " If None, the function will use the `gpd.estimate_utm_crs()` default (WGS 84).\n", " Default is an estimated CRS based on NAD83.\n", " \\*\\*kwargs : dict\n", " Additional keyword arguments passed to the `surface_properties` function.\n", "\n", " Returns\n", " -------\n", " gpd.GeoDataFrame\n", " Output GeoDataFrame containing the watershed properties required for Raven hydrological models.\n", "\n", " Notes\n", " -----\n", " Gridded meteorological data in RavenPy requires the `SubId` and 
`DowSubId` columns to be set, but this cannot currently be\n", " automatically calculated. Therefore, the function sets `SubId` to 1 and `DowSubId` to -1 by default, which is\n", " correct for lumped hydrological models, but will not be appropriate for distributed models. Until this is fixed, only a\n", " single watershed can be delineated.\n", "\n", " Furthermore, still for gridded meteorological data, RavenPy requires a shapefile with a valid geometry. Until a method\n", " is implemented to convert the geometry to something valid in xarray, the function will only return GeoDataFrames.\n", "\n" ] } ], "source": [ "help(xh.gis.watershed_to_raven_hru)" ] }, { "cell_type": "code", "execution_count": 7, "id": "10", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/home/rondeau/projets/xhydro/src/xhydro/gis.py:201: UserWarning: Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.\n", "\n", "/home/rondeau/projets/xhydro/src/xhydro/gis.py:202: UserWarning: Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.\n", "\n" ] }, { "data": { "text/html": [ "
|   | HRU_ID | area | latitude | longitude | elevation | SubId | DowSubId | geometry |
|---|---|---|---|---|---|---|---|---|
| 0 | 7120384690 | 755.976896 | 45.948568 | -71.801471 | 275.822235 | 1 | -1 | POLYGON ((-71.60638 45.77973, -71.61029 45.782... |\n", "
<xarray.Dataset> Size: 132kB\n",
"Dimensions: (time: 6576)\n",
"Coordinates:\n",
" * time (time) datetime64[ns] 53kB 1981-12-31 1982-01-01 ... 2000-01-01\n",
" altitude int64 8B 450\n",
" lat int64 8B 46\n",
" lon int64 8B -72\n",
"Data variables:\n",
" tmin (time) float32 26kB ...\n",
" tmax (time) float32 26kB ...\n",
" pr (time) float32 26kB ...\n",
"Attributes: (12/31)\n",
" GRIB_NV: 0\n",
" GRIB_Nx: 1440\n",
" GRIB_Ny: 721\n",
" GRIB_cfName: unknown\n",
" GRIB_cfVarName: t2m\n",
" GRIB_dataType: an\n",
" ... ...\n",
" GRIB_typeOfLevel: surface\n",
" GRIB_units: degC\n",
" long_name: 2 metre temperature\n",
" standard_name: unknown\n",
" units: degC\n",
" grid_mapping: crs\n",
"<xarray.Dataset> Size: 132kB\n",
"Dimensions: (station_id: 1, time: 6576)\n",
"Coordinates:\n",
" * station_id (station_id) <U1 4B '0'\n",
" elevation (station_id) int64 8B 450\n",
" latitude (station_id) int64 8B 46\n",
" longitude (station_id) int64 8B -72\n",
" * time (time) datetime64[ns] 53kB 1981-12-31 1982-01-01 ... 2000-01-01\n",
"Data variables:\n",
" tasmin (station_id, time) float32 26kB -14.84 -6.52 ... -26.85 -15.48\n",
" tasmax (station_id, time) float32 26kB -5.316 -0.0699 ... -14.92 -15.48\n",
" pr (station_id, time) float32 26kB 0.3767 9.103 ... 0.07919 0.01176\n",
"Attributes: (12/31)\n",
" GRIB_NV: 0\n",
" GRIB_Nx: 1440\n",
" GRIB_Ny: 721\n",
" GRIB_cfName: unknown\n",
" GRIB_cfVarName: t2m\n",
" GRIB_dataType: an\n",
" ... ...\n",
" GRIB_typeOfLevel: surface\n",
" GRIB_units: degC\n",
" long_name: 2 metre temperature\n",
" standard_name: unknown\n",
" units: degC\n",
" grid_mapping: crs\n",
"<xarray.Dataset> Size: 12kB\n",
"Dimensions: (time: 730)\n",
"Coordinates:\n",
" * time (time) datetime64[ns] 6kB 1990-01-01 ... 1991-12-31\n",
" subbasin_id <U1 4B ...\n",
" elevation float32 4B ...\n",
" drainage_area float64 8B ...\n",
" centroid_longitude float64 8B ...\n",
" centroid_latitude float64 8B ...\n",
"Data variables:\n",
" q (time) float64 6kB ...\n",
"Attributes:\n",
" Conventions: CF-1.6\n",
" featureType: timeSeries\n",
" history: Created on 2026-04-01T12:54:24 by Raven 4.1\n",
" description: Standard Output\n",
" references: Craig J.R. and the Raven Development Team Raven user's ...\n",
" model_id: GR4JCN\n",
" Raven_version: 4.1\n",
" RavenPy_version: 0.20.0\n",
"<xarray.Dataset> Size: 117kB\n",
"Dimensions: (time: 730)\n",
"Coordinates:\n",
" * time (time) datetime64[ns] 6kB 1990-01-01 ... 1991-...\n",
" elevation float32 4B ...\n",
" drainage_area float64 8B ...\n",
" centroid_longitude float64 8B ...\n",
" centroid_latitude float64 8B ...\n",
"Data variables: (12/19)\n",
" rainfall (time) float64 6kB dask.array<chunksize=(730,), meta=np.ndarray>\n",
" snowfall (time) float64 6kB dask.array<chunksize=(730,), meta=np.ndarray>\n",
" channel_storage (time) float64 6kB dask.array<chunksize=(730,), meta=np.ndarray>\n",
" reservoir_storage (time) float64 6kB dask.array<chunksize=(730,), meta=np.ndarray>\n",
" rivulet_storage (time) float64 6kB dask.array<chunksize=(730,), meta=np.ndarray>\n",
" Surface Water (time) float64 6kB dask.array<chunksize=(730,), meta=np.ndarray>\n",
" ... ...\n",
" Convolution Storage[0] (time) float64 6kB dask.array<chunksize=(730,), meta=np.ndarray>\n",
" Convolution Storage[1] (time) float64 6kB dask.array<chunksize=(730,), meta=np.ndarray>\n",
" total (time) float64 6kB dask.array<chunksize=(730,), meta=np.ndarray>\n",
" cum_input (time) float64 6kB dask.array<chunksize=(730,), meta=np.ndarray>\n",
" cum_outflow (time) float64 6kB dask.array<chunksize=(730,), meta=np.ndarray>\n",
" MB_error (time) float64 6kB dask.array<chunksize=(730,), meta=np.ndarray>\n",
"Attributes:\n",
" Conventions: CF-1.6\n",
" featureType: timeSeries\n",
" history: Created on 2026-04-01T12:54:24 by Raven 4.1\n",
" description: Standard Output\n",
" references: Craig J.R. and the Raven Development Team Raven user's ...\n",
" model_id: GR4JCN\n",
" Raven_version: 4.1\n",
" RavenPy_version: 0.20.0\n",
"<xarray.Dataset> Size: 12kB\n",
"Dimensions: (time: 730)\n",
"Coordinates:\n",
" * time (time) datetime64[ns] 6kB 1990-01-01 ... 1991-12-31\n",
" subbasin_id <U1 4B ...\n",
" elevation float32 4B ...\n",
" drainage_area float64 8B ...\n",
" centroid_longitude float64 8B ...\n",
" centroid_latitude float64 8B ...\n",
"Data variables:\n",
" q (time) float64 6kB ...\n",
"Attributes:\n",
" Conventions: CF-1.6\n",
" featureType: timeSeries\n",
" history: Created on 2026-04-01T12:54:33 by Raven 4.1\n",
" description: Standard Output\n",
" references: Craig J.R. and the Raven Development Team Raven user's ...\n",
" model_id: GR4JCN\n",
" Raven_version: 4.1\n",
" RavenPy_version: 0.20.0