2. Module de modélisation hydrologique¶
INFO xhydro
fournit des outils pour exécuter et calibrer des modèles hydrologiques, mais ne préparera pas le modèle lui-même. Cela devrait être fait au préalable.
xhydro
fournit une collection de fonctions qui peuvent servir de point d’entrée principal pour la modélisation hydrologique. L’ensemble du framework est basé sur la fonction xh.modelling.hydrological_model
et son dictionnaire model_config
, qui contient toutes les informations nécessaires pour exécuter le modèle hydrologique donné. Par exemple, selon le modèle, il peut stocker directement des ejeux de données météorologiques, des chemins d’accès aux jeux de données (fichiers netCDF ou autres), des fichiers de configuration csv, des paramètres et, fondamentalement, tout ce qui est nécessaire pour configurer et exécuter un modèle hydrologique.
It then becomes the User’s responsibility to ensure that all required information for a given model are provided in the model_config
object, both in the data preparation stage and in the hydrological model implementation. This can be addressed by calling the xh.modelling.get_hydrological_model_inputs
function to get a list of the required keys for a given model, as well as the documentation. Parameters for that function are:
model_name
: As listed below.required_only
: Whether to return all possible inputs, or only the required ones.
Currently available models are:
Hydrotel
Raven-emulated models:
Blended
,GR4JCN
,HBVEC
,HMETS
,HYPR
,Mohyse
,SACSMA
[1]:
import xhydro as xh
import xhydro.modelling as xhm
# This function can be called to get a list of the keys for a given model, as well as its documentation.
inputs, docs = xhm.get_hydrological_model_inputs("Hydrotel", required_only=False)
inputs
/home/docs/checkouts/readthedocs.org/user_builds/xhydro-fr/conda/latest/lib/python3.12/site-packages/esmpy/interface/loadESMF.py:94: VersionWarning: ESMF installation version 8.8.0, ESMPy version 8.8.0b0
warnings.warn("ESMF installation version {}, ESMPy version {}".format(
ERROR 1: PROJ: proj_create_from_database: Open of /home/docs/checkouts/readthedocs.org/user_builds/xhydro-fr/conda/latest/share/proj failed
[1]:
{'model_name': 'Hydrotel',
'project_dir': str | os.PathLike,
'project_file': str,
'executable': str | os.PathLike,
'project_config': dict | None,
'simulation_config': dict | None,
'output_config': dict | None,
'use_defaults': bool}
[2]:
print(docs)
Class to handle Hydrotel simulations.
Parameters
----------
project_dir : str or os.PathLike
Path to the project folder.
project_file : str
Name of the project file (e.g. 'projet.csv').
executable : str or os.PathLike
Command to execute Hydrotel.
On Windows, this should be the path to Hydrotel.exe.
project_config : dict, optional
Dictionary of configuration options to overwrite in the project file.
simulation_config : dict, optional
Dictionary of configuration options to overwrite in the simulation file (simulation.csv).
output_config : dict, optional
Dictionary of configuration options to overwrite in the output file (output.csv).
use_defaults : bool
If True, use default configuration options loaded from xhydro/modelling/data/hydrotel_defaults/.
If False, read the configuration options directly from the files in the project folder.
Notes
-----
At minimum, the project folder must already exist when this function is called
and either 'use_defaults' must be True or 'SIMULATION COURANTE' must be specified
as a keyword argument in 'project_config'.
Hydrological models can differ from one another in terms of required inputs and available functions, but an effort will be made to homogenize them as much as possible as new models get added. Currently, all models have these 3 functions:
.run()
which will execute the model, reformat the outputs to be compatible with analysis tools inxhydro
, then return the simulated streamflows as axarray.Dataset
.The streamflow will be called
streamflow
and have units inm3 s-1
.In the case of 1D data (such as hydrometric stations), that dimension in the dataset will be identified through a
cf_role: timeseries_id
attribute.
.get_inputs()
to retrieve the meteorological inputs..get_streamflow()
to retrieve the simulated streamflow.
2.1. Acquiring and formatting meteorological data¶
The acquisition of raw meteorological and elevation data using the GIS module and libraries such as xdatasets
is covered in the GIS notebook. Therefore, this notebook will use a test dataset.
[4]:
import xarray as xr
from xhydro.testing.helpers import deveraux
D = deveraux()
# This notebook will use ERA5 data for a small watershed in Eastern Quebec, along with faked elevation data.
# Streamflow file (1 file - Hydrotel driven by BCC-CSM-1.1(m))
meteo_file = D.fetch("hydro_modelling/ERA5_testdata.nc")
ds = xr.open_dataset(meteo_file)
ds
[4]:
<xarray.Dataset> Size: 69kB Dimensions: (lat: 3, lon: 5, time: 366) Coordinates: spatial_agg <U7 28B ... timestep <U1 4B ... * lat (lat) float64 24B 48.5 48.75 49.0 * lon (lon) float64 40B -65.5 -65.25 -65.0 -64.75 -64.5 HYBAS_ID int64 8B ... * time (time) datetime64[ns] 3kB 1981-01-01 1981-01-02 ... 1982-01-01 source <U29 116B ... z (lat, lon) float32 60B ... Data variables: tasmax (time, lat, lon) float32 22kB ... tasmin (time, lat, lon) float32 22kB ... pr (time, lat, lon) float32 22kB ... Attributes: (12/30) GRIB_NV: 0 GRIB_Nx: 1440 GRIB_Ny: 721 GRIB_cfName: unknown GRIB_cfVarName: t2m GRIB_dataType: an ... ... GRIB_totalNumber: 0 GRIB_typeOfLevel: surface GRIB_units: K long_name: 2 metre temperature standard_name: unknown units: K
Every hydrological model has different requirements when it comes to their input data. In this example, the data variables have units (temperatures in °K
and precipitation in m
) that would not be compatible with the requirements for the Hydrotel model.
Thus, the function xh.modelling.format_input
can be used to reformat CF-compliant datasets for use in hydrological models. Note that this function currently only supports “Hydrotel”.
[5]:
print(xh.modelling.format_input.__doc__)
Reformat CF-compliant meteorological data for use in hydrological models.
Parameters
----------
ds : xr.Dataset
A dataset containing the meteorological data. See the "Notes" section for more information on the expected format.
model : str
The name of the hydrological model to use.
Currently supported models are: "Hydrotel".
convert_calendar_missing : float, str, dict, optional
Upon conversion of the calendar, missing values will be filled with this value. Default is np.nan.
If the value is 'interpolate', the new dates will be linearly interpolated over time.
A dictionary can be used to specify a different fill value for each variable.
Keys should be the standard names of the variables (first entry in the list of names in the "Notes" section).
save_as : str, optional
Where to save the reformatted data. If None, the data will not be saved.
This can be useful when multiple files are needed for a single model run (e.g. Hydrotel needs a configuration file).
\*\*kwargs : dict
Additional keyword arguments to pass to the save function.
Returns
-------
tuple[xr.Dataset, dict]
The reformatted dataset and, if applicable, the configuration for the model.
Notes
-----
The input dataset should be CF-compliant.
This function will attempt to detect the variables based on the standard_name attribute, the cell_methods attribute, or the variable name
(AMIP column) taken from https://cfconventions.org/Data/cf-standard-names/current/build/cf-standard-name-table.html.
Specifically:
- If using 1D time series, the station dimension should have an attribute `cf_role` set to "timeseries_id".
- Units don't need to be canonical, but they should be convertible to the expected units and be understood by `xclim`.
- The following attempts will be made to detect the variables:
- Longitude:
- standard_name: "longitude"
- variable name: "lon", "longitude"
- Latitude:
- standard_name: "latitude"
- variable name: "lat", "latitude"
- Elevation:
- standard_name: "surface_altitude"
- variable name: "orog", "z", "altitude", "elevation", "height"
- Precipitation:
- standard_name: "*precipitation*" (e.g. "lwe_thickness_of_precipitation_amount")
- variable name: "pr", "precip", "precipitation"
- Maximum temperature:
- standard_name: "air_temperature"
- cell_methods: "time: maximum"
- variable name: "tasmax", "tmax", "temperature_max"
- Minimum temperature:
- standard_name: "air_temperature"
- cell_methods: "time: minimum"
- variable name: "tasmin", "tmin", "temperature_min"
Hydrotel requires the following variables: ["longitude", "latitude", "altitude", "time", "tasmax", "tasmin", "pr"].
[7]:
# For Hydrotel, the function will reproject 2D grids into a single dimension 'station', ensure that temperature and precipitation data are in '°C' and 'mm' respectively,
# and that the time axis is in 'days since 1970-01-01 00:00:00', among other changes.
# You can also use the 'save_as' argument to save the new file(s) in your project folder.
ds_reformatted, config = xh.modelling.format_input(
ds,
"Hydrotel",
save_as=notebook_folder / "_data" / "example_hydrotel" / "meteo" / "ERA5.nc",
)
[8]:
ds_reformatted
[8]:
<xarray.Dataset> Size: 38kB Dimensions: (time: 366, station: 8) Coordinates: spatial_agg <U7 28B 'polygon' timestep <U1 4B 'D' HYBAS_ID int64 8B 7120035110 * time (time) int64 3kB 4018 4019 4020 4021 ... 4380 4381 4382 4383 source <U29 116B 'era5_reanalysis_single_levels' orog (station) float32 32B 100.0 100.0 100.0 ... 100.0 100.0 100.0 lat (station) float64 64B 48.5 48.5 48.75 ... 48.75 48.75 49.0 lon (station) float64 64B -65.25 -65.0 -65.5 ... -64.75 -64.5 -65.5 * station (station) int64 64B 0 1 2 3 4 5 6 7 Data variables: tasmax (time, station) float32 12kB -9.085 -8.085 ... -2.062 -4.993 tasmin (time, station) float32 12kB -15.94 -14.57 ... -13.86 -18.4 pr (time, station) float32 12kB 0.1025 0.1063 ... 0.5474 0.9813 Attributes: (12/30) GRIB_NV: 0 GRIB_Nx: 1440 GRIB_Ny: 721 GRIB_cfName: unknown GRIB_cfVarName: t2m GRIB_dataType: an ... ... GRIB_totalNumber: 0 GRIB_typeOfLevel: surface GRIB_units: K long_name: 2 metre temperature standard_name: unknown units: K
[9]:
# Hydrotel also requires a configuration file, which will also be produced by this function.
config
[9]:
{'TYPE (STATION/GRID/GRID_EXTENT)': 'STATION',
'STATION_DIM_NAME': 'station',
'LATITUDE_NAME': 'lat',
'LONGITUDE_NAME': 'lon',
'ELEVATION_NAME': 'orog',
'TIME_NAME': 'time',
'TMIN_NAME': 'tasmin',
'TMAX_NAME': 'tasmax',
'PRECIP_NAME': 'pr'}
2.2. Initializing the model¶
L’exemple suivant utilisera le modèle Hydrotel. Il est plus complexe, avec la plupart de ses paramètres cachés dans les fichiers de configuration, mais xhydro
peut être utilisé pour facilement mettre à jour les fichiers de configuration, valider le répertoire du projet et les entrées météorologiques, exécuter le modèle et reformater les sorties pour être plus conformes aux conventions CF et aux autres fonctions de xhydro
.
Notez que xhydro
ne prépare pas le répertoire du projet lui-même, ce qui devrait être fait au préalable. Ce que fait la classe, lors du lancement d’une nouvelle instance de xhydro.modelling.Hydrotel
, c’est autoriser le contrôle sur les entrées situées dans les trois fichiers de configuration principaux : le fichier de projet, simulation.csv
et output.csv
. Les autres arguments pour la classe, obtenus à partir de get_hydrological_model_inputs
, sont listés ci-dessus. A tout moment après l’initialisation de la classe, update_config()
peut être appelée pour mettre à jour les trois fichiers de configuration. Lorsqu’elle est appelée, cette fonction écrasera les fichiers CSV sur le disque.
[10]:
import os
from pathlib import Path
# The executable depends on the platform
if os.name == "nt":
executable = "path/to/Hydrotel.exe"
else:
executable = "hydrotel"
# Prepare the model configuration options
model_config = {
"model_name": "Hydrotel",
"project_dir": Path(notebook_folder) / "_data" / "example_hydrotel",
"project_file": "projet.csv",
"simulation_config": {
"DATE DEBUT": "1981-01-01",
"DATE FIN": "1981-12-31",
"FICHIER STATIONS METEO": "meteo/ERA5.nc",
"PAS DE TEMPS": 24,
},
"output_config": {"TRONCONS": 1, "DEBITS_AVAL": 1},
"use_defaults": True,
"executable": executable,
}
Pour HYDROTEL, DATE DEBUT, DATE FIN et PAS DE TEMPS
devront toujours être spécifiés. Ils doivent donc être ajoutés à simulation_config
s’ils n’existent pas déjà dans simulation.csv
. De plus, soit FICHIER STATIONS METEO (fichier des stations météorologiques)
soit FICHIER GRILLE METEO (fichier de grille météorologique)
doivent être spécifiés pour guider le modèle vers les données météorologiques.
Si vous utilisez les valeurs par défaut, le débit de tous les tronçons de rivière sera affiché. Vous pouvez modifier output.csv
pour changer ce comportement.
[11]:
# Note that xhm.Hydrotel(**model_config) could also be used to initiate the model.
ht = xhm.hydrological_model(model_config)
print(f"Simulation directory, taken from the project file: '{ht.simulation_dir}'\n")
print(f"Project configuration: '{ht.project_config}'\n")
print(f"Simulation configuration: '{ht.simulation_config}'\n")
print(f"Output configuration: '{ht.output_config}'")
Simulation directory, taken from the project file: '/home/docs/checkouts/readthedocs.org/user_builds/xhydro-fr/checkouts/latest/docs/notebooks/_data/example_hydrotel/simulation/simulation'
Project configuration: '{'PROJET HYDROTEL VERSION': '', 'FICHIER ALTITUDE': 'physitel/altitude.tif', 'FICHIER PENTE': 'physitel/pente.tif', 'FICHIER ORIENTATION': 'physitel/orientation.tif', 'FICHIER ZONE': 'physitel/uhrh.tif', 'FICHIER NOEUD': 'physitel/noeuds.nds', 'FICHIER TRONCON': 'physitel/troncon.trl', 'FICHIER PIXELS': 'physitel/point.rdx', 'FICHIER MILIEUX HUMIDES PROFONDEUR TRONCONS': '', 'SIMULATION COURANTE': 'simulation'}'
Simulation configuration: '{'SIMULATION HYDROTEL VERSION': '', 'FICHIER OCCUPATION SOL': 'physitel/occupation_sol.cla', 'FICHIER PROPRIETE HYDROLIQUE': 'physitel/proprietehydrolique.sol', 'FICHIER TYPE SOL COUCHE1': 'physitel/type_sol.cla', 'FICHIER TYPE SOL COUCHE2': 'physitel/type_sol.cla', 'FICHIER TYPE SOL COUCHE3': 'physitel/type_sol.cla', 'COEFFICIENT ADDITIF PROPRIETE HYDROLIQUE': '0', 'FICHIER INDICE FOLIERE': 'physio/ind_fol.def', 'FICHIER PROFONDEUR RACINAIRE': 'physio/pro_rac.def', 'FICHIER GRILLE METEO': '', 'FICHIER STATIONS METEO': 'meteo/ERA5.nc', 'FICHIER STATIONS HYDRO': '', 'PREVISION METEO': '0', 'FICHIER GRILLE PREVISION': '', 'DATE DEBUT PREVISION': '', 'DATE DEBUT': '1981-01-01 00:00', 'DATE FIN': '1981-12-31 00:00', 'PAS DE TEMPS': 24, 'TRONCON EXUTOIRE': '1', 'TRONCONS DECONNECTER': 'off', 'NOM FICHIER CORRECTIONS': 'correction.csv', 'LECTURE ETAT FONTE NEIGE': '', 'LECTURE ETAT TEMPERATURE DU SOL': '', 'LECTURE ETAT BILAN VERTICAL': '', 'LECTURE ETAT RUISSELEMENT SURFACE': '', 'LECTURE ETAT ACHEMINEMENT RIVIERE': '', 'ECRITURE ETAT FONTE NEIGE': '', 'ECRITURE ETAT TEMPERATURE DU SOL': '', 'ECRITURE ETAT BILAN VERTICAL': '', 'ECRITURE ETAT RUISSELEMENT SURFACE': '', 'ECRITURE ETAT ACHEMINEMENT RIVIERE': '', 'REPERTOIRE ECRITURE ETAT FONTE NEIGE': 'etat', 'REPERTOIRE ECRITURE ETAT TEMPERATURE DU SOL': 'etat', 'REPERTOIRE ECRITURE ETAT BILAN VERTICAL': 'etat', 'REPERTOIRE ECRITURE ETAT RUISSELEMENT SURFACE': 'etat', 'REPERTOIRE ECRITURE ETAT ACHEMINEMENT RIVIERE': 'etat', 'INTERPOLATION DONNEES': 'THIESSEN', 'FONTE DE NEIGE': 'DEGRE JOUR MODIFIE', 'TEMPERATURE DU SOL': '', 'EVAPOTRANSPIRATION': 'HYDRO-QUEBEC', 'BILAN VERTICAL': 'BV3C', 'RUISSELEMENT': 'ONDE CINEMATIQUE', 'ACHEMINEMENT RIVIERE': 'ONDE CINEMATIQUE MODIFIEE', 'MILIEUX HUMIDES ISOLES': '0', 'MILIEUX HUMIDES RIVERAINS': '0', 'FICHIER DE PARAMETRE GLOBAL': '0', 'LECTURE INTERPOLATION DONNEES': 'lecture_interpolation.csv', 'THIESSEN': 'thiessen.csv', 'MOYENNE 3 STATIONS': 'moyenne_3_stations.csv', 'LECTURE FONTE NEIGE': 'lecture_fonte_neige.csv', 'DEGRE JOUR MODIFIE': 'degre_jour_modifie.csv', 'LECTURE TEMPERATURE DU SOL': 'lecture_tempsol.csv', 'RANKINEN': 'rankinen.csv', 'THORSEN': 'thorsen.csv', 'LECTURE EVAPOTRANSPIRATION': 'lecture_etp.csv', 'HYDRO-QUEBEC': 'hydro_quebec.csv', 'THORNTHWAITE': 'thornthwaite.csv', 'LINACRE': 'linacre.csv', 'PENMAN': 'penman.csv', 'PRIESTLAY-TAYLOR': 'priestlay_taylor.csv', 'PENMAN-MONTEITH': 'penman_monteith.csv', 'ETP-MC-GUINESS': 'etp-mc-guiness.csv', 'LECTURE BILAN VERTICAL': 'lecture_bilan_vertical.csv', 'BV3C': 'bv3c.csv', 'CEQUEAU': 'cequeau.csv', 'LECTURE RUISSELEMENT SURFACE': 'lecture_ruisselement.csv', 'ONDE CINEMATIQUE': 'onde_cinematique.csv', 'LECTURE ACHEMINEMENT RIVIERE': 'lecture_acheminement.csv', 'ONDE CINEMATIQUE MODIFIEE': 'onde_cinematique_modifiee.csv', 'FICHIER MILIEUX HUMIDES ISOLES': '', 'FICHIER MILIEUX HUMIDES RIVERAINS': ''}'
Output configuration: '{'TRONCONS': 1, 'TMIN': '0', 'TMAX': '0', 'TMIN_JOUR': '0', 'TMAX_JOUR': '0', 'PLUIE': '0', 'NEIGE': '0', 'APPORT': '0', 'COUVERT_NIVAL': '0', 'HAUTEUR_NEIGE': '0', 'ALBEDO_NEIGE': '0', 'PROFONDEUR_GEL': '0', 'ETP': '0', 'ETR1': '0', 'ETR2': '0', 'ETR3': '0', 'ETR_TOTAL': '0', 'PRODUCTION_BASE': '0', 'PRODUCTION_HYPO': '0', 'PRODUCTION_SURF': '0', 'Q12': '0', 'Q23': '0', 'THETA1': '0', 'THETA2': '0', 'THETA3': '0', 'APPORT_LATERAL': '0', 'DEBITS_AMONT': '0', 'DEBITS_AVAL': 1, 'FICHIERS_ETATS_SEPARATEUR': ';', 'SEPARATEUR': ';', 'OUTPUT_NETCDF': '1'}'
2.3. Validation des données météorologiques¶
Quelques contrôles de base seront automatiquement effectués avant d’exécuter des modèles hydrologiques, mais un utilisateur avancé pourrait souhaiter effectuer des contrôles de santé plus avancés (par exemple, des entrées météorologiques irréalistes). Ceci est possible grâce à l’utilisation de xhydro.utils.health_checks
. Consultez la documentation “xscen” <https://xscen.readthedocs.io/en/latest/notebooks/3_diagnostics.html#Health-checks>`__ pour la liste complète des vérifications possibles.
Dans cet exemple, nous veillerons à ce qu’il n’y ait pas de valeurs météorologiques ou de séquences de valeurs anormales. Étant donné que les données utilisées pour cet exemple sont fausses, cette fonction déclenchera certains signaux d’alarme.
[12]:
health_checks = {
"raise_on": [], # If an entry is not here, it will warn the user instead of raising an exception.
"flags": {
"pr": { # You can have specific flags per variable.
"negative_accumulation_values": {},
"very_large_precipitation_events": {},
"outside_n_standard_deviations_of_climatology": {"n": 5},
"values_repeating_for_n_or_more_days": {"n": 5},
},
"tasmax": {
"tasmax_below_tasmin": {},
"temperature_extremely_low": {},
"temperature_extremely_high": {},
"outside_n_standard_deviations_of_climatology": {"n": 5},
"values_repeating_for_n_or_more_days": {"n": 5},
},
"tasmin": {
"temperature_extremely_low": {},
"temperature_extremely_high": {},
"outside_n_standard_deviations_of_climatology": {"n": 5},
"values_repeating_for_n_or_more_days": {"n": 5},
},
},
}
[13]:
from xclim.core.units import amount2rate
# We can use get_inputs() to automatically retrieve the meteorological data. This is very useful for instances like Hydrotel, where this information is hidden within configuration files.
ds_in = ht.get_inputs()
ds_in["pr"] = amount2rate(
ds_in["pr"]
) # Hydrotel-to-xclim compatibility. Precipitation in xclim needs to be a flux.
xh.utils.health_checks(ds_in, **health_checks)
2.4. Exécution du modèle¶
Dans la plupart des cas, quelques vérifications de base seront effectuées avant d’exécuter le modèle, lorsque la fonction run()
est appelée. Dans le cas d’Hydrotel, ces vérifications seront effectuées :
Tous les fichiers mentionnés dans la configuration existent.
The meteorological dataset has the dimensions, coordinates, and variables named in its configuration file (e.g.
ERA5.nc.config
, in this example).Le jeu de données a un calendrier standard.
La fréquence est uniforme (c’est-à-dire que tous les pas de temps sont équidistants).
Les dates de début et de fin sont contenues dans le jeu de données.
Le jeu de données est complet (c’est-à-dire aucune valeur manquante).
Ce n’est que lorsque ces vérifications seront satisfaites que la fonction exécutera réellement le modèle. De plus, spécifiques à Hydrotel, on peut citer les arguments suivants :
check_missing
: bool. S’il faut vérifier les données manquantes ou non. Comme cela peut prendre beaucoup de temps, la valeur par défaut est False.dry_run
: bool. Mettez True pour imprimer simplement la ligne de commande, sans l’exécuter.
Une fois le modèle exécuté, xhydro
reformatera automatiquement le NetCDF pour le rapprocher des conventions CF et le rendre compatible avec les autres modules xhydro
. Notez que pour Hydrotel, ce reformatage ne prend actuellement en charge que l’option de sortie DEBITS_AVAL.
[14]:
# For the purpose of this example, we'll leave 'dry_run' as True.
print("Command that would be run in the terminal:")
ht.run(check_missing=True, dry_run=True)
Command that would be run in the terminal:
[14]:
'hydrotel /home/docs/checkouts/readthedocs.org/user_builds/xhydro-fr/checkouts/latest/docs/notebooks/_data/example_hydrotel/projet.csv -t 1'
[15]:
# This is how the output would look like after reformatting (which was skipped by the dry_run argument)
ht._standardise_outputs()
ht.get_streamflow()
[15]:
<xarray.Dataset> Size: 9kB Dimensions: (time: 730, station_id: 1) Coordinates: * time (time) datetime64[ns] 6kB 2001-01-01 2001-01-02 ... 2002-12-31 troncon (station_id) int64 8B ... * station_id (station_id) int64 8B 0 Data variables: streamflow (station_id, time) float32 3kB ... Attributes: Hydrotel_version:
2.5. Calage du modèle¶
WARNING This part of the documentation is still a work-in-progress. Only Raven-based models are currently implemented, but this notebook still uses the Dummy model.
Le calage du modèle consiste en une boucle de plusieurs instances où : les paramètres du modèle sont choisis, le modèle est exécuté, les résultats sont comparés aux observations. Les fonctions de calage dans xhydro
s’appuient sur SPOTPY
pour effectuer le processus d’optimisation.
When using the calibration module, 2 additional keywords are required for the model_config
:
qobs
: Contains the observed streamflow used as the calibration target.parameters
: While not necessary to provide this, it is a reserved keyword used by the optimizer.
La fonction de calage, xh.modelling.perform_calibration
, a ces paramètres :
[16]:
print(xh.modelling.perform_calibration.__doc__)
Perform calibration using SPOTPY.
This is the entrypoint for the model calibration. After setting-up the
model_config object and other arguments, calling "perform_calibration" will
return the optimal parameter set, objective function value and simulated
flows on the calibration period.
Parameters
----------
model_config : dict
The model configuration object that contains all info to run the model.
The model function called to run this model should always use this object and read-in data it requires.
It will be up to the user to provide the data that the model requires.
obj_func : str
The objective function used for calibrating. Can be any one of these:
- "abs_bias" : Absolute value of the "bias" metric
- "abs_pbias": Absolute value of the "pbias" metric
- "abs_volume_error" : Absolute value of the volume_error metric
- "agreement_index": Index of agreement
- "correlation_coeff": Correlation coefficient
- "kge" : Kling Gupta Efficiency metric (2009 version)
- "kge_mod" : Kling Gupta Efficiency metric (2012 version)
- "mae": Mean Absolute Error metric
- "mare": Mean Absolute Relative Error metric
- "mse" : Mean Square Error metric
- "nse": Nash-Sutcliffe Efficiency metric
- "r2" : r-squared, i.e. square of correlation_coeff.
- "rmse" : Root Mean Square Error
- "rrmse" : Relative Root Mean Square Error (RMSE-to-mean ratio)
- "rsr" : Ratio of RMSE to standard deviation.
bounds_high : np.array
High bounds for the model parameters to be calibrated. SPOTPY will sample parameter sets from
within these bounds. The size must be equal to the number of parameters to calibrate.
bounds_low : np.array
Low bounds for the model parameters to be calibrated. SPOTPY will sample parameter sets from
within these bounds. The size must be equal to the number of parameters to calibrate.
evaluations : int
Maximum number of model evaluations (calibration budget) to perform before stopping the calibration process.
qobs : os.PathLike or np.ndarray or xr.Dataset or xr.DataArray
Observed streamflow dataset (or path to it), used to compute the objective function.
If using a dataset, it must contain a "streamflow" variable.
algorithm : str
The optimization algorithm to use. Currently, "DDS" and "SCEUA" are available, but more can be easily added.
mask : np.array, optional
A vector indicating which values to preserve/remove from the objective function computation. 0=remove, 1=preserve.
transform : str, optional
The method to transform streamflow prior to computing the objective function. Can be one of:
Square root ('sqrt'), inverse ('inv'), or logarithmic ('log') transformation.
epsilon : scalar float
Used to add a small delta to observations for log and inverse transforms, to eliminate errors
caused by zero flow days (1/0 and log(0)). The added perturbation is equal to the mean observed streamflow
times this value of epsilon.
sampler_kwargs : dict
Contains the keywords and hyperparameter values for the optimization algorithm.
Keywords depend on the algorithm choice. Currently, SCEUA and DDS are supported with
the following default values:
- SCEUA: dict(ngs=7, kstop=3, peps=0.1, pcento=0.1)
- DDS: dict(trials=1)
Returns
-------
best_parameters : array_like
The optimized parameter set.
qsim : xr.Dataset
Simulated streamflow using the optimized parameter set.
bestobjf : float
The best objective function value.
[17]:
import numpy as np
# Prepare the model_config dictionary for the Dummy model
model_config = {
"model_name": "Dummy",
"precip": np.array([10, 11, 12, 13, 14, 15]),
"temperature": np.array([10, 3, -5, 1, 15, 0]),
"drainage_area": np.array([10]),
}
# This model has 3 parameters. This will be their possible range.
bounds_low = np.array([0, 0, 0])
bounds_high = np.array([10, 10, 10])
mask = np.array([0, 0, 0, 0, 1, 1])
[18]:
# Run the calibration
best_parameters, best_simulation, best_objfun = xhm.perform_calibration(
model_config,
obj_func="mae",
bounds_low=bounds_low,
bounds_high=bounds_high,
qobs=np.array([120, 130, 140, 150, 160, 170]),
evaluations=1000,
algorithm="DDS",
mask=mask,
sampler_kwargs={"trials": 1},
)
Initializing the Dynamically Dimensioned Search (DDS) algorithm with 1000 repetitions
The objective function will be maximized
Starting the DDS algotrithm with 1000 repetitions...
Finding best starting point for trial 1 using 5 random samples.
Initialize database...
['csv', 'hdf5', 'ram', 'sql', 'custom', 'noData']
Best solution found has obj function value of -0.12811715855738726 at 5
*** Final SPOTPY summary ***
Total Duration: 0.98 seconds
Total Repetitions: 1000
Maximal objective value: -0.128117
Corresponding parameter setting:
param0: 9.78616
param1: 0.0900735
param2: 0.115819
******************************
Best parameter set:
param0=10.0, param1=8.104135406881856, param2=4.860086688592087
Run number 32 has the lowest objectivefunction with: -9836.1357
[19]:
# The first output corresponds to the best set of parameters
best_parameters
[19]:
[10.0, 8.104135406881856, 4.860086688592087]
[20]:
# The second output corresponds to the timeseries for the best set of parameters
best_simulation
[20]:
<xarray.Dataset> Size: 96B Dimensions: (time: 6) Coordinates: * time (time) datetime64[ns] 48B 2024-01-01 2024-01-02 ... 2024-01-06 Data variables: streamflow (time) float64 48B 8.799e+03 6.528e+03 ... 1.271e+04 7.29e+03
[21]:
# The second output is the value of the objective function for the best set of parameters
best_objfun
[21]:
9836.1357444736