xhydro.modelling package¶
The Hydrotel Hydrological Model module.
Submodules are imported in a specific order to prevent circular imports.
Submodules¶
xhydro.modelling._hm module¶
Hydrological model class.
- class xhydro.modelling._hm.HydrologicalModel[source]¶
Bases: ABC
Hydrological model class.
This class is a wrapper for the different hydrological models that can be used in xhydro.
- abstract get_inputs(**kwargs) Dataset [source]¶
Get the input data for the hydrological model.
Parameters¶
- **kwargsdict
Additional keyword arguments for the hydrological model.
Returns¶
- xr.Dataset
Input data for the hydrological model, in xarray Dataset format.
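The wrapper pattern described above can be sketched with Python's abc module. This is an illustrative stand-in rather than the xhydro source: the ToyModel class is hypothetical, and a plain dict replaces the xr.Dataset return value so that the example runs without xarray.

```python
from abc import ABC, abstractmethod


class HydrologicalModel(ABC):
    """Simplified stand-in for the abstract base class (assumption)."""

    @abstractmethod
    def get_inputs(self, **kwargs) -> dict:
        """Return the input data for the hydrological model."""


class ToyModel(HydrologicalModel):
    """Hypothetical concrete model, used only for illustration."""

    def __init__(self, precip):
        self.precip = list(precip)

    def get_inputs(self, **kwargs) -> dict:
        # A real subclass would return an xr.Dataset here.
        return {"pr": self.precip}


model = ToyModel([0.0, 2.5, 1.0])
inputs = model.get_inputs()

# The abstract base class itself cannot be instantiated.
try:
    HydrologicalModel()
    base_is_abstract = False
except TypeError:
    base_is_abstract = True
```

Every concrete model exposes the same get_inputs entry point, which lets calling code treat models interchangeably.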
xhydro.modelling._hydrotel module¶
Class to handle Hydrotel simulations.
- class xhydro.modelling._hydrotel.Hydrotel(project_dir: str | PathLike, project_file: str, executable: str | PathLike, *, project_config: dict | None = None, simulation_config: dict | None = None, output_config: dict | None = None, use_defaults: bool = True)[source]¶
Bases: HydrologicalModel
Class to handle Hydrotel simulations.
Parameters¶
- project_dirstr or os.PathLike
Path to the project folder.
- project_filestr
Name of the project file (e.g. “projet.csv”).
- executablestr or os.PathLike
Command to execute Hydrotel. On Windows, this should be the path to Hydrotel.exe.
- project_configdict, optional
Dictionary of configuration options to overwrite in the project file.
- simulation_configdict, optional
Dictionary of configuration options to overwrite in the simulation file (simulation.csv).
- output_configdict, optional
Dictionary of configuration options to overwrite in the output file (output.csv).
- use_defaultsbool
If True, use default configuration options loaded from xhydro/modelling/data/hydrotel_defaults/. If False, read the configuration options directly from the files in the project folder.
Notes¶
At minimum, the project folder must already exist when this function is called and either “use_defaults” must be True or “SIMULATION COURANTE” must be specified as a keyword argument in “project_config”.
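The minimum requirements stated in the Notes can be expressed as a small pre-flight check. The helper below is hypothetical and not part of xhydro; it only mirrors the two conditions given above, and the value "sim1" is a placeholder.

```python
from pathlib import Path
from typing import Optional


def check_minimum_setup(project_dir, use_defaults: bool = True,
                        project_config: Optional[dict] = None) -> list:
    """Return a list of problems; an empty list means the minimum is met."""
    problems = []
    # Condition 1: the project folder must already exist.
    if not Path(project_dir).is_dir():
        problems.append("project folder does not exist")
    # Condition 2: use defaults, or name the current simulation explicitly.
    if not use_defaults and "SIMULATION COURANTE" not in (project_config or {}):
        problems.append('"SIMULATION COURANTE" missing from project_config')
    return problems


# "sim1" is a placeholder simulation name.
ok = check_minimum_setup(".", use_defaults=False,
                         project_config={"SIMULATION COURANTE": "sim1"})
bad = check_minimum_setup(".", use_defaults=False)
```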
- get_inputs(subset_time: bool = False, return_config=False, **kwargs) Dataset | tuple[Dataset, dict] [source]¶
Get the weather file from the simulation.
Parameters¶
- subset_timebool
If True, only return the weather data for the time period specified in the simulation configuration file.
- return_configbool
Whether to return the configuration file as well. If True, returns a tuple of (dataset, configuration).
- **kwargsdict
Keyword arguments to pass to xarray.open_dataset().
Returns¶
- xr.Dataset
If “return_config” is False, returns the weather file.
- Tuple[xr.Dataset, dict]
If “return_config” is True, returns the weather file and its configuration.
- get_streamflow(**kwargs) Dataset [source]¶
Get the streamflow from the simulation.
Parameters¶
- **kwargsdict
Keyword arguments to pass to xarray.open_dataset().
Returns¶
- xr.Dataset
The streamflow file.
- run(check_missing: bool = False, dry_run: bool = False, xr_open_kwargs_in: dict | None = None, xr_open_kwargs_out: dict | None = None) str | Dataset [source]¶
Run the simulation.
Parameters¶
- check_missingbool
If True, also checks for missing values in the dataset. This can be time-consuming for large datasets, so it is False by default. However, note that Hydrotel will not run if there are missing values in the input files.
- dry_runbool
If True, returns the command to run the simulation without actually running it.
- xr_open_kwargs_indict, optional
Keyword arguments to pass to xarray.open_dataset() when reading the input files.
- xr_open_kwargs_outdict, optional
Keyword arguments to pass to xarray.open_dataset() when reading the raw output files.
Returns¶
- str
The command to run the simulation, if “dry_run” is True.
- xr.Dataset
The streamflow file, if “dry_run” is False.
- update_config(*, project_config: dict | None = None, simulation_config: dict | None = None, output_config: dict | None = None)[source]¶
Update the configuration options in the project, simulation, and output files.
Parameters¶
- project_configdict, optional
Dictionary of configuration options to overwrite in the project file.
- simulation_configdict, optional
Dictionary of configuration options to overwrite in the simulation file (simulation.csv).
- output_configdict, optional
Dictionary of configuration options to overwrite in the output file (output.csv).
xhydro.modelling._ravenpy_models module¶
Implement the ravenpy handler class for emulating raven models in ravenpy.
- class xhydro.modelling._ravenpy_models.RavenpyModel(model_name: str, parameters: ndarray | list[float], drainage_area: ndarray | float, elevation: ndarray | float, latitude, longitude, start_date, end_date, qobs_path, alt_names_flow, meteo_file, data_type, alt_names_meteo, meteo_station_properties, workdir: str | PathLike | None = None, rain_snow_fraction='RAINSNOW_DINGMAN', evaporation='PET_PRIESTLEY_TAYLOR', **kwargs)[source]¶
Bases: HydrologicalModel
Implement the RavenPy model class to build and run ravenpy models.
Parameters¶
- model_name{"Blended", "GR4JCN", "HBVEC", "HMETS", "HYPR", "Mohyse", "SACSMA"}
The name of the ravenpy model to run.
- parametersnp.ndarray or list of float
The model parameters for simulation or calibration.
- drainage_areanp.ndarray or float
The watershed drainage area, in km².
- elevationnp.ndarray or float
The elevation of the watershed, in meters.
- latitudenp.ndarray or float
The latitude of the watershed centroid.
- longitudenp.ndarray or float
The longitude of the watershed centroid.
- start_datedt.datetime
The first date of the simulation.
- end_datedt.datetime
The last date of the simulation.
- qobs_pathstr or os.PathLike
The path to the dataset containing the observed streamflow.
- alt_names_flowsequence of str
A sequence of alternative names for the flow variable in the dataset, used to map it to its CF-compliant name.
- meteo_filestr or os.PathLike
The path to the file containing the observed meteorological data.
- data_typesequence of str
A sequence of strings telling Raven which variables are being provided, so that it can adjust its processes internally.
- alt_names_meteodict
A dictionary that allows users to change the names of meteo variables of their dataset to cf-compliant names.
- meteo_station_propertiesdict
The properties of the weather stations providing the meteorological data. Used to adjust weather according to differences between station and catchment elevations (adiabatic gradients, etc.).
- workdirstr or os.PathLike
Path to save the .rv files and model outputs.
- rain_snow_fractionstr
The method used by raven to split total precipitation into rain and snow.
- evaporationstr
The evapotranspiration function used by raven.
- **kwargsdict
Dictionary of other parameters to feed to raven according to special cases and that are allowed by the raven documentation.
- get_inputs() Dataset [source]¶
Return the inputs used to run the ravenpy model.
Returns¶
- xr.Dataset
The observed meteorological data used to run the ravenpy model simulation.
xhydro.modelling._simplemodels module¶
Simple hydrological models.
- class xhydro.modelling._simplemodels.DummyModel(precip: DataArray, temperature: DataArray, drainage_area: float, parameters: ndarray, qobs: ndarray = None)[source]¶
Bases: HydrologicalModel
Dummy model.
Dummy model to use as a placeholder for testing purposes.
Parameters¶
- precipxr.DataArray
Daily precipitation in mm.
- temperaturexr.DataArray
Daily average air temperature in °C.
- drainage_areafloat
Drainage area of the catchment.
- parametersnp.ndarray
Model parameters, length 3.
- qobsnp.ndarray, optional
Observed streamflow in m³/s.
- get_inputs() Dataset [source]¶
Return the input data for the Dummy model.
Returns¶
- xr.Dataset
Input data for the Dummy model, in xarray Dataset format.
xhydro.modelling.calibration module¶
Calibration package for hydrological models.
This package contains the main framework for hydrological model calibration. It applies the spotpy calibration package to a "model_config" object. This object is meant to be a container that can be used as needed by any hydrological model. For example, it can store datasets directly, paths to datasets (NetCDF files or other), CSV files, or anything else that can be stored in a dictionary.
It then becomes the user’s responsibility to ensure that required data for a given model be provided in the model_config object both in the data preparation stage and in the hydrological model implementation. This can be addressed by a set of pre-defined codes for given model structures.
For example, GR4J requires only small datasets, which can be stored directly in the model_config dictionary. For Hydrotel or Raven models, however, it may be better to store paths to NetCDF files that are then passed to the models. This requires pre- and post-processing, which can be handled at the stage where the hydrological model is created and the data are prepared.
The calibration itself then becomes straightforward:
1. A model_config object is passed to the calibrator.
2. Lower and upper bounds for the calibration parameters are defined and passed.
3. An objective function, optimizer and hyperparameters are also passed.
4. The calibrator uses this information to develop parameter sets that are then passed as inputs to the "model_config" object.
5. The calibrator launches the desired hydrological model with the model_config object (now containing the parameter set) as input.
6. The appropriate hydrological model function then parses "model_config", takes the parameters and required data, launches a simulation and returns the simulated flow (Qsim).
7. The calibration package then compares Qobs and Qsim, computes the objective function value, and returns it to the sampler, which repeats the process to find optimal parameter sets.
8. The code returns the best parameter set, the objective function value, and the simulated streamflow on the calibration period for user convenience.
This system has the advantage of being flexible, robust, and efficient, as all data can either be held in memory or remain on disk, in which case only a reference to the required datasets is passed around the call stack.
Currently, the model_config object has 3 mandatory keywords for the package to run correctly in all instances:
- model_config["Qobs"]: Contains the observed streamflow used as the calibration target.
- model_config["model_name"]: Contains a string referring to the hydrological model to be run.
- model_config["parameters"]: While it is not necessary to provide this, it is a reserved keyword used by the optimizer.
Any comments are welcome!
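The workflow above can be illustrated with a self-contained sketch. It is not the xhydro implementation: a plain random search stands in for the spotpy samplers, toy_model is a hypothetical two-parameter linear model, and the "precip" key is an assumption made for this example. Only the "Qobs", "model_name" and "parameters" keys come from the description above.

```python
import random


def toy_model(model_config: dict) -> list:
    """Hypothetical model: Qsim = a * precip + b."""
    a, b = model_config["parameters"]
    return [a * p + b for p in model_config["precip"]]


def rmse(qobs, qsim) -> float:
    """Root mean square error between two equal-length sequences."""
    return (sum((o - s) ** 2 for o, s in zip(qobs, qsim)) / len(qobs)) ** 0.5


def calibrate(model_config, bounds_low, bounds_high, evaluations, seed=0):
    """Random-search stand-in for the sampler loop (assumption)."""
    rng = random.Random(seed)
    best_params, best_obj, best_qsim = None, float("inf"), None
    for _ in range(evaluations):
        # The sampler proposes a parameter set within the bounds...
        params = [rng.uniform(lo, hi) for lo, hi in zip(bounds_low, bounds_high)]
        # ...which is written to the reserved "parameters" key.
        config = {**model_config, "parameters": params}
        # The model parses the config and returns Qsim.
        qsim = toy_model(config)
        # Qobs and Qsim are compared; the best set so far is kept.
        obj = rmse(config["Qobs"], qsim)
        if obj < best_obj:
            best_params, best_obj, best_qsim = params, obj, qsim
    return best_params, best_qsim, best_obj


model_config = {
    "model_name": "toy",
    "precip": [0.0, 1.0, 2.0, 3.0],
    "Qobs": [1.0, 3.0, 5.0, 7.0],  # generated with a=2, b=1
}
best_parameters, qsim, bestobjf = calibrate(
    model_config, bounds_low=[0.0, 0.0], bounds_high=[4.0, 4.0], evaluations=500
)
```

Because the model only ever sees the model_config dictionary, swapping in a different model or passing file paths instead of in-memory data does not change the calibration loop.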
- xhydro.modelling.calibration.perform_calibration(model_config: dict, obj_func: str, bounds_high: ndarray | list[float | int], bounds_low: ndarray | list[float | int], evaluations: int, algorithm: str = 'DDS', mask: ndarray | list[float | int] | None = None, transform: str | None = None, epsilon: float = 0.01, sampler_kwargs: dict | None = None)[source]¶
Perform calibration using SPOTPY.
This is the entry point for the model calibration. After setting up the model_config object and other arguments, calling "perform_calibration" will return the optimal parameter set, objective function value and simulated flows on the calibration period.
Parameters¶
- model_configdict
The model configuration object that contains all info to run the model. The model function called to run this model should always use this object and read-in data it requires. It will be up to the user to provide the data that the model requires.
- obj_funcstr
The objective function used for calibrating. Can be any one of these:
  - "abs_bias": Absolute value of the "bias" metric
  - "abs_pbias": Absolute value of the "pbias" metric
  - "abs_volume_error": Absolute value of the volume_error metric
  - "agreement_index": Index of agreement
  - "correlation_coeff": Correlation coefficient
  - "kge": Kling Gupta Efficiency metric (2009 version)
  - "kge_mod": Kling Gupta Efficiency metric (2012 version)
  - "mae": Mean Absolute Error metric
  - "mare": Mean Absolute Relative Error metric
  - "mse": Mean Square Error metric
  - "nse": Nash-Sutcliffe Efficiency metric
  - "r2": r-squared, i.e. square of correlation_coeff
  - "rmse": Root Mean Square Error
  - "rrmse": Relative Root Mean Square Error (RMSE-to-mean ratio)
  - "rsr": Ratio of RMSE to standard deviation
- bounds_highnp.array
High bounds for the model parameters to be calibrated. SPOTPY will sample parameter sets from within these bounds. The size must be equal to the number of parameters to calibrate.
- bounds_lownp.array
Low bounds for the model parameters to be calibrated. SPOTPY will sample parameter sets from within these bounds. The size must be equal to the number of parameters to calibrate.
- evaluationsint
Maximum number of model evaluations (calibration budget) to perform before stopping the calibration process.
- algorithmstr
The optimization algorithm to use. Currently, "DDS" and "SCEUA" are available, but more can be easily added.
- masknp.array, optional
A vector indicating which values to preserve/remove from the objective function computation. 0=remove, 1=preserve.
- transformstr, optional
The method to transform streamflow prior to computing the objective function. Can be one of: Square root (“sqrt”), inverse (“inv”), or logarithmic (“log”) transformation.
- epsilonscalar float
Used to add a small delta to observations for log and inverse transforms, to eliminate errors caused by zero flow days (1/0 and log(0)). The added perturbation is equal to the mean observed streamflow times this value of epsilon.
- sampler_kwargsdict
Contains the keywords and hyperparameter values for the optimization algorithm. Keywords depend on the algorithm choice. Currently, SCEUA and DDS are supported with the following default values:
  - SCEUA: dict(ngs=7, kstop=3, peps=0.1, pcento=0.1)
  - DDS: dict(trials=1)
Returns¶
- best_parametersarray_like
The optimized parameter set.
- qsimxr.Dataset
Simulated streamflow using the optimized parameter set.
- bestobjffloat
The best objective function value.
xhydro.modelling.hydrological_modelling module¶
Hydrological modelling framework.
- xhydro.modelling.hydrological_modelling.format_input(ds: Dataset, model: str, convert_calendar_missing: float | str | dict = nan, save_as: str | PathLike | None = None, **kwargs) tuple[Dataset, dict] [source]¶
Reformat CF-compliant meteorological data for use in hydrological models.
Parameters¶
- dsxr.Dataset
A dataset containing the meteorological data. See the "Notes" section for more information on the expected format.
- modelstr
The name of the hydrological model to use. Currently supported models are: "Hydrotel".
- convert_calendar_missingfloat, str, dict, optional
Upon conversion of the calendar, missing values will be filled with this value. Default is np.nan. If the value is "interpolate", the new dates will be linearly interpolated over time. A dictionary can be used to specify a different fill value for each variable. Keys should be the standard names of the variables (first entry in the list of names in the "Notes" section).
- save_asstr, optional
Where to save the reformatted data. If None, the data will not be saved. This can be useful when multiple files are needed for a single model run (e.g. Hydrotel needs a configuration file).
- **kwargsdict
Additional keyword arguments to pass to the save function.
Returns¶
- tuple[xr.Dataset, dict]
The reformatted dataset and, if applicable, the configuration for the model.
Notes¶
The input dataset should be CF-compliant. This function will attempt to detect the variables based on the standard_name attribute, the cell_methods attribute, or the variable name (AMIP column) taken from https://cfconventions.org/Data/cf-standard-names/current/build/cf-standard-name-table.html.
Specifically:
- If using 1D time series, the station dimension should have an attribute cf_role set to "timeseries_id".
- Units don’t need to be canonical, but they should be convertible to the expected units and be understood by xclim.
- The following attempts will be made to detect the variables:
  - Longitude:
    - standard_name: "longitude"
    - variable name: "lon", "longitude"
  - Latitude:
    - standard_name: "latitude"
    - variable name: "lat", "latitude"
  - Elevation:
    - standard_name: "surface_altitude"
    - variable name: "orog", "z", "altitude", "elevation", "height"
  - Precipitation:
    - standard_name: "precipitation" (e.g. "lwe_thickness_of_precipitation_amount")
    - variable name: "pr", "precip", "precipitation"
  - Maximum temperature:
    - standard_name: "air_temperature"
    - cell_methods: "time: maximum"
    - variable name: "tasmax", "tmax", "temperature_max"
  - Minimum temperature:
    - standard_name: "air_temperature"
    - cell_methods: "time: minimum"
    - variable name: "tasmin", "tmin", "temperature_min"
Hydrotel requires the following variables: ["longitude", "latitude", "altitude", "time", "tasmax", "tasmin", "pr"].
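The detection rules listed in the Notes can be sketched as a lookup over variable names and attributes. This is a simplified illustration and not the code used by format_input: the substring matching on standard_name, the fallback order (attributes first, then variable name) and the detect function itself are assumptions. Variables are represented as a plain mapping of name to attribute dict so that the example runs without xarray.

```python
# Rules transcribed from the Notes above; keys follow the Hydrotel variable names.
RULES = {
    "longitude": {"standard_name": "longitude", "names": {"lon", "longitude"}},
    "latitude": {"standard_name": "latitude", "names": {"lat", "latitude"}},
    "altitude": {"standard_name": "surface_altitude",
                 "names": {"orog", "z", "altitude", "elevation", "height"}},
    "pr": {"standard_name": "precipitation",
           "names": {"pr", "precip", "precipitation"}},
    "tasmax": {"standard_name": "air_temperature", "cell_methods": "time: maximum",
               "names": {"tasmax", "tmax", "temperature_max"}},
    "tasmin": {"standard_name": "air_temperature", "cell_methods": "time: minimum",
               "names": {"tasmin", "tmin", "temperature_min"}},
}


def detect(variables: dict) -> dict:
    """Map canonical names to the matching names found in `variables`."""
    found = {}
    for canonical, rule in RULES.items():
        for name, attrs in variables.items():
            # Attribute match: standard_name, plus cell_methods when the rule has one.
            by_attrs = rule["standard_name"] in attrs.get("standard_name", "") and (
                "cell_methods" not in rule
                or rule["cell_methods"] in attrs.get("cell_methods", "")
            )
            # Fallback: match on the variable name itself.
            if by_attrs or name.lower() in rule["names"]:
                found[canonical] = name
                break
    return found


variables = {
    "rain": {"standard_name": "lwe_thickness_of_precipitation_amount"},
    "t_hi": {"standard_name": "air_temperature", "cell_methods": "time: maximum"},
    "tmin": {},
    "lon": {},
}
detected = detect(variables)
```

Here cell_methods is what separates maximum from minimum temperature, since both share the "air_temperature" standard name.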
- xhydro.modelling.hydrological_modelling.get_hydrological_model_inputs(model_name, required_only: bool = False) tuple[dict, str] [source]¶
Get the required inputs for a given hydrological model.
Parameters¶
- model_namestr
The name of the hydrological model to use. Currently supported models are: "Hydrotel".
- required_onlybool
If True, only the required inputs will be returned.
Returns¶
- dict
A dictionary containing the required configuration for the hydrological model.
- str
The documentation for the hydrological model.
- xhydro.modelling.hydrological_modelling.hydrological_model(model_config)[source]¶
Initialize an instance of a hydrological model.
Parameters¶
- model_configdict
A dictionary containing the configuration for the hydrological model. Must contain a key "model_name" with the name of the model to use: "Hydrotel". The required keys depend on the model being used. Use the function get_hydrological_model_inputs to get the required keys for a given model.
Returns¶
- Hydrotel or DummyModel
An instance of the hydrological model.
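A factory of this kind is commonly written as a dispatch on the "model_name" key. The sketch below illustrates that idea under stated assumptions: the registry, the DummyModelSketch class and the "Dummy" key are hypothetical, and the actual function may be structured differently.

```python
class DummyModelSketch:
    """Hypothetical stand-in for a concrete model class."""

    def __init__(self, **config):
        self.config = config


# Hypothetical registry mapping model names to classes.
_REGISTRY = {"Dummy": DummyModelSketch}


def hydrological_model_sketch(model_config: dict):
    """Build a model instance from a configuration dictionary."""
    config = dict(model_config)  # copy, so the caller's dict is left untouched
    if "model_name" not in config:
        raise ValueError('model_config must contain a "model_name" key')
    name = config.pop("model_name")
    if name not in _REGISTRY:
        raise NotImplementedError(f"Model {name!r} is not supported")
    # The remaining keys are passed to the model constructor.
    return _REGISTRY[name](**config)


user_config = {"model_name": "Dummy", "drainage_area": 10.0}
model = hydrological_model_sketch(user_config)
```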
xhydro.modelling.obj_funcs module¶
Objective function package for xhydro, for calibration and model evaluation.
This package provides a flexible suite of popular objective function metrics in hydrological modelling and hydrological model calibration. The main function “get_objective_function” returns the value of the desired objective function while allowing users to customize many aspects:
1. Select the objective function to run.
2. Provide a mask to remove certain elements from the objective function calculation (e.g. for odd/even year calibration, calibration on high or low flows only, or any custom setup).
3. Apply a transformation to the flows to modify the behaviour of the objective function calculation (e.g. taking the log, inverse or square root transform of the flows before computing the objective function).
This function also contains some tools and inputs reserved for the calibration toolbox, such as the ability to take the negative of the objective function to maximize instead of minimize a metric according to the needs of the optimizing algorithm.
- xhydro.modelling.obj_funcs.get_objective_function(qobs: ndarray | Dataset, qsim: ndarray | Dataset, obj_func: str = 'rmse', take_negative: bool = False, mask: ndarray | Dataset | None = None, transform: str | None = None, epsilon: float | None = None)[source]¶
Entrypoint function for the objective function calculation.
More can be added by adding the function to this file and adding the option in this function.
Notes¶
All data corresponding to NaN values in the observation set are removed from the calculation. If a mask is passed, it must be the same size as the qsim and qobs vectors. If any NaNs are present in the qobs dataset, all corresponding data in the qobs, qsim and mask will be removed prior to passing to the processing function.
Parameters¶
- qobsarray_like
Vector containing the observed streamflow to be used in the objective function calculation. It is the target to attain.
- qsimarray_like
Vector containing the simulated streamflow as generated by the hydrological model. It is modified by changing parameters and re-simulating the hydrological model.
- obj_funcstr
String representing the objective function to use in the calibration. Must be one of the accepted objective functions:
  - "abs_bias": Absolute value of the "bias" metric
  - "abs_pbias": Absolute value of the "pbias" metric
  - "abs_volume_error": Absolute value of the volume_error metric
  - "agreement_index": Index of agreement
  - "bias": Bias metric
  - "correlation_coeff": Correlation coefficient
  - "kge": Kling Gupta Efficiency metric (2009 version)
  - "kge_mod": Kling Gupta Efficiency metric (2012 version)
  - "mae": Mean Absolute Error metric
  - "mare": Mean Absolute Relative Error metric
  - "mse": Mean Square Error metric
  - "nse": Nash-Sutcliffe Efficiency metric
  - "pbias": Percent bias (relative bias)
  - "r2": r-squared, i.e. square of correlation_coeff
  - "rmse": Root Mean Square Error
  - "rrmse": Relative Root Mean Square Error (RMSE-to-mean ratio)
  - "rsr": Ratio of RMSE to standard deviation
  - "volume_error": Total volume error over the period
The default is "rmse".
- take_negativebool
Used to force the objective function to be multiplied by minus one (-1) such that it is possible to maximize it if the optimizer is a minimizer and vice versa. Should always be set to False unless required by an optimization setup, which is handled internally and transparently to the user. The default is False.
- maskarray_like
Array of 0 or 1 on which the objective function should be applied. Values of 1 indicate that the value is included in the calculation, and values of 0 indicate that the value is excluded and will have no impact on the objective function calculation. This can be useful for specific optimization strategies such as odd/even year calibration, seasonal calibration or calibration based on high/low flows. The default is None and all data are preserved.
- transformstr
Indicates the type of transformation required. Can be one of the following values:
  - "sqrt": Square root transformation of the flows [sqrt(Q)]
  - "log": Logarithmic transformation of the flows [log(Q)]
  - "inv": Inverse transformation of the flows [1/Q]
The default is None, in which case no transformation is performed.
- epsilonfloat
Indicates the perturbation to add to the flow time series during a transformation, to avoid division by zero (inverse transform) and the logarithm of zero (log transform). The perturbation is equal to: perturbation = epsilon * mean(qobs). The default value is 0.01.
Returns¶
- float
Value of the selected objective function (obj_func).
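The handling of NaN observations, the mask and take_negative described above can be sketched as follows. This is an illustration with only two metrics ("rmse" and "nse"), not the xhydro implementation; the function name objective_sketch is hypothetical and plain lists replace arrays so that the example has no dependencies.

```python
import math


def objective_sketch(qobs, qsim, obj_func="rmse", take_negative=False, mask=None):
    """Compute an objective function after removing NaN and masked values."""
    if mask is None:
        mask = [1] * len(qobs)  # by default, all data are preserved
    # Keep only pairs where the mask is 1 and the observation is not NaN.
    pairs = [(o, s) for o, s, m in zip(qobs, qsim, mask)
             if m == 1 and not math.isnan(o)]
    obs = [o for o, _ in pairs]
    sse = sum((o - s) ** 2 for o, s in pairs)
    if obj_func == "rmse":
        value = math.sqrt(sse / len(pairs))
    elif obj_func == "nse":
        mean_obs = sum(obs) / len(obs)
        value = 1.0 - sse / sum((o - mean_obs) ** 2 for o in obs)
    else:
        raise ValueError(f"Unsupported objective function: {obj_func!r}")
    # Flip the sign when the optimizer needs the opposite direction.
    return -value if take_negative else value


qobs = [1.0, float("nan"), 3.0, 4.0]
qsim = [1.0, 2.0, 3.0, 6.0]

rmse_all = objective_sketch(qobs, qsim)                       # NaN pair dropped
rmse_masked = objective_sketch(qobs, qsim, mask=[1, 1, 1, 0])  # last pair excluded
nse_all = objective_sketch(qobs, qsim, obj_func="nse")
neg_rmse = objective_sketch(qobs, qsim, take_negative=True)
```

In this example the masked call excludes the only pair with an error, so the masked RMSE is zero while the unmasked one is not.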
- xhydro.modelling.obj_funcs.transform_flows(qsim: ndarray, qobs: ndarray, transform: str | None = None, epsilon: float = 0.01) tuple[ndarray, ndarray] [source]¶
Transform flows before computing the objective function.
It is used to transform flows so that the objective function is computed on a transformed flow metric rather than on the original units of flow (e.g. inverse, log-transformed, square root).
Parameters¶
- qsimarray_like
Simulated streamflow vector.
- qobsarray_like
Observed streamflow vector.
- transformstr, optional
Indicates the type of transformation required. Can be one of the following values:
  - "sqrt": Square root transformation of the flows [sqrt(Q)]
  - "log": Logarithmic transformation of the flows [log(Q)]
  - "inv": Inverse transformation of the flows [1/Q]
The default is None, in which case no transformation is performed.
- epsilonfloat
Indicates the perturbation to add to the flow time series during a transformation, to avoid division by zero (inverse transform) and the logarithm of zero (log transform). The perturbation is equal to: perturbation = epsilon * mean(qobs). The default value is 0.01.
Returns¶
- qsimarray_like
Transformed simulated flow according to user request.
- qobsarray_like
Transformed observed flow according to user request.
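The three transformations and the epsilon perturbation can be sketched as below. This is a minimal illustration rather than the xhydro source: the name transform_flows_sketch is hypothetical, plain lists replace arrays, and applying the same perturbation to both qsim and qobs is an assumption based on the parameter description above.

```python
import math


def transform_flows_sketch(qsim, qobs, transform=None, epsilon=0.01):
    """Return transformed copies of (qsim, qobs)."""
    if transform is None:
        return list(qsim), list(qobs)
    if transform == "sqrt":
        return [math.sqrt(q) for q in qsim], [math.sqrt(q) for q in qobs]
    # perturbation = epsilon * mean(qobs), as described above
    delta = epsilon * (sum(qobs) / len(qobs))
    if transform == "log":
        return ([math.log(q + delta) for q in qsim],
                [math.log(q + delta) for q in qobs])
    if transform == "inv":
        return ([1.0 / (q + delta) for q in qsim],
                [1.0 / (q + delta) for q in qobs])
    raise ValueError(f"Unknown transform: {transform!r}")


qobs = [0.0, 2.0, 4.0]  # contains a zero-flow day
qsim = [1.0, 2.0, 3.0]

inv_sim, inv_obs = transform_flows_sketch(qsim, qobs, transform="inv")
log_sim, log_obs = transform_flows_sketch(qsim, qobs, transform="log")
sqrt_sim, sqrt_obs = transform_flows_sketch(qsim, qobs, transform="sqrt")
```

With a mean observed flow of 2.0 and the default epsilon, the perturbation is 0.02, so the zero-flow day yields finite values for both the inverse and the log transform.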