xhydro package¶
Hydrological analysis library built with xarray.
Subpackages¶
- xhydro.frequency_analysis package
- xhydro.modelling package
- xhydro.optimal_interpolation package
- xhydro.testing package
Submodules¶
xhydro.cc module¶
Module to compute climate change statistics using xscen functions.
- xhydro.cc.climatological_op(ds: Dataset, *, op: str | dict = 'mean', window: int | None = None, min_periods: int | float | None = None, stride: int = 1, periods: list[str] | list[list[str]] | None = None, rename_variables: bool = True, to_level: str = 'climatology', horizons_as_dim: bool = False) Dataset [source]¶
Perform an operation “op” over time, for given time periods, respecting the temporal resolution of ds.
Parameters¶
- ds: xr.Dataset
Dataset to use for the computation.
- op: str or dict
Operation to perform over time. The operation can be any method name of xarray.core.rolling.DatasetRolling, “linregress”, or a dictionary. If “op” is a dictionary, the key is the operation name and the value is a dict of kwargs accepted by the operation. While other operations are technically possible, the following are recommended and tested: [“max”, “mean”, “median”, “min”, “std”, “sum”, “var”, “linregress”]. Operations beyond methods of xarray.core.rolling.DatasetRolling include:
“linregress” : Computes the linear regression over time, using scipy.stats.linregress and employing years as regressors. The output will have a new dimension “linreg_param” with coordinates: [“slope”, “intercept”, “rvalue”, “pvalue”, “stderr”, “intercept_stderr”].
Only one operation per call is supported, so len(op)==1 if a dict.
- window: int, optional
Number of years to use for the rolling operation. If left at None and periods is given, window will be the size of the first period. Hence, if periods are of different lengths, the shortest period should be passed first. If left at None and periods is not given, the window will be the size of the input dataset.
- min_periods: int or float, optional
For the rolling operation, minimum number of years required for a value to be computed. If left at None and the xrfreq is either QS or AS and doesn’t start in January, min_periods will be one less than window. Otherwise, if left at None, it will be deemed the same as “window”. If passed as a float value between 0 and 1, this will be interpreted as the floor of the fraction of the window size.
- stride: int
Stride (in years) at which to provide an output from the rolling window operation.
- periods: list of str or list of lists of str, optional
Either [start, end] or list of [start, end] of continuous periods to be considered. This is needed when the time axis of ds contains some jumps in time. If None, the dataset will be considered continuous.
- rename_variables: bool
If True, “_clim_{op}” will be added to variable names.
- to_level: str, optional
The processing level to assign to the output. If None, the processing level of the inputs is preserved.
- horizons_as_dim: bool
If True, the output will have “horizon” and the frequency as “month”, “season” or “year” as dimensions and coordinates. The “time” coordinate will be unstacked to horizon and frequency dimensions. Horizons originate from periods and/or windows and their stride in the rolling operation.
Returns¶
- xr.Dataset
Dataset with the results from the climatological operation.
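The interplay of window, min_periods and stride can be illustrated with a plain-Python sketch over a list of annual values. This is not the xscen implementation: resolve_min_periods and rolling_climatology are hypothetical helpers, the horizon label format is an assumption, and the special case for QS/AS frequencies that do not start in January is left out.

```python
import math


def resolve_min_periods(window, min_periods=None):
    """Turn the documented min_periods options into a number of years."""
    if min_periods is None:
        return window  # default: a full window is required
    if isinstance(min_periods, float) and 0 < min_periods < 1:
        return math.floor(min_periods * window)  # fraction of the window size
    return int(min_periods)


def rolling_climatology(years, values, window, min_periods=None, stride=1):
    """Rolling mean over `window` years, with one output every `stride` years."""
    needed = resolve_min_periods(window, min_periods)
    out = {}
    for start in range(0, len(values) - window + 1, stride):
        chunk = [v for v in values[start:start + window] if v is not None]
        horizon = f"{years[start]}-{years[start + window - 1]}"
        enough = bool(chunk) and len(chunk) >= needed
        out[horizon] = sum(chunk) / len(chunk) if enough else None
    return out


years = list(range(2001, 2007))
flows = [10.0, 12.0, None, 14.0, 16.0, 18.0]  # hypothetical annual values
clim = rolling_climatology(years, flows, window=3, min_periods=0.7)
```

With min_periods=0.7 and window=3, two valid years are enough, so the windows containing the missing year still produce a value.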
- xhydro.cc.compute_deltas(ds: Dataset, reference_horizon: str | Dataset, *, kind: str | dict = '+', rename_variables: bool = True, to_level: str | None = 'deltas') Dataset [source]¶
Compute deltas in comparison to a reference time period, respecting the temporal resolution of ds.
Parameters¶
- ds: xr.Dataset
Dataset to use for the computation.
- reference_horizon: str or xr.Dataset
Either a YYYY-YYYY string corresponding to the “horizon” coordinate of the reference period, or a xr.Dataset containing the climatological mean.
- kind: str or dict
[“+”, “/”, “%”] Whether to provide absolute, relative, or percentage deltas. Can also be a dictionary separated per variable name.
- rename_variables: bool
If True, “_delta_YYYY-YYYY” will be added to variable names.
- to_level: str, optional
The processing level to assign to the output. If None, the processing level of the inputs is preserved.
Returns¶
- xr.Dataset
Returns a Dataset with the requested deltas.
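The three values of kind can be read as the following scalar operations. This is a minimal sketch that assumes the usual definitions of absolute, relative and percentage deltas; compute_delta is a hypothetical helper and the variable names in kind_per_variable are placeholders.

```python
def compute_delta(future, reference, kind="+"):
    """Delta of a future value with respect to a reference value."""
    if kind == "+":
        return future - reference  # absolute delta
    if kind == "/":
        return future / reference  # relative delta (ratio)
    if kind == "%":
        return 100 * (future - reference) / reference  # percentage delta
    raise ValueError(f"Unknown kind: {kind!r}")


# A dictionary can assign a different kind to each variable (names are hypothetical).
kind_per_variable = {"streamflow": "%", "tas": "+"}

absolute = compute_delta(120.0, 100.0, "+")
relative = compute_delta(120.0, 100.0, "/")
percentage = compute_delta(120.0, 100.0, "%")
```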
- xhydro.cc.ensemble_stats(datasets: dict | list[str | PathLike] | list[Dataset] | list[DataArray] | Dataset, statistics: dict, *, create_kwargs: dict | None = None, weights: DataArray | None = None, common_attrs_only: bool = True, to_level: str = 'ensemble') Dataset [source]¶
Create an ensemble and compute statistics on it.
Parameters¶
- datasets: dict or list of [str, os.PathLike, Dataset or DataArray], or Dataset
List of file paths or xarray Dataset/DataArray objects to include in the ensemble. A dictionary can be passed instead of a list, in which case the keys are used as coordinates along the new realization axis. Tip: With a project catalog, you can do: datasets = pcat.search(**search_dict).to_dataset_dict(). If a single Dataset is passed, it is assumed to already be an ensemble and will be used as is. The “realization” dimension is required.
- statistics: dict
xclim.ensembles statistics to be called. Dictionary in the format {function: arguments}. If a function requires “weights”, you can leave it out of this dictionary and it will be applied automatically if the “weights” argument is provided. See the Notes section for more details on robustness statistics, which are more complex in their usage.
- create_kwargs: dict, optional
Dictionary of arguments for xclim.ensembles.create_ensemble.
- weights: xr.DataArray, optional
Weights to apply along the “realization” dimension. This array cannot contain missing values.
- common_attrs_only: bool
If True, keeps only the global attributes that are the same for all datasets and generates a new id. If False, keeps the global attributes of the first dataset (same behaviour as xclim.ensembles.create_ensemble).
- to_level: str
The processing level to assign to the output.
Returns¶
- xr.Dataset
Dataset with ensemble statistics.
Notes¶
The positive fraction in “change_significance” and “robustness_fractions” is calculated by xclim using “v > 0”, which is not appropriate for relative deltas. This function will attempt to detect relative deltas by using the “delta_kind” attribute (“rel.”, “relative”, “*”, or “/”) and will apply “v - 1” before calling the function.
The “robustness_categories” statistic requires the outputs of “robustness_fractions”. Thus, there are two ways to build the “statistics” dictionary:
- Having “robustness_fractions” and “robustness_categories” as separate entries in the dictionary. In this case, all outputs will be returned.
- Having “robustness_fractions” as a nested dictionary under “robustness_categories”. In this case, only the robustness categories will be returned.
A “ref” DataArray can be passed to “change_significance” and “robustness_fractions”, which will be used by xclim to compute deltas and perform some significance tests. However, this supposes that both “datasets” and “ref” are still timeseries (e.g. annual means), not climatologies where the “time” dimension represents the period over which the climatology was computed. Thus, using “ref” is only accepted if “robustness_fractions” (or “robustness_categories”) is the only statistic being computed.
If you want to compute a robustness statistic on a climatology, you should first compute the climatologies and deltas yourself, then leave “ref” as None and pass the deltas as the “datasets” argument. This will be compatible with other statistics.
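The two layouts of the “statistics” dictionary and the “v - 1” correction for relative deltas can be sketched as follows. The empty dictionaries stand in for keyword arguments that are not shown here, and positive_fraction is a hypothetical helper that only mimics the “v > 0” test described above; it is not the xclim code.

```python
# Layout 1: separate entries, all outputs are returned.
statistics_all_outputs = {
    "robustness_fractions": {},  # placeholder for keyword arguments
    "robustness_categories": {},  # placeholder for keyword arguments
}

# Layout 2: nested entry, only the robustness categories are returned.
statistics_categories_only = {
    "robustness_categories": {"robustness_fractions": {}},
}

RELATIVE_KINDS = {"rel.", "relative", "*", "/"}


def positive_fraction(deltas, delta_kind="absolute"):
    """Fraction of ensemble members showing a positive change."""
    if delta_kind in RELATIVE_KINDS:
        # A relative delta of 1 means "no change", so shift it to 0 first.
        deltas = [v - 1 for v in deltas]
    return sum(v > 0 for v in deltas) / len(deltas)


relative_deltas = [0.9, 1.1, 1.2, 0.8]  # hypothetical ensemble of ratios
naive = positive_fraction(relative_deltas)  # treats every ratio as positive
corrected = positive_fraction(relative_deltas, delta_kind="/")
```

Without the correction, every ratio counts as a positive change even though half of the members show a decrease.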
See Also¶
xclim.ensembles._base.create_ensemble, xclim.ensembles._base.ensemble_percentiles, xclim.ensembles._base.ensemble_mean_std_max_min, xclim.ensembles._robustness.robustness_fractions, xclim.ensembles._robustness.robustness_categories, xclim.ensembles._robustness.robustness_coefficient
- xhydro.cc.produce_horizon(ds: Dataset, indicators: str | PathLike | Sequence[Indicator] | Sequence[tuple[str, Indicator]] | ModuleType, *, periods: list[str] | list[list[str]] | None = None, warminglevels: dict | None = None, to_level: str | None = 'horizons') Dataset [source]¶
Compute indicators, then the climatological mean, and finally unstack dates in order to have a single dataset with all indicators of different frequencies.
Once this is done, the function drops “time” in favor of “horizon”. This function computes the indicators and does an interannual mean. It stacks the season and month in different dimensions and adds a dimension horizon for the period or the warming level, if given.
Parameters¶
- ds: xr.Dataset
Input dataset with a time dimension.
- indicators: Union[str, os.PathLike, Sequence[Indicator], Sequence[Tuple[str, Indicator]], ModuleType]
Indicators to compute. It will be passed to the indicators argument of xs.compute_indicators.
- periods: list of str or list of lists of str, optional
Either [start, end] or list of [start_year, end_year] for the period(s) to be evaluated. If both periods and warminglevels are None, the full time series will be used.
- warminglevels: dict, optional
Dictionary of arguments to pass to xscen.subset_warming_level. If “wl” is a list, the function will be called for each value and produce multiple horizons. If both periods and warminglevels are None, the full time series will be used.
- to_level: str, optional
The processing level to assign to the output. If there is only one horizon, you can use « {wl} », « {period0} » and « {period1} » in the string to dynamically include that information in the processing level.
Returns¶
- xr.Dataset
Horizon dataset.
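The placeholders accepted by to_level behave like fields of a format string. The sketch below uses Python's str.format purely as an illustration; the template strings and the values are hypothetical, and how the library substitutes them internally is not shown here.

```python
# Hypothetical templates using the documented placeholders.
period_template = "horizon{period0}-{period1}"
warming_level_template = "wl{wl}"

period_level = period_template.format(period0="1981", period1="2010")
warming_level = warming_level_template.format(wl=2)
```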
- xhydro.cc.sampled_indicators(ds: Dataset, deltas: Dataset, delta_type: str, *, ds_weights: DataArray | None = None, delta_weights: DataArray | None = None, n: int = 50000, seed: int | None = None, return_dist: bool = False) Dataset | tuple[Dataset, Dataset, Dataset, Dataset] [source]¶
Compute future indicators using a perturbation approach and random sampling.
Parameters¶
- ds: xr.Dataset
Dataset containing the historical indicators. The indicators are expected to be represented by a distribution of pre-computed percentiles. The percentiles should be stored in either a dimension called « percentile » [0, 100] or « quantile » [0, 1].
- deltas: xr.Dataset
Dataset containing the future deltas to apply to the historical indicators.
- delta_type: str
Type of delta provided. Must be one of [“absolute”, “percentage”].
- ds_weights: xr.DataArray, optional
Weights to use when sampling the historical indicators, for dimensions other than “percentile”/”quantile”. Dimensions not present in this Dataset, or if None, will be sampled uniformly unless they are shared with “deltas”.
- delta_weights: xr.DataArray, optional
Weights to use when sampling the deltas, such as along the “realization” dimension. Dimensions not present in this Dataset, or if None, will be sampled uniformly unless they are shared with “ds”.
- n: int
Number of samples to generate.
- seed: int, optional
Seed to use for the random number generator.
- return_dist: bool
Whether to return the full distributions (ds, deltas, fut) or only the percentiles.
Returns¶
- fut_pct: xr.Dataset
Dataset containing the future percentiles.
- ds_dist: xr.Dataset
The historical distribution, stacked along the “sample” dimension.
- deltas_dist: xr.Dataset
The delta distribution, stacked along the “sample” dimension.
- fut_dist: xr.Dataset
The future distribution, stacked along the “sample” dimension.
Notes¶
The future percentiles are computed as follows:
1. Sample “n” values from the historical distribution, weighting the percentiles by their associated coverage.
2. Sample “n” values from the delta distribution, using the provided weights.
3. Create the future distribution by applying the sampled deltas to the sampled historical distribution, element-wise.
4. Compute the percentiles of the future distribution.
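The four steps can be sketched in plain Python for a single indicator. This is an illustration under stated assumptions, not the library code: the coverage of a percentile is taken as the width of the interval between the midpoints to its neighbours, deltas are treated as “percentage” deltas applied as hist * (1 + delta / 100), and the final percentiles use a simple nearest-rank rule. All function names and input values are hypothetical.

```python
import random


def percentile_coverage(percentiles):
    """Share of the [0, 100] range represented by each percentile."""
    mids = [(a + b) / 2 for a, b in zip(percentiles, percentiles[1:])]
    edges = [0.0] + mids + [100.0]
    return [hi - lo for lo, hi in zip(edges, edges[1:])]


def sample_future(hist_values, percentiles, deltas, delta_weights=None, n=1000, seed=None):
    rng = random.Random(seed)
    # Step 1: sample the historical percentiles, weighted by their coverage.
    hist = rng.choices(hist_values, weights=percentile_coverage(percentiles), k=n)
    # Step 2: sample the deltas, using the provided weights (uniform if None).
    dlt = rng.choices(deltas, weights=delta_weights, k=n)
    # Step 3: apply the sampled deltas element-wise (percentage deltas assumed).
    fut = [h * (1 + d / 100) for h, d in zip(hist, dlt)]
    return hist, dlt, fut


def percentile(values, q):
    """Nearest-rank percentile, with q in [0, 100]."""
    ordered = sorted(values)
    return ordered[round(q / 100 * (len(ordered) - 1))]


hist_dist, deltas_dist, fut_dist = sample_future(
    hist_values=[5.0, 10.0, 20.0],  # values at the 10th, 50th and 90th percentiles
    percentiles=[10, 50, 90],
    deltas=[-10.0, 0.0, 25.0],  # one percentage delta per realization
    n=2000,
    seed=42,
)
# Step 4: percentiles of the future distribution.
fut_pct = {q: percentile(fut_dist, q) for q in (10, 50, 90)}
```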
xhydro.gis module¶
Module to compute geospatial operations useful in hydrology.
- xhydro.gis.land_use_classification(gdf: GeoDataFrame, unique_id: str | None = None, output_format: str = 'geopandas', collection='io-lulc-9-class', year: str | int = 'latest') GeoDataFrame | Dataset [source]¶
Calculate land use classification.
Calculate land use classification for each polygon from a gpd.GeoDataFrame. By default, the classes are generated from the Planetary Computer’s STAC catalog using the 10m Annual Land Use Land Cover dataset.
Parameters¶
- gdf: gpd.GeoDataFrame
GeoDataFrame containing watershed polygons. Must have a defined .crs attribute.
- unique_id: str, optional
The column name in the GeoDataFrame that serves as a unique identifier.
- output_format: str
One of either xarray (or xr.Dataset) or geopandas (or gpd.GeoDataFrame).
- collection: str
Collection name from the Planetary Computer STAC Catalog.
- year: str | int
Land use dataset year between 2017 and 2022.
Returns¶
- gpd.GeoDataFrame or xr.Dataset
Output dataset containing the watershed properties.
Warnings¶
This function relies on the Microsoft Planetary Computer’s STAC Catalog to retrieve the land use data.
- xhydro.gis.land_use_plot(gdf: GeoDataFrame, idx: int = 0, unique_id: str | None = None, collection: str = 'io-lulc-9-class', year: str | int = 'latest') None [source]¶
Plot a land use map for a specific year and watershed.
Parameters¶
- gdf: gpd.GeoDataFrame
GeoDataFrame containing watershed polygons. Must have a defined .crs attribute.
- idx: int
Index to select in gpd.GeoDataFrame.
- unique_id: str, optional
The column name in the GeoDataFrame that serves as a unique identifier.
- collection: str
Collection name from the Planetary Computer STAC Catalog.
- year: str | int
Land use dataset year between 2017 and 2022.
Returns¶
- None
Nothing to return.
Warnings¶
This function relies on the Microsoft Planetary Computer’s STAC Catalog to retrieve the land use data.
- xhydro.gis.surface_properties(gdf: GeoDataFrame, unique_id: str | None = None, projected_crs: int = 6622, output_format: str = 'geopandas', operation: str = 'mean', dataset_date: str = '2021-04-22', collection: str = 'cop-dem-glo-90') GeoDataFrame | Dataset [source]¶
Surface properties for watersheds.
Surface properties are calculated using Copernicus’s GLO-90 Digital Elevation Model. By default, the dataset has a geographic coordinate system (EPSG: 4326) and this function expects a projected crs for more accurate results.
The calculated properties are:
- elevation (meters)
- slope (degrees)
- aspect ratio (degrees)
Parameters¶
- gdf: gpd.GeoDataFrame
GeoDataFrame containing watershed polygons. Must have a defined .crs attribute.
- unique_id: str, optional
The column name in the GeoDataFrame that serves as a unique identifier.
- projected_crs: int
The projected coordinate reference system (crs) to utilize for calculations, such as determining watershed area.
- output_format: str
One of either xarray (or xr.Dataset) or geopandas (or gpd.GeoDataFrame).
- operation: str
Aggregation statistics such as mean or sum.
- dataset_date: str
Date (%Y-%m-%d) for which to select the imagery from the dataset. Date must be available.
- collection: str
Collection name from the Planetary Computer STAC Catalog. Default is cop-dem-glo-90.
Returns¶
- gpd.GeoDataFrame or xr.Dataset
Output dataset containing the surface properties.
Warnings¶
This function relies on the Microsoft Planetary Computer’s STAC Catalog to retrieve the Digital Elevation Model (DEM) data.
- xhydro.gis.watershed_delineation(*, coordinates: list[tuple] | tuple | None = None, map: Map | None = None) GeoDataFrame [source]¶
Calculate watershed delineation from pour point.
Watershed delineation can be computed at any location in North America using HydroBASINS (hybas_na_lev01-12_v1c). The process involves assessing all upstream sub-basins from a specified pour point and consolidating them into a unified watershed.
Parameters¶
- coordinates: list of tuple, tuple, optional
Geographic coordinates (longitude, latitude) for the location where watershed delineation will be conducted.
- map: leafmap.Map, optional
If the function receives both a map and coordinates as inputs, it will generate and display watershed boundaries on the map. Additionally, any markers present on the map will be transformed into corresponding watershed boundaries for each marker.
Returns¶
- gpd.GeoDataFrame
GeoDataFrame containing the watershed boundaries.
Warnings¶
This function relies on an Amazon S3-hosted dataset to delineate watersheds.
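The idea of collecting every sub-basin upstream of a pour point can be sketched as a graph traversal. HydroBASINS links each sub-basin to the one immediately downstream through its NEXT_DOWN field (0 at an outlet); the toy identifiers below are hypothetical and upstream_basins is an illustrative helper, not the function used by the library.

```python
from collections import defaultdict, deque


def upstream_basins(next_down, pour_basin):
    """Return the pour-point basin and every sub-basin draining into it."""
    # Invert the downstream links to get, for each basin, its direct tributaries.
    upstream_of = defaultdict(list)
    for basin, downstream in next_down.items():
        upstream_of[downstream].append(basin)

    found, queue = {pour_basin}, deque([pour_basin])
    while queue:
        current = queue.popleft()
        for basin in upstream_of[current]:
            if basin not in found:
                found.add(basin)
                queue.append(basin)
    return found


# Toy network: basin id -> id of the next basin downstream (0 means outlet).
next_down = {1: 0, 2: 1, 3: 1, 4: 2, 5: 0}
watershed_at_2 = upstream_basins(next_down, 2)
watershed_at_1 = upstream_basins(next_down, 1)
```

The polygons of the returned sub-basins would then be merged into a single watershed boundary.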
- xhydro.gis.watershed_properties(gdf: GeoDataFrame, unique_id: str | None = None, projected_crs: int = 6622, output_format: str = 'geopandas') GeoDataFrame | Dataset [source]¶
Watershed properties extracted from a gpd.GeoDataFrame.
The calculated properties are:
- area
- perimeter
- gravelius
- centroid coordinates
Parameters¶
- gdf: gpd.GeoDataFrame
GeoDataFrame containing watershed polygons. Must have a defined .crs attribute.
- unique_id: str, optional
The column name in the GeoDataFrame that serves as a unique identifier.
- projected_crs: int
The projected coordinate reference system (crs) to utilize for calculations, such as determining watershed area.
- output_format: str
One of either xarray (or xr.Dataset) or geopandas (or gpd.GeoDataFrame).
Returns¶
- gpd.GeoDataFrame or xr.Dataset
Output dataset containing the watershed properties.
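The “gravelius” property refers to the Gravelius compactness index. A minimal sketch, assuming the standard definition (perimeter of the watershed divided by the circumference of a circle of equal area) and area and perimeter expressed in consistent units of the projected crs:

```python
import math


def gravelius(area, perimeter):
    """Gravelius compactness index: 1 for a circle, larger for elongated shapes."""
    return perimeter / (2 * math.sqrt(math.pi * area))


# A circular watershed of radius 1000 m.
circle = gravelius(area=math.pi * 1000**2, perimeter=2 * math.pi * 1000)

# A square watershed with 1000 m sides.
square = gravelius(area=1000**2, perimeter=4 * 1000)
```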
xhydro.indicators module¶
Module to compute indicators using xclim’s build_indicator_module_from_yaml.
- xhydro.indicators.compute_indicators(ds: Dataset, indicators: str | PathLike | Sequence[Indicator] | Sequence[tuple[str, Indicator]] | ModuleType, *, periods: list[str] | list[list[str]] | None = None, restrict_years: bool = True, to_level: str | None = 'indicators', rechunk_input: bool = False) dict [source]¶
Calculate variables and indicators based on a YAML call to xclim.
The function cuts the output to be the same years as the inputs. Hence, if an indicator creates a timestep outside the original year range (e.g. the first DJF for QS-DEC), it will not appear in the output.
Parameters¶
- ds: xr.Dataset
Dataset to use for the indicators.
- indicators: Union[str, os.PathLike, Sequence[Indicator], Sequence[tuple[str, Indicator]], ModuleType]
Path to a YAML file that instructs on how to calculate missing variables. Can also be only the « stem », if translations and custom indices are implemented. Can be the indicator module directly, or a sequence of indicators or a sequence of tuples (indicator name, indicator) as returned by iter_indicators().
- periods: list of str or list of lists of str, optional
Either [start, end] or list of [start, end] of continuous periods over which to compute the indicators. This is needed when the time axis of ds contains some jumps in time. If None, the dataset will be considered continuous.
- restrict_years: bool
If True, cut the time axis to be within the same years as the input. This is mostly useful for frequencies that do not start in January, such as QS-DEC. In that instance, xclim would start on previous_year-12-01 (DJF), with a NaN. restrict_years will cut that first timestep. This should have no effect on YS and MS indicators.
- to_level: str, optional
The processing level to assign to the output. If None, the processing level of the inputs is preserved.
- rechunk_input: bool
If True, the dataset is rechunked with flox.xarray.rechunk_for_blockwise() according to the resampling frequency of the indicator. Each rechunking is done once per frequency with xscen.utils.rechunk_for_resample().
Returns¶
- dict
Dictionary (keys = timedeltas) with indicators separated by temporal resolution.
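The effect of restrict_years on a QS-DEC indicator can be illustrated with plain dates. This sketch only mimics the described behaviour; restrict_to_input_years is a hypothetical helper and the dates are made up.

```python
from datetime import date


def restrict_to_input_years(timestamps, first_year, last_year):
    """Keep only the time steps that fall within the years of the input."""
    return [t for t in timestamps if first_year <= t.year <= last_year]


# Seasonal (QS-DEC) output for an input covering the year 2000 only:
# the first DJF season starts on 1999-12-01 and would hold a NaN.
seasonal_steps = [
    date(1999, 12, 1),
    date(2000, 3, 1),
    date(2000, 6, 1),
    date(2000, 9, 1),
    date(2000, 12, 1),
]
kept = restrict_to_input_years(seasonal_steps, 2000, 2000)
```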
See Also¶
xclim.indicators, xclim.core.indicator.build_indicator_module_from_yaml
- xhydro.indicators.compute_volume(da: DataArray, *, out_units: str = 'm3', attrs: dict | None = None) DataArray [source]¶
Compute the volume of water from a streamflow variable, keeping the same frequency.
Parameters¶
- da: xr.DataArray
Streamflow variable.
- out_units: str
Output units. Defaults to « m3 ».
- attrs: dict, optional
Attributes to add to the output variable. Default attributes for « long_name », « units », « cell_methods » and « description » will be added if not provided.
Returns¶
- xr.DataArray
Volume of water.
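Conceptually, the volume is the streamflow multiplied by the duration of the time step. The sketch below assumes a daily time step and a streamflow in m3 s-1; daily_volume and its small conversion table are hypothetical and do not reflect how the library handles units.

```python
SECONDS_PER_DAY = 86_400

# Hypothetical conversion factors from cubic metres.
FACTORS = {"m3": 1.0, "hm3": 1e-6, "km3": 1e-9}


def daily_volume(flow_m3s, out_units="m3"):
    """Volume of water passing in one day for a mean streamflow in m3 s-1."""
    return flow_m3s * SECONDS_PER_DAY * FACTORS[out_units]


volume_m3 = daily_volume(10.0)
volume_hm3 = daily_volume(10.0, out_units="hm3")
```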
- xhydro.indicators.get_yearly_op(ds, op, *, input_var: str = 'streamflow', window: int = 1, timeargs: dict | None = None, missing: str = 'skip', missing_options: dict | None = None, interpolate_na: bool = False) Dataset [source]¶
Compute yearly operations on a variable.
Parameters¶
- ds: xr.Dataset
Dataset containing the variable to compute the operation on.
- op: str
Operation to compute. One of [« max », « min », « mean », « sum »].
- input_var: str
Name of the input variable. Defaults to « streamflow ».
- window: int
Size of the rolling window. A « mean » operation is performed on the rolling window before the call to xclim. This parameter cannot be used with the « sum » operation.
- timeargs: dict, optional
Dictionary of time arguments for the operation. Keys are the name of the period that will be added to the results (e.g. « winter », « summer », « annual »). Values are up to two dictionaries, with both being optional. The first is {“freq”: str}, where str is a frequency supported by xarray (e.g. « YS », « YS-JAN », « YS-DEC »). It needs to be a yearly frequency. Defaults to « YS-JAN ». The second is an indexer as supported by xclim.core.calendar.select_time(). Defaults to {}, which means the whole year. See xclim.core.calendar.select_time() for more information. Examples: {« winter »: {« freq »: « YS-DEC », « date_bounds »: [« 12-01 », « 02-28 »]}}, {« jan »: {« freq »: « YS », « month »: 1}}, {« annual »: {}}.
- missing: str
How to handle missing values. One of « skip », « any », « at_least_n », « pct », « wmo ». See xclim.core.missing() for more information.
- missing_options: dict, optional
Dictionary of options for the missing values' method. See xclim.core.missing() for more information.
- interpolate_na: bool
Whether to interpolate missing values before computing the operation. Only used with the « sum » operation. Defaults to False.
Returns¶
- xr.Dataset
Dataset containing the computed operations, with one variable per indexer. The name of the variable follows the pattern {input_var}{window}_{op}_{indexer}.
Notes¶
If you want to perform a frequency analysis on a frequency that is finer than annual, simply use multiple timeargs (e.g. 1 per month) to create multiple distinct variables.
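The timeargs examples from the parameter description can be combined in one dictionary, and the role of window can be sketched on a list of values from a single year. The rolling mean below is trailing, which is an assumption, and yearly_op is a hypothetical helper rather than the library function.

```python
# One entry per period of interest, as in the examples of the timeargs description.
timeargs = {
    "winter": {"freq": "YS-DEC", "date_bounds": ["12-01", "02-28"]},
    "jan": {"freq": "YS", "month": 1},
    "annual": {},
}


def rolling_mean(values, window):
    """Trailing rolling mean; only full windows are returned."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def yearly_op(values, op="max", window=1):
    """Apply a rolling mean, then reduce the year with the requested operation."""
    if op == "sum" and window != 1:
        raise ValueError("window cannot be used with the 'sum' operation")
    ops = {"max": max, "min": min, "sum": sum, "mean": lambda v: sum(v) / len(v)}
    return ops[op](rolling_mean(values, window))


flows = [1.0, 5.0, 9.0, 5.0, 1.0]  # hypothetical values within one year
peak_1 = yearly_op(flows, "max", window=1)
peak_3 = yearly_op(flows, "max", window=3)
```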
xhydro.pmp module¶
Module to compute Probable Maximum Precipitation.
- xhydro.pmp.compute_spring_and_summer_mask(snw: DataArray, thresh: str = '1 cm', window_wint_start: int = 14, window_wint_end: int = 45, spr_start: int = 60, spr_end: int = 30, freq: str = 'YS-JUL')[source]¶
Create a mask that defines the spring and summer seasons based on the snow water equivalent.
Parameters¶
- snw: xarray.DataArray
Snow water equivalent. Must be a length (e.g. « mm ») or a mass (e.g. « kg m-2 »).
- thresh: Quantified
Threshold snow thickness to define the start and end of winter.
- window_wint_start: int
Minimum number of days with snow depth above or equal to threshold to define the start of winter.
- window_wint_end: int
Maximum number of days with snow depth below or equal to threshold to define the end of winter.
- spr_start: int
Number of days before the end of winter to define the start of spring.
- spr_end: int
Number of days after the end of winter to define the end of spring.
- freq: str
Frequency of the time axis (annual frequency). Defaults to « YS-JUL ».
Returns¶
- xr.Dataset
Dataset with two DataArrays (mask_spring and mask_summer), with values of 1 where the spring and summer criteria are met and 0 where they are not.
- xhydro.pmp.major_precipitation_events(da: DataArray, windows: list[int], quantile: float = 0.9)[source]¶
Get precipitation events that exceed a given quantile for a given time step accumulation. Based on Clavet-Gaumont et al. (2017).
Parameters¶
- da: xr.DataArray
DataArray containing the precipitation values.
- windows: list of int
List of the number of time steps to accumulate precipitation.
- quantile: float
Threshold that limits the events to those that exceed this quantile. Defaults to 0.9.
Returns¶
- xr.DataArray
Masked DataArray containing the major precipitation events.
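The selection of major events can be sketched for one grid cell and one accumulation window. Several details are assumptions made for illustration only: the accumulation is a trailing sum, the quantile uses a nearest-rank rule over all accumulated values, and events must be strictly above the threshold. The function names and the precipitation series are hypothetical.

```python
def accumulate(values, window):
    """Trailing sum over `window` time steps (None until the window is full)."""
    return [
        sum(values[i - window + 1 : i + 1]) if i >= window - 1 else None
        for i in range(len(values))
    ]


def major_events(values, window, quantile=0.9):
    """Mask every accumulation that does not exceed the quantile threshold."""
    acc = accumulate(values, window)
    valid = sorted(v for v in acc if v is not None)
    threshold = valid[round(quantile * (len(valid) - 1))]
    return [v if v is not None and v > threshold else None for v in acc]


precip = [0, 1, 0, 2, 0, 3, 0, 4, 0, 5, 20]  # hypothetical precipitation per time step
events_1 = major_events(precip, window=1)
events_2 = major_events(precip, window=2)
```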
Notes¶
- xhydro.pmp.precipitable_water(hus: DataArray, zg: DataArray, orog: DataArray, windows: list[int] | None = None, beta_func: bool = True, add_pre_lay: bool = False)[source]¶
Compute precipitable water based on Clavet-Gaumont et al. (2017) and Rousseau et al. (2014).
Parameters¶
- hus: xr.DataArray
Specific humidity. Must have a pressure level (plev) dimension.
- zg: xr.DataArray
Geopotential height. Must have a pressure level (plev) dimension.
- orog: xr.DataArray
Surface altitude.
- windows: list of int, optional
Duration of the event in time steps. Defaults to [1]. Note that an additional time step will be added to the window size to account for antecedent conditions.
- beta_func: bool, optional
If True, use the beta function proposed by Boer (1982) to get the pressure layers above the topography. If False, the surface altitude is used as the lower boundary of the atmosphere assuming that the surface altitude and the geopotential height are virtually identical at low altitudes.
- add_pre_lay: bool, optional
If True, add the pressure layer between the surface and the lowest pressure level (e.g., at sea level). If False, only the pressure layers between the lowest and highest pressure level are considered.
Returns¶
- xr.DataArray
Precipitable water.
Notes¶
1) The precipitable water of an event is defined as the maximum precipitable water found during the entire duration of the event, extending up to one time step before the start of the event. Thus, the rolling operation made using windows is a maximum, not a sum.
2) beta_func = True and add_pre_lay = False follow Clavet-Gaumont et al. (2017) and Rousseau et al. (2014).
https://doi.org/10.1016/j.ejrh.2017.07.003
https://doi.org/10.1016/j.jhydrol.2014.10.053
https://doi.org/10.1175/1520-0493(1982)110<1801:DEIIC>2.0.CO;2
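The rolling maximum described in the first note can be sketched as follows. The window is lengthened by one time step for the antecedent conditions; placing the window so that it ends at the current time step is an assumption, and event_precipitable_water is a hypothetical helper.

```python
def event_precipitable_water(pw, window):
    """Maximum over the event duration plus one antecedent time step."""
    span = window + 1  # one extra time step before the start of the event
    return [
        max(pw[i - span + 1 : i + 1]) if i >= span - 1 else None
        for i in range(len(pw))
    ]


pw = [10.0, 30.0, 20.0, 15.0, 25.0]  # hypothetical precipitable water series
one_step_events = event_precipitable_water(pw, window=1)
two_step_events = event_precipitable_water(pw, window=2)
```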
- xhydro.pmp.precipitable_water_100y(pw: DataArray, dist: str, method: str, mf: float = 0.2, rebuild_time: bool = True)[source]¶
Compute the 100-year return period of precipitable water for each month. Based on Clavet-Gaumont et al. (2017).
Parameters¶
- pw: xr.DataArray
DataArray containing the precipitable water.
- dist: str
Probability distributions.
- method: {« ML » or « MLE », « MM », « PWM », « APP »}
Fitting method, either maximum likelihood (ML or MLE), method of moments (MM) or approximate method (APP). Can also be the probability weighted moments (PWM), also called L-Moments, if a compatible dist object is passed.
- mf: float
Maximum majoration factor of the 100-year event compared to the maximum of the timeseries. Used as an upper limit for the frequency analysis.
- rebuild_time: bool
Whether or not to reconstruct a timeseries with the same time dimensions as pw.
Returns¶
- xr.DataArray
Precipitable water for a 100-year return period.
Notes¶
- xhydro.pmp.spatial_average_storm_configurations(da, radius)[source]¶
Compute the spatial average for different storm configurations proposed by Clavet-Gaumont et al. (2017).
Parameters¶
- da: xr.DataArray
DataArray containing the precipitation values.
- radius: float
Maximum radius of the storm.
Returns¶
- xr.Dataset
Dataset containing the spatial averages for all the storm configurations. The y and x coordinates indicate the location of the storm. This location is determined by n//2, where n is the total number of cells for both the rows and columns in the configuration, and // represents floor division.
Notes¶
xhydro.utils module¶
Utility functions for xhydro.
- xhydro.utils.health_checks(ds: Dataset | DataArray, *, structure: dict | None = None, calendar: str | None = None, start_date: str | None = None, end_date: str | None = None, variables_and_units: dict | None = None, cfchecks: dict | None = None, freq: str | None = None, missing: dict | str | list | None = None, flags: dict | None = None, flags_kwargs: dict | None = None, return_flags: bool = False, raise_on: list | None = None) None | Dataset [source]¶
Perform a series of health checks on the dataset. Be aware that missing data checks and flag checks can be slow.
Parameters¶
- ds: xr.Dataset or xr.DataArray
Dataset to check.
- structure: dict, optional
Dictionary with keys « dims » and « coords » containing the expected dimensions and coordinates. This check will fail if extra dimensions or coordinates are found.
- calendar: str, optional
Expected calendar. Synonyms should be detected correctly (e.g. « standard » and « gregorian »).
- start_date: str, optional
To check if the dataset starts at least at this date.
- end_date: str, optional
To check if the dataset ends at least at this date.
- variables_and_units: dict, optional
Dictionary containing the expected variables and units.
- cfchecks: dict, optional
Dictionary where the key is the variable to check and the values are the cfchecks. The cfchecks themselves must be a dictionary with the keys being the cfcheck names and the values being the arguments to pass to the cfcheck. See xclim.core.cfchecks for more details.
- freq: str, optional
Expected frequency, written as the result of xr.infer_freq(ds.time).
- missing: dict or str or list of str, optional
String, list of strings, or dictionary where the key is the method to check for missing data and the values are the arguments to pass to the method. The methods are: « missing_any », « at_least_n_valid », « missing_pct », « missing_wmo ». See xclim.core.missing() for more details.
- flags: dict, optional
Dictionary where the key is the variable to check and the values are the flags. The flags themselves must be a dictionary with the keys being the data_flags names and the values being the arguments to pass to the data_flags. If None is passed instead of a dictionary, then xclim’s default flags for the given variable are run. See xclim.core.utils.VARIABLES. See also xclim.core.dataflags.data_flags() for the list of possible flags.
- flags_kwargs: dict, optional
Additional keyword arguments to pass to the data_flags (« dims » and « freq »).
- return_flags: bool
Whether to return the Dataset created by data_flags.
- raise_on: list of str, optional
Whether to raise an error if a check fails, else there will only be a warning. The possible values are the names of the checks. Use [« all »] to raise on all checks.
Returns¶
- xr.Dataset or None
Dataset containing the flags if return_flags is True and raise_on is False for the « flags » check.
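The dictionary-based arguments can be prepared ahead of the call. The sketch below only builds the arguments; the dimension names, the variable name, its units and the options given to « missing_any » are hypothetical values chosen for illustration and should be adapted to the dataset being checked.

```python
# Expected dimensions and coordinates (hypothetical names).
structure = {"dims": ["time", "station"], "coords": ["time", "station"]}

# Expected variables and their units (hypothetical variable and units).
variables_and_units = {"streamflow": "m3 s-1"}

# Missing-data method mapped to the arguments passed to that method.
missing = {"missing_any": {"freq": "YS"}}

health_check_kwargs = {
    "structure": structure,
    "variables_and_units": variables_and_units,
    "freq": "D",
    "missing": missing,
    "raise_on": ["all"],  # raise an error on any failed check
}

# With a dataset `ds` at hand, the call would then look like:
# xhydro.utils.health_checks(ds, **health_check_kwargs)
```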
- xhydro.utils.merge_attributes(attribute: str, *inputs_list: DataArray | Dataset, new_line: str = '\n', missing_str: str | None = None, **inputs_kws: DataArray | Dataset) str [source]¶
Merge attributes from several DataArrays or Datasets.
If more than one input is given, its name (if available) is prepended as: « <input name> : <input attribute> ».
Parameters¶
- attribute: str
The attribute to merge.
- *inputs_list: xr.DataArray or xr.Dataset
The datasets or variables that were used to produce the new object. Inputs given that way will be prefixed by their name attribute if available.
- new_line: str
The character to put between each instance of the attributes. Usually, in CF-conventions, the history attribute uses “\n” while cell_methods uses “ ”.
- missing_str: str
A string that is printed if an input doesn’t have the attribute. Defaults to None, in which case the input is simply skipped.
- **inputs_kws: xr.DataArray or xr.Dataset
Mapping from names to the datasets or variables that were used to produce the new object. Inputs given that way will be prefixed by the passed name.
Returns¶
- str
The new attribute made from the combination of the ones from all the inputs.
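The merging rule can be sketched with plain dictionaries standing in for the attributes of each input. This is an illustration of the documented behaviour, not the library code: merge_attribute is a hypothetical helper, the input names are made up, and the exact spacing around the colon may differ in practice.

```python
def merge_attribute(attribute, inputs, new_line="\n", missing_str=None):
    """Merge one attribute from several inputs (mapping of name -> attrs dict)."""
    lines = []
    for name, attrs in inputs.items():
        if attribute in attrs:
            value = attrs[attribute]
        elif missing_str is not None:
            value = missing_str  # placeholder for inputs lacking the attribute
        else:
            continue  # inputs without the attribute are skipped
        # The input name is only prepended when more than one input is given.
        lines.append(f"{name} : {value}" if len(inputs) > 1 else str(value))
    return new_line.join(lines)


inputs = {"q_obs": {"history": "read from file"}, "q_sim": {}}
skipped = merge_attribute("history", inputs)
filled = merge_attribute("history", inputs, missing_str="n/a")
single = merge_attribute("history", {"q_obs": {"history": "read from file"}})
```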
- xhydro.utils.update_history(hist_str: str, *inputs_list: DataArray | Dataset, new_name: str | None = None, **inputs_kws: DataArray | Dataset) str [source]¶
Return a history string with the timestamped message and the combination of the history of all inputs.
The new history entry is formatted as « [<timestamp>] <new_name>: <hist_str> - xhydro version: <xhydro.__version__>. »
Parameters¶
- hist_str: str
The string describing what has been done on the data.
- *inputs_list: xr.DataArray or xr.Dataset
The datasets or variables that were used to produce the new object. Inputs given that way will be prefixed by their « name » attribute if available.
- new_name: str, optional
The name of the newly created variable or dataset to prefix hist_str.
- **inputs_kws: xr.DataArray or xr.Dataset
Mapping from names to the datasets or variables that were used to produce the new object. Inputs given that way will be prefixed by the passed name.
Returns¶
- str
The combined history of all inputs, starting with hist_str.
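The layout of a new history entry can be sketched with a small formatter. The timestamp format and the version string are assumptions made for illustration, and history_entry is a hypothetical helper; in the library, this entry is followed by the merged history of the inputs.

```python
from datetime import datetime


def history_entry(hist_str, new_name, version, now=None):
    """Format one entry as '[<timestamp>] <new_name>: <hist_str> - xhydro version: <version>.'"""
    stamp = (now or datetime.now()).strftime("%Y-%m-%d %H:%M:%S")
    return f"[{stamp}] {new_name}: {hist_str} - xhydro version: {version}."


entry = history_entry(
    "Computed annual maximum",
    new_name="q_max",  # hypothetical variable name
    version="0.0.0",  # placeholder, not an actual release number
    now=datetime(2024, 1, 2, 3, 4, 5),
)
```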