xhydro.frequency_analysis package

Frequency analysis module.

Submodules

xhydro.frequency_analysis.local module

Local frequency analysis functions and utilities.

xhydro.frequency_analysis.local.criteria(ds: Dataset, p: Dataset) Dataset[source]

Compute information criteria (AIC, BIC, AICC) from fitted distributions, using the log-likelihood.

Parameters

dsxr.Dataset

Dataset containing the yearly data that was fitted.

pxr.Dataset

Dataset containing the parameters of the fitted distributions. Must have a dimension dparams containing the parameter names and a dimension scipy_dist containing the distribution names.

Returns

xr.Dataset

Dataset containing the information criteria for the distributions.

xhydro.frequency_analysis.local.fit(ds, *, distributions: list[str] | None = None, min_years: int | None = None, method: str = 'ML', periods: list[str] | list[list[str]] | None = None) Dataset[source]

Fit multiple distributions to data.

Parameters

dsxr.Dataset

Dataset containing the data to fit. All variables will be fitted.

distributionslist of str, optional

List of distribution names as defined in scipy.stats. See https://docs.scipy.org/doc/scipy/reference/stats.html#continuous-distributions. Defaults to [« genextreme », « pearson3 », « gumbel_r », « expon »].

min_yearsint, optional

Minimum number of years required for a distribution to be fitted.

methodstr

Fitting method. Defaults to « ML » (maximum likelihood).

periodslist of str or list of list of str, optional

Either [start, end] or list of [start, end] of periods to be considered. If multiple periods are given, the output will have a horizon dimension. If None, all data is used.

Returns

xr.Dataset

Dataset containing the parameters of the fitted distributions, with a new dimension scipy_dist containing the distribution names.

Notes

In order to combine the parameters of multiple distributions, the size of the dparams dimension is set to the maximum number of unique parameters between the distributions.

xhydro.frequency_analysis.local.parametric_quantiles(p: Dataset, t: float | list[float], mode: str = 'max') Dataset[source]

Compute quantiles from fitted distributions.

Parameters

pxr.Dataset

Dataset containing the parameters of the fitted distributions. Must have a dimension dparams containing the parameter names and a dimension scipy_dist containing the distribution names.

tfloat or list of float

Return period(s) in years.

mode{“max”, “min”}

Whether the return period is the probability of exceedance (max) or non-exceedance (min).

Returns

xr.Dataset

Dataset containing the quantiles of the distributions.

xhydro.frequency_analysis.regional module

Provides functions for performing regional frequency analysis on hydrological data.

Key Functions: - cluster_indices: Get indices of elements with a specific cluster number. - get_groups_indices: Retrieve indices of groups from a clustering result. - get_group_from_fit: Obtain indices of groups from a fitted model. - fit_pca: Perform Principal Component Analysis (PCA) on the input dataset. - moment_l_vector: Calculate L-moments for multiple datasets. - moment_l: Compute various L-moments and L-moment ratios for a given dataset.

The module utilizes NumPy, pandas, scikit-learn, and xarray for efficient data manipulation and analysis.

Example Usage:

import xarray as xr from regional_frequency_analysis import fit_pca, moment_l_vector

# Load your dataset ds = xr.open_dataset(“your_data.nc”)

# Perform PCA data_pca, pca_obj = fit_pca(ds, n_components=3)

# Calculate L-moments l_moments = moment_l_vector(ds.values)

This module is designed for hydrologists and data scientists working with regional frequency analysis in water resources.

xhydro.frequency_analysis.regional.calc_h_z(ds_groups: Dataset, ds_moments_groups: Dataset, kap: object, seed: int | None = None) Dataset[source]

Calculate heterogeneity measure H and Z-score for regional frequency analysis.

Parameters

ds_groupsxr.Dataset

Dataset containing grouped data.

ds_moments_groupsxr.Dataset

Dataset containing L-moments for grouped data.

kapscipy.stats.kappa3

Kappa3 distribution object.

seedint, optional

Random seed for reproducibility. Default is None.

Returns

xr.Dataset

Dataset containing calculated H values and Z-scores for each group.

Notes

This function applies the heterogeneity measure and Z-score calculations to grouped data for regional frequency analysis. It uses L-moments and the Kappa3 distribution in the process. This function does not support lazy evaluation. Equations are based on Hosking, J. R. M., & Wallis, J. R. (1997). Regional frequency analysis (p. 240).

xhydro.frequency_analysis.regional.calc_moments(ds: Dataset) Dataset[source]

Calculate L-moments for multiple stations.

Parameters

dsxr.Dataset

A vector of stations, where each element is an array-like object containing the data for which to calculate L-moments.

Returns

xr.Dataset

L-moment dataset with a new lmom dimension.

Notes

NaN values in each station are removed before calculating L-moments. The function uses the moment_l function to calculate L-moments for each individual stations. Equations are based on Hosking, J. R. M., & Wallis, J. R. (1997). Regional frequency analysis (p. 240).

xhydro.frequency_analysis.regional.calculate_rp_from_afr(ds_groups: Dataset, ds_moments_groups: Dataset, rp: array, l1: DataArray | None = None) DataArray[source]

Calculate return periods from Annual Flow Regime (AFR) analysis.

Parameters

ds_groupsxr.Dataset

Dataset containing grouped flow data.

ds_moments_groupsxr.Dataset

Dataset containing L-moments for grouped data.

rparray-like

Return periods to calculate.

l1xr.DataArray, optional

First L-moment (location) values. L-moment can be specified for ungauged catchments. If None, values are taken from ds_moments_groups.

Returns

xr.DataArray

Calculated return periods for each group and specified return period.

Notes

This function calculates return periods using the Annual Flow Regime method. If l1 is not provided, it uses the first L-moment from ds_moments_groups. The function internally calls calculate_ic_from_AFR to compute the flood index. Equations are based on Hosking, J. R. M., & Wallis, J. R. (1997). Regional frequency analysis (p. 240).

xhydro.frequency_analysis.regional.cluster_indices(clust_num: int | ndarray, labels_array: int | ndarray) ndarray[source]

Get the indices of elements with a specific cluster number using NumPy.

Parameters

clust_numnumpy.ndarray or int

Cluster number to find indices for.

labels_arraynumpy.ndarray or int

Array containing cluster labels.

Returns

numpy.ndarray

Indices of elements with the specified cluster number.

xhydro.frequency_analysis.regional.fit_pca(ds: Dataset, **kwargs) tuple[source]

Perform Principal Component Analysis (PCA) on the input dataset.

This function scales the input data, applies PCA transformation, and returns the transformed data along with the PCA object.

Parameters

dsxr.Dataset

Input dataset to perform PCA on.

**kwargsdict

Additional keyword arguments to pass to the PCA constructor.

Returns

tuple: A tuple containing:
  • data_pca (xr.DataArray): PCA-transformed data with “Station” and “components” as coordinates.

  • obj_pca (sklearn.decomposition.PCA): Fitted PCA object.

Notes

  • The input data is scaled before PCA is applied.

  • The number of components in the output depends on the n_components parameter passed to PCA.

xhydro.frequency_analysis.regional.get_group_from_fit(model: Callable, param: dict, sample: Dataset | DataArray) list[source]

Get indices of groups from a fit using the specified model and parameters.

Parameters

modelcallable

Model class or instance with a fit method.

paramdict

Parameters for the model.

samplexr.Dataset or xr.DataArray

Data sample to fit the model.

Returns

list :

List of indices for each non-excluded group.

xhydro.frequency_analysis.regional.get_groups_indices(cluster: list, sample: Dataset) list[source]

Get indices of groups from a clustering result, excluding the group labeled -1.

Parameters

clusterlist

Clustering result with labels attribute.

samplexr.Dataset

Data sample to fit the model.

Returns

list

Indices for each non-excluded group.

xhydro.frequency_analysis.regional.group_ds(ds: Dataset, groups: list) Dataset[source]

Group a dataset by a list of groups.

Parameters

dsxr.Dataset

The input dataset to be grouped.

groupslist

A list of groups to be used for grouping the dataset.

Returns

xr.Dataset

A new dataset with the grouped data.

xhydro.frequency_analysis.regional.mask_h_z(ds: Dataset, thresh_h: float | None = 1, thresh_z: float | None = 1.64) DataArray[source]

Create a boolean mask based on heterogeneity measure H and Z-score thresholds.

Parameters

dsxr.Dataset

Dataset containing H and Z values for each group.

thresh_hfloat, optional

Threshold for the heterogeneity measure H. Default is 1.

thresh_zfloat, optional

Threshold for the absolute Z-score. Default is 1.64.

Returns

xr.DataArray

Boolean mask where True indicates groups that meet both threshold criteria.

xhydro.frequency_analysis.regional.remove_small_regions(ds: Dataset, thresh: int = 5) Dataset[source]

Remove regions from the dataset that have fewer than the threshold number of stations.

Parameters

dsxr.Dataset

The input dataset containing regions and stations.

threshint, optional

The minimum number of stations required for a region to be kept. Default is 5.

Returns

xr.Dataset

The dataset with small regions removed.

xhydro.frequency_analysis.uncertainities module

Uncertainty analysis module for hydrological frequency analysis.

This module provides functions for bootstrap sampling, distribution fitting, and uncertainty estimation in regional frequency analysis. It includes tools for generating bootstrap samples from observed data and fitted distributions, calculating L-moments, and estimating quantiles with uncertainty bounds.

Functions:

boostrap_obs: Generate bootstrap samples from observed data. boostrap_dist: Generate bootstrap samples from a fitted distribution. fit_boot_dist: Fit distributions to bootstrap samples. calc_moments_iter: Calculate L-moments for each bootstrap sample. calc_q_iter: Calculate quantiles for each bootstrap sample and group. generate_combinations: Generate combinations of indices for sensitivity analysis.

xhydro.frequency_analysis.uncertainities.boostrap_dist(ds_obs: Dataset, ds_params: Dataset, n_samples: int) Dataset[source]

Generate bootstrap samples from a fitted distribution.

Parameters

ds_obsxarray.Dataset

The observed data.

ds_paramsxarray.Dataset

The fitted distribution parameters.

n_samplesint

The number of bootstrap samples to generate.

Returns

xarray.Dataset

Bootstrap samples with dimensions [samples, time].

Notes

This function does not support lazy evaluation.

xhydro.frequency_analysis.uncertainities.boostrap_obs(obs: DataArray, n_samples: int, seed: int | None = None) DataArray[source]

Generate bootstrap samples from observed data.

Parameters

obsxarray.DataArray

The observed data to bootstrap.

n_samplesint

The number of bootstrap samples to generate.

seedint, optional

Seed for the random number generator.

Returns

xarray.DataArray

Bootstrap samples with dimensions [samples, time].

xhydro.frequency_analysis.uncertainities.calc_moments_iter(ds_samples: Dataset) Dataset[source]

Calculate L-moments for each bootstrap sample.

Parameters

ds_samplesxarray.Dataset

The bootstrap samples.

Returns

xarray.Dataset

L-moments for each bootstrap sample.

xhydro.frequency_analysis.uncertainities.calc_q_iter(bv: str, var: str, ds_groups: Dataset, ds_moments_iter: Dataset, return_periods: array, small_regions_threshold: int | None = 5) DataArray[source]

Calculate quantiles for each bootstrap sample and group.

Parameters

bvstr

The basin identifier.

varstr

The variable name.

ds_groupsxarray.Dataset

The grouped data.

ds_moments_iterxarray.Dataset

The L-moments for each bootstrap sample.

return_periodsarray-like

The return periods to calculate quantiles for.

small_regions_thresholdint, optional

The threshold for removing small regions. Default is 5.

Returns

xarray.DataArray

Quantiles for each bootstrap sample and group.

xhydro.frequency_analysis.uncertainities.fit_boot_dist(ds: Dataset) Dataset[source]

Fit distributions to bootstrap samples.

Parameters

dsxarray.Dataset

The bootstrap samples to fit.

Returns

xarray.Dataset

Fitted distribution parameters for each bootstrap sample.

xhydro.frequency_analysis.uncertainities.generate_combinations(da: DataArray, n: int) list[source]

Generate combinations of indices omitting up to N indices.

Parameters

daxarray.DataArray

Input DataArray.

nint

Number of indices to omit in each combination.

Returns

list

List of all combinations.