xhydro.frequency_analysis package¶
Frequency analysis module.
Submodules¶
xhydro.frequency_analysis.local module¶
Local frequency analysis functions and utilities.
- xhydro.frequency_analysis.local.criteria(ds: Dataset, p: Dataset) Dataset [source]¶
Compute information criteria (AIC, BIC, AICC) from fitted distributions, using the log-likelihood.
Parameters¶
- dsxr.Dataset
Dataset containing the yearly data that was fitted.
- pxr.Dataset
Dataset containing the parameters of the fitted distributions. Must have a dimension dparams containing the parameter names and a dimension scipy_dist containing the distribution names.
Returns¶
- xr.Dataset
Dataset containing the information criteria for the distributions.
- xhydro.frequency_analysis.local.fit(ds, *, distributions: list[str] | None = None, min_years: int | None = None, method: str = 'ML', periods: list[str] | list[list[str]] | None = None) Dataset [source]¶
Fit multiple distributions to data.
Parameters¶
- dsxr.Dataset
Dataset containing the data to fit. All variables will be fitted.
- distributionslist of str, optional
List of distribution names as defined in scipy.stats. See https://docs.scipy.org/doc/scipy/reference/stats.html#continuous-distributions. Defaults to [« genextreme », « pearson3 », « gumbel_r », « expon »].
- min_yearsint, optional
Minimum number of years required for a distribution to be fitted.
- methodstr
Fitting method. Defaults to « ML » (maximum likelihood).
- periodslist of str or list of list of str, optional
Either [start, end] or list of [start, end] of periods to be considered. If multiple periods are given, the output will have a horizon dimension. If None, all data is used.
Returns¶
- xr.Dataset
Dataset containing the parameters of the fitted distributions, with a new dimension scipy_dist containing the distribution names.
Notes¶
In order to combine the parameters of multiple distributions, the size of the dparams dimension is set to the maximum number of unique parameters between the distributions.
- xhydro.frequency_analysis.local.parametric_quantiles(p: Dataset, t: float | list[float], mode: str = 'max') Dataset [source]¶
Compute quantiles from fitted distributions.
Parameters¶
- pxr.Dataset
Dataset containing the parameters of the fitted distributions. Must have a dimension dparams containing the parameter names and a dimension scipy_dist containing the distribution names.
- tfloat or list of float
Return period(s) in years.
- mode{“max”, “min”}
Whether the return period is the probability of exceedance (max) or non-exceedance (min).
Returns¶
- xr.Dataset
Dataset containing the quantiles of the distributions.
xhydro.frequency_analysis.regional module¶
Provides functions for performing regional frequency analysis on hydrological data.
Key Functions: - cluster_indices: Get indices of elements with a specific cluster number. - get_groups_indices: Retrieve indices of groups from a clustering result. - get_group_from_fit: Obtain indices of groups from a fitted model. - fit_pca: Perform Principal Component Analysis (PCA) on the input dataset. - moment_l_vector: Calculate L-moments for multiple datasets. - moment_l: Compute various L-moments and L-moment ratios for a given dataset.
The module utilizes NumPy, pandas, scikit-learn, and xarray for efficient data manipulation and analysis.
- Example Usage:
import xarray as xr from regional_frequency_analysis import fit_pca, moment_l_vector
# Load your dataset ds = xr.open_dataset(“your_data.nc”)
# Perform PCA data_pca, pca_obj = fit_pca(ds, n_components=3)
# Calculate L-moments l_moments = moment_l_vector(ds.values)
This module is designed for hydrologists and data scientists working with regional frequency analysis in water resources.
- xhydro.frequency_analysis.regional.calc_h_z(ds_groups: Dataset, ds_moments_groups: Dataset, kap: object, seed: int | None = None) Dataset [source]¶
Calculate heterogeneity measure H and Z-score for regional frequency analysis.
Parameters¶
- ds_groupsxr.Dataset
Dataset containing grouped data.
- ds_moments_groupsxr.Dataset
Dataset containing L-moments for grouped data.
- kapscipy.stats.kappa3
Kappa3 distribution object.
- seedint, optional
Random seed for reproducibility. Default is None.
Returns¶
- xr.Dataset
Dataset containing calculated H values and Z-scores for each group.
Notes¶
This function applies the heterogeneity measure and Z-score calculations to grouped data for regional frequency analysis. It uses L-moments and the Kappa3 distribution in the process. This function does not support lazy evaluation. Equations are based on Hosking, J. R. M., & Wallis, J. R. (1997). Regional frequency analysis (p. 240).
- xhydro.frequency_analysis.regional.calc_moments(ds: Dataset) Dataset [source]¶
Calculate L-moments for multiple stations.
Parameters¶
- dsxr.Dataset
A vector of stations, where each element is an array-like object containing the data for which to calculate L-moments.
Returns¶
- xr.Dataset
L-moment dataset with a new lmom dimension.
Notes¶
NaN values in each station are removed before calculating L-moments. The function uses the moment_l function to calculate L-moments for each individual stations. Equations are based on Hosking, J. R. M., & Wallis, J. R. (1997). Regional frequency analysis (p. 240).
- xhydro.frequency_analysis.regional.calculate_rp_from_afr(ds_groups: Dataset, ds_moments_groups: Dataset, rp: array, l1: DataArray | None = None) DataArray [source]¶
Calculate return periods from Annual Flow Regime (AFR) analysis.
Parameters¶
- ds_groupsxr.Dataset
Dataset containing grouped flow data.
- ds_moments_groupsxr.Dataset
Dataset containing L-moments for grouped data.
- rparray-like
Return periods to calculate.
- l1xr.DataArray, optional
First L-moment (location) values. L-moment can be specified for ungauged catchments. If None, values are taken from ds_moments_groups.
Returns¶
- xr.DataArray
Calculated return periods for each group and specified return period.
Notes¶
This function calculates return periods using the Annual Flow Regime method. If l1 is not provided, it uses the first L-moment from ds_moments_groups. The function internally calls calculate_ic_from_AFR to compute the flood index. Equations are based on Hosking, J. R. M., & Wallis, J. R. (1997). Regional frequency analysis (p. 240).
- xhydro.frequency_analysis.regional.cluster_indices(clust_num: int | ndarray, labels_array: int | ndarray) ndarray [source]¶
Get the indices of elements with a specific cluster number using NumPy.
Parameters¶
- clust_numnumpy.ndarray or int
Cluster number to find indices for.
- labels_arraynumpy.ndarray or int
Array containing cluster labels.
Returns¶
- numpy.ndarray
Indices of elements with the specified cluster number.
- xhydro.frequency_analysis.regional.fit_pca(ds: Dataset, **kwargs) tuple [source]¶
Perform Principal Component Analysis (PCA) on the input dataset.
This function scales the input data, applies PCA transformation, and returns the transformed data along with the PCA object.
Parameters¶
- dsxr.Dataset
Input dataset to perform PCA on.
- **kwargsdict
Additional keyword arguments to pass to the PCA constructor.
Returns¶
- tuple: A tuple containing:
data_pca (xr.DataArray): PCA-transformed data with “Station” and “components” as coordinates.
obj_pca (sklearn.decomposition.PCA): Fitted PCA object.
Notes¶
The input data is scaled before PCA is applied.
The number of components in the output depends on the n_components parameter passed to PCA.
- xhydro.frequency_analysis.regional.get_group_from_fit(model: Callable, param: dict, sample: Dataset | DataArray) list [source]¶
Get indices of groups from a fit using the specified model and parameters.
Parameters¶
- modelcallable
Model class or instance with a fit method.
- paramdict
Parameters for the model.
- samplexr.Dataset or xr.DataArray
Data sample to fit the model.
Returns¶
- list :
List of indices for each non-excluded group.
- xhydro.frequency_analysis.regional.get_groups_indices(cluster: list, sample: Dataset) list [source]¶
Get indices of groups from a clustering result, excluding the group labeled -1.
Parameters¶
- clusterlist
Clustering result with labels attribute.
- samplexr.Dataset
Data sample to fit the model.
Returns¶
- list
Indices for each non-excluded group.
- xhydro.frequency_analysis.regional.group_ds(ds: Dataset, groups: list) Dataset [source]¶
Group a dataset by a list of groups.
Parameters¶
- dsxr.Dataset
The input dataset to be grouped.
- groupslist
A list of groups to be used for grouping the dataset.
Returns¶
- xr.Dataset
A new dataset with the grouped data.
- xhydro.frequency_analysis.regional.mask_h_z(ds: Dataset, thresh_h: float | None = 1, thresh_z: float | None = 1.64) DataArray [source]¶
Create a boolean mask based on heterogeneity measure H and Z-score thresholds.
Parameters¶
- dsxr.Dataset
Dataset containing H and Z values for each group.
- thresh_hfloat, optional
Threshold for the heterogeneity measure H. Default is 1.
- thresh_zfloat, optional
Threshold for the absolute Z-score. Default is 1.64.
Returns¶
- xr.DataArray
Boolean mask where True indicates groups that meet both threshold criteria.
- xhydro.frequency_analysis.regional.remove_small_regions(ds: Dataset, thresh: int = 5) Dataset [source]¶
Remove regions from the dataset that have fewer than the threshold number of stations.
Parameters¶
- dsxr.Dataset
The input dataset containing regions and stations.
- threshint, optional
The minimum number of stations required for a region to be kept. Default is 5.
Returns¶
- xr.Dataset
The dataset with small regions removed.
xhydro.frequency_analysis.uncertainities module¶
Uncertainty analysis module for hydrological frequency analysis.
This module provides functions for bootstrap sampling, distribution fitting, and uncertainty estimation in regional frequency analysis. It includes tools for generating bootstrap samples from observed data and fitted distributions, calculating L-moments, and estimating quantiles with uncertainty bounds.
- Functions:
boostrap_obs: Generate bootstrap samples from observed data. boostrap_dist: Generate bootstrap samples from a fitted distribution. fit_boot_dist: Fit distributions to bootstrap samples. calc_moments_iter: Calculate L-moments for each bootstrap sample. calc_q_iter: Calculate quantiles for each bootstrap sample and group. generate_combinations: Generate combinations of indices for sensitivity analysis.
- xhydro.frequency_analysis.uncertainities.boostrap_dist(ds_obs: Dataset, ds_params: Dataset, n_samples: int) Dataset [source]¶
Generate bootstrap samples from a fitted distribution.
Parameters¶
- ds_obsxarray.Dataset
The observed data.
- ds_paramsxarray.Dataset
The fitted distribution parameters.
- n_samplesint
The number of bootstrap samples to generate.
Returns¶
- xarray.Dataset
Bootstrap samples with dimensions [samples, time].
Notes¶
This function does not support lazy evaluation.
- xhydro.frequency_analysis.uncertainities.boostrap_obs(obs: DataArray, n_samples: int, seed: int | None = None) DataArray [source]¶
Generate bootstrap samples from observed data.
Parameters¶
- obsxarray.DataArray
The observed data to bootstrap.
- n_samplesint
The number of bootstrap samples to generate.
- seedint, optional
Seed for the random number generator.
Returns¶
- xarray.DataArray
Bootstrap samples with dimensions [samples, time].
- xhydro.frequency_analysis.uncertainities.calc_moments_iter(ds_samples: Dataset) Dataset [source]¶
Calculate L-moments for each bootstrap sample.
Parameters¶
- ds_samplesxarray.Dataset
The bootstrap samples.
Returns¶
- xarray.Dataset
L-moments for each bootstrap sample.
- xhydro.frequency_analysis.uncertainities.calc_q_iter(bv: str, var: str, ds_groups: Dataset, ds_moments_iter: Dataset, return_periods: array, small_regions_threshold: int | None = 5) DataArray [source]¶
Calculate quantiles for each bootstrap sample and group.
Parameters¶
- bvstr
The basin identifier.
- varstr
The variable name.
- ds_groupsxarray.Dataset
The grouped data.
- ds_moments_iterxarray.Dataset
The L-moments for each bootstrap sample.
- return_periodsarray-like
The return periods to calculate quantiles for.
- small_regions_thresholdint, optional
The threshold for removing small regions. Default is 5.
Returns¶
- xarray.DataArray
Quantiles for each bootstrap sample and group.