4. Frequency analysis module

[1]:

# Basic imports
import hvplot.xarray
import numpy as np
import xarray as xr
import xdatasets as xd

import xhydro as xh
import xhydro.frequency_analysis as xhfa


4.3. Local frequency analysis

Once we have our annual maxima (or volumes/minima), the first step of a local frequency analysis is to call xhfa.local.fit to obtain the distribution parameters. The options are as follows:

distributions: a list of SciPy distributions. Defaults to: ["expon", "gamma", "genextreme", "genpareto", "gumbel_r", "pearson3", "weibull_min"].
min_years: the minimum number of years required to fit the data.
method: the fitting method. Defaults to maximum likelihood.

[13]:

# To speed up the Notebook, we'll only perform the analysis on a subset of variables
params = xhfa.local.fit(
    ds_4fa[["streamflow_max_spring", "volume_sum_spring"]], min_years=15
)

params

[13]:

<xarray.Dataset> Size: 4kB
Dimensions:                (id: 5, dparams: 5, scipy_dist: 7)
Coordinates: (12/16)
  * id                     (id) object 40B '020302' '020404' ... '020802'
  * dparams                (dparams) <U5 100B 'a' 'c' 'skew' 'loc' 'scale'
  * scipy_dist             (scipy_dist) <U11 308B 'expon' ... 'weibull_min'
    drainage_area          (id) float32 20B 1.09e+03 647.0 59.8 626.0 1.2e+03
    end_date               (id) datetime64[ns] 40B 2006-10-13 ... 1996-08-13
    latitude               (id) float32 20B 48.77 48.81 48.98 48.98 49.2
    ...                     ...
    source                 <U102 408B 'Ministère de l’Environnement, de la Lu...
    spatial_agg            <U9 36B 'watershed'
    start_date             (id) datetime64[ns] 40B 1989-08-12 ... 1970-01-01
    time_agg               <U4 16B 'mean'
    timestep               <U1 4B 'D'
    variable               <U10 40B 'streamflow'
Data variables:
    streamflow_max_spring  (scipy_dist, id, dparams) float64 1kB dask.array<chunksize=(1, 5, 5), meta=np.ndarray>
    volume_sum_spring      (scipy_dist, id, dparams) float64 1kB dask.array<chunksize=(1, 5, 5), meta=np.ndarray>
Attributes:
    cat:frequency:         yr
    cat:processing_level:  indicators
    cat:id:
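
To make the role of these options more concrete, here is a minimal sketch of what a single-station fit amounts to, written directly against scipy.stats rather than xhydro. The synthetic series, the helper name fit_if_enough_years, and the choice of two distributions are assumptions made for illustration; the internals of xhfa.local.fit may differ.

```python
import numpy as np
from scipy import stats

# Synthetic "annual maxima" for one hypothetical station (illustration only).
rng = np.random.default_rng(42)
annual_max = rng.gumbel(loc=100.0, scale=25.0, size=30)


def fit_if_enough_years(values, dist_names, min_years=15):
    """Fit each SciPy distribution by maximum likelihood.

    Returns None for every distribution when fewer than min_years
    valid values are available. Hypothetical helper, not part of xhydro.
    """
    values = np.asarray(values, dtype=float)
    values = values[~np.isnan(values)]
    if values.size < min_years:
        return {name: None for name in dist_names}
    # SciPy's .fit() returns (shape parameters..., loc, scale)
    return {name: getattr(stats, name).fit(values) for name in dist_names}


fitted = fit_if_enough_years(annual_max, ["gumbel_r", "genextreme"])
for name, p in fitted.items():
    print(name, np.round(p, 2))
```

The number of fitted parameters varies by distribution (two for gumbel_r, three for genextreme), which is consistent with the dparams dimension in the output above holding several parameter names.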

Information Criteria such as the AIC, BIC, and AICC are useful to determine which statistical distribution is better suited to a given location. These three criteria can be computed using xhfa.local.criteria.

[14]:

criteria = xhfa.local.criteria(
    ds_4fa[["streamflow_max_spring", "volume_sum_spring"]], params
)

criteria

[14]:

<xarray.Dataset> Size: 3kB
Dimensions:                (id: 5, scipy_dist: 7, criterion: 3)
Coordinates: (12/16)
    drainage_area          (id) float32 20B 1.09e+03 647.0 59.8 626.0 1.2e+03
    end_date               (id) datetime64[ns] 40B 2006-10-13 ... 1996-08-13
  * id                     (id) object 40B '020302' '020404' ... '020802'
    latitude               (id) float32 20B 48.77 48.81 48.98 48.98 49.2
    longitude              (id) float32 20B -64.52 -64.92 -64.43 -64.7 -65.29
    name                   (id) object 40B 'Saint' 'York' ... 'Madeleine'
    ...                     ...
    start_date             (id) datetime64[ns] 40B 1989-08-12 ... 1970-01-01
    time_agg               <U4 16B 'mean'
    timestep               <U1 4B 'D'
    variable               <U10 40B 'streamflow'
  * scipy_dist             (scipy_dist) <U11 308B 'expon' ... 'weibull_min'
  * criterion              (criterion) <U4 48B 'aic' 'bic' 'aicc'
Data variables:
    streamflow_max_spring  (scipy_dist, id, criterion) float64 840B dask.array<chunksize=(1, 5, 3), meta=np.ndarray>
    volume_sum_spring      (scipy_dist, id, criterion) float64 840B dask.array<chunksize=(1, 5, 3), meta=np.ndarray>
Attributes:
    cat:frequency:         yr
    cat:processing_level:  indicators
    cat:id:
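
For reference, the three criteria can be sketched from their textbook definitions, given a maximized log-likelihood, the number of fitted parameters, and the sample size. The function name and the numeric inputs below are hypothetical, and the text does not confirm that xhydro computes the criteria in exactly this way.

```python
import math


def information_criteria(log_likelihood, n_params, n_obs):
    """Return AIC, BIC and AICc from a maximized log-likelihood."""
    aic = 2 * n_params - 2 * log_likelihood
    bic = n_params * math.log(n_obs) - 2 * log_likelihood
    # Small-sample correction; only defined when n_obs > n_params + 1
    aicc = aic + (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)
    return {"aic": aic, "bic": bic, "aicc": aicc}


# Hypothetical values: 30 annual maxima, a 2-parameter and a 3-parameter fit
two_param = information_criteria(log_likelihood=-100.0, n_params=2, n_obs=30)
three_param = information_criteria(log_likelihood=-99.5, n_params=3, n_obs=30)
print(two_param)
print(three_param)
```

For all three criteria, lower values indicate a better trade-off between goodness of fit and number of parameters. In this made-up case, the small gain in log-likelihood does not offset the extra parameter.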

Finally, the quantiles associated with given return periods can be obtained using xhfa.local.parametric_quantiles. The options are as follows:

t: the return period(s) in years.
mode: whether the return period is the probability of exceedance ("max") or non-exceedance ("min"). Defaults to "max".

[15]:

rp = xhfa.local.parametric_quantiles(params, t=[20, 100])

rp.load()
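
The link between t, mode, and the resulting quantile can be sketched as follows. A return period is first converted to a non-exceedance probability, which is then passed to the inverse CDF of the fitted distribution. The Gumbel distribution is used here only because its inverse CDF has a closed form; the function names and parameter values are hypothetical and do not come from xhydro.

```python
import math


def non_exceedance_probability(t, mode="max"):
    """Convert a return period in years to a non-exceedance probability."""
    if mode == "max":
        # The value is exceeded on average once every t years
        return 1.0 - 1.0 / t
    if mode == "min":
        # The value is not exceeded on average once every t years
        return 1.0 / t
    raise ValueError("mode must be 'max' or 'min'")


def gumbel_quantile(p, loc, scale):
    """Inverse CDF of the (right-skewed) Gumbel distribution."""
    return loc - scale * math.log(-math.log(p))


# Hypothetical Gumbel parameters for one station
loc, scale = 100.0, 25.0
for t in (20, 100):
    p = non_exceedance_probability(t, mode="max")
    print(t, round(p, 3), round(gumbel_quantile(p, loc, scale), 1))
```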

In a future release, plotting will be handled by a proper function. For now, we’ll show an example in this Notebook using preliminary utilities.

xhfa.local._prepare_plots generates datapoints required to plot the results of the frequency analysis. If log=True, it will return log-spaced x values between xmin and xmax.

[16]:

data = xhfa.local._prepare_plots(params, xmin=1, xmax=1000, npoints=50, log=True)
data.load()
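
As an illustration of what log-spaced x values means here, the sketch below generates 50 return periods evenly spaced in log10 between 1 and 1000. This is a standalone assumption about the spacing, written with the standard library only; it is not taken from the implementation of _prepare_plots.

```python
import math


def log_spaced(xmin, xmax, npoints):
    """Return npoints values evenly spaced in log10 between xmin and xmax."""
    lo, hi = math.log10(xmin), math.log10(xmax)
    return [10 ** (lo + (hi - lo) * i / (npoints - 1)) for i in range(npoints)]


return_periods = log_spaced(1, 1000, 50)
print(len(return_periods), return_periods[0], return_periods[-1])
```

Log spacing puts as many points between 1 and 10 years as between 100 and 1000 years, which keeps the fitted curves smooth on a logarithmic x axis.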

xhfa.local._get_plotting_positions computes plotting positions for all variables in the dataset. It accepts alpha and beta arguments; see the SciPy documentation for typical values. By default, (0.4, 0.4) is used, which gives approximately quantile-unbiased positions (Cunnane).

[17]:

pp = xhfa.local._get_plotting_positions(ds_4fa[["streamflow_max_spring"]])
pp

[18]:

# Let's plot the fitted distributions
p1 = data.streamflow_max_spring.hvplot(
    x="return_period", by="scipy_dist", grid=True, groupby=["id"], logx=True
)

[19]:

# Let's now plot the observations at their plotting positions
p2 = pp.hvplot.scatter(
    x="streamflow_max_spring_pp",
    y="streamflow_max_spring",
    grid=True,
    groupby=["id"],
    logx=True,
)

[20]:

# And now combining the plots
p1 * p2

[20]:
