Metrics¶


Base¶

class weatherbenchX.metrics.base.Statistic[source]¶

Abstract base class for statistics.

Statistics are computed for a pair of predictions/targets chunks. The resulting statistics chunks will then be averaged (potentially weighted) across chunks.

The incoming predictions/targets chunks can either be a dictionary of DataArrays or a Dataset.

For univariate metrics, a PerVariableStatistic should be implemented. Multivariate metrics have access to all variables. The output should also be a Mapping from str to xr.DataArray. In other words, the DataArray has to be named.

Statistics are required to assign their own unique name. If a statistic takes additional parameters, these should be encoded in self.unique_name so that differently parameterized instances do not collide.

Statistics should preserve dimensions that are a) required to compute binnings or weights on and b) over which the (weighted) mean is computed. These will typically be the time dimensions (if chunking is done in time) and/or the spatial/observation dimensions (if these are needed for binning or weighting). Other dimensions can be reduced.

Typically, one or more statistics are associated with a metric, which then uses the averaged statistic(s) to compute the final metric values.
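The split between statistics and metrics matters for any metric that is a nonlinear function of a mean. The following is a minimal plain-Python sketch, not the library's implementation: the chunk data and the helper name squared_error are made up for illustration. It shows that RMSE has to be computed from the averaged squared-error statistic rather than by averaging per-chunk RMSE values.

```python
import math

# Hypothetical (predictions, targets) chunks; values are illustrative only.
chunks = [
    ([1.0, 2.0, 3.0], [1.0, 1.0, 1.0]),
    ([4.0, 5.0, 6.0], [4.0, 4.0, 3.0]),
]


def squared_error(predictions, targets):
    """Statistic: element-wise squared error, keeping the sample dimension."""
    return [(p - t) ** 2 for p, t in zip(predictions, targets)]


# Step 1: compute the statistic separately for every chunk.
per_chunk = [squared_error(p, t) for p, t in chunks]

# Step 2: average the statistic over all samples of all chunks (unweighted).
all_values = [v for chunk in per_chunk for v in chunk]
mean_squared_error = sum(all_values) / len(all_values)

# Step 3: the metric is a function of the averaged statistic.
rmse = math.sqrt(mean_squared_error)

# For comparison: averaging per-chunk RMSE values gives a different number.
naive_rmse = sum(math.sqrt(sum(c) / len(c)) for c in per_chunk) / len(per_chunk)
```

Because the square root is applied only after aggregation, the result does not depend on how the data was split into chunks.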

class weatherbenchX.metrics.base.PerVariableStatistic[source]¶: Abstract base class for statistics that are computed per variable.

class weatherbenchX.metrics.base.Metric[source]¶

Abstract base class for metrics.

Metrics define one or more statistics to be computed. Their names can be chosen freely inside the metric. Before the computation of the metrics from the aggregated statistics, the unique statistic names will be renamed to the internal names. Metrics computed for each variable independently should be implemented as PerVariableMetric classes.

class weatherbenchX.metrics.base.PerVariableMetric[source]¶: Abstract base class for metrics that are computed per variable.

class weatherbenchX.metrics.base.PerVariableStatisticWithClimatology(climatology: Dataset)[source]¶

Base class for per-variable statistics with climatology.

This class provides a convenient way to compute statistics that are a function of both the prediction/target and the climatology. The climatology is aligned with the prediction/target based on the prediction’s valid_time.

Subclasses must implement the _compute_per_variable_with_aligned_climatology method, which takes the predictions, targets, and aligned climatology as arguments.

Init.

Parameters:: climatology – The climatology dataset.
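As an illustration of what aligning a climatology on valid_time can look like, here is a sketch in plain Python. It assumes the climatology is indexed by day of year and hour (as in the SEEPS example further down); the dictionary layout and the helper align_climatology are hypothetical and stand in for the xarray-based alignment the class performs.

```python
from datetime import datetime

# Hypothetical climatology keyed by (dayofyear, hour); values are illustrative.
climatology = {(1, 0): 270.0, (1, 12): 275.0, (2, 0): 271.0}

# Valid times of the predictions and the predicted values at those times.
valid_times = [datetime(2020, 1, 1, 12), datetime(2020, 1, 2, 0)]
predictions = [277.0, 270.0]


def align_climatology(climatology, valid_times):
    """Look up the climatological value matching each valid time."""
    return [climatology[(t.timetuple().tm_yday, t.hour)] for t in valid_times]


aligned = align_climatology(climatology, valid_times)

# With the climatology aligned, an anomaly-based statistic is element-wise.
squared_prediction_anomaly = [(p - c) ** 2 for p, c in zip(predictions, aligned)]
```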

Deterministic¶

Statistics¶

class weatherbenchX.metrics.deterministic.Error[source]¶: Error between predictions and targets.

class weatherbenchX.metrics.deterministic.AbsoluteError[source]¶: Absolute error between predictions and targets.

class weatherbenchX.metrics.deterministic.SquaredError[source]¶: Squared error between predictions and targets.

class weatherbenchX.metrics.deterministic.PredictionPassthrough[source]¶: Simply returns predictions.

class weatherbenchX.metrics.deterministic.TargetPassthrough[source]¶: Simply returns targets.

class weatherbenchX.metrics.deterministic.WindVectorSquaredError(u_name: Sequence[str], v_name: Sequence[str], vector_name: Sequence[str])[source]¶

Computes squared error between two wind components.

SE = (u_pred - u_target) ** 2 + (v_pred - v_target) ** 2

Init.

Parameters:

u_name – Name of the u wind component, e.g. [u_component_of_wind].
v_name – Name of the v wind component, e.g. [v_component_of_wind].
vector_name – Name to give output variable, e.g. [wind].

class weatherbenchX.metrics.deterministic.SquaredPredictionAnomaly(climatology: Dataset)[source]¶

Computes (predictions - climatology)**2.

Init.

Parameters:: climatology – The climatology dataset.

class weatherbenchX.metrics.deterministic.SquaredTargetAnomaly(climatology: Dataset)[source]¶

Computes (targets - climatology)**2.

Init.

Parameters:: climatology – The climatology dataset.

class weatherbenchX.metrics.deterministic.AnomalyCovariance(climatology: Dataset)[source]¶

Computes (predictions - climatology) * (targets - climatology).

Init.

Parameters:: climatology – The climatology dataset.

Metrics¶

class weatherbenchX.metrics.deterministic.Bias[source]¶: Mean error.

class weatherbenchX.metrics.deterministic.MAE[source]¶: Mean absolute error.

class weatherbenchX.metrics.deterministic.MSE[source]¶

Mean squared error.

Note that if applied to probability forecasts, this is the Brier Score.
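A small worked example of the Brier score remark, written in plain Python with made-up numbers: when the predictions are event probabilities and the targets are binary outcomes, the mean squared error is the Brier score.

```python
# Hypothetical forecast probabilities and observed binary outcomes.
probabilities = [0.9, 0.2, 0.7]
outcomes = [1, 0, 0]

# Mean squared error between probabilities and outcomes (the Brier score).
brier_score = sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)
```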

class weatherbenchX.metrics.deterministic.RMSE[source]¶: Root mean squared error.

class weatherbenchX.metrics.deterministic.PredictionAverage[source]¶: Average prediction values.

class weatherbenchX.metrics.deterministic.TargetAverage[source]¶: Average target values.

class weatherbenchX.metrics.deterministic.WindVectorRMSE(u_name: str | list[str], v_name: str | list[str], vector_name: str | list[str])[source]¶

Computes vector RMSE between two wind components.

Init.

Args can be a single string or a list, in which case the statistic will be computed separately for the different elements in the list. For example, u_name=[‘u_component_of_wind’, ‘10m_u_component_of_wind_10m’].

Parameters:

u_name – Name of the u wind component, e.g. u_component_of_wind.
v_name – Name of the v wind component, e.g. v_component_of_wind.
vector_name – Name to give output variable, e.g. wind.
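The relationship between the WindVectorSquaredError statistic and this metric can be sketched as follows. This is a plain-Python illustration with invented values, assuming the metric is the square root of the averaged squared-error statistic; it does not call the library.

```python
import math

# Hypothetical wind components at two points (m/s).
u_pred, v_pred = [3.0, 0.0], [4.0, 0.0]
u_target, v_target = [0.0, 0.0], [0.0, 0.0]

# Statistic: SE = (u_pred - u_target) ** 2 + (v_pred - v_target) ** 2
squared_errors = [
    (up - ut) ** 2 + (vp - vt) ** 2
    for up, ut, vp, vt in zip(u_pred, u_target, v_pred, v_target)
]

# Metric: square root of the averaged statistic.
wind_vector_rmse = math.sqrt(sum(squared_errors) / len(squared_errors))
```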

class weatherbenchX.metrics.deterministic.ACC(climatology: Dataset)[source]¶: Anomaly correlation coefficient.
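The three climatology-based statistics listed above are the ingredients of the usual (uncentered) anomaly correlation coefficient. The sketch below assumes that standard definition, ACC = mean(AnomalyCovariance) / sqrt(mean(SquaredPredictionAnomaly) * mean(SquaredTargetAnomaly)); the function name and the data are hypothetical.

```python
import math


def anomaly_correlation(predictions, targets, climatology):
    """Uncentered ACC from three averaged statistics (unweighted means)."""
    n = len(predictions)
    pred_anom = [p - c for p, c in zip(predictions, climatology)]
    targ_anom = [t - c for t, c in zip(targets, climatology)]
    covariance = sum(pa * ta for pa, ta in zip(pred_anom, targ_anom)) / n
    pred_sq = sum(pa ** 2 for pa in pred_anom) / n
    targ_sq = sum(ta ** 2 for ta in targ_anom) / n
    return covariance / math.sqrt(pred_sq * targ_sq)


# Hypothetical values at three grid points.
climatology = [1.0, 3.0, 5.0]
predictions = [2.0, 4.0, 6.0]
targets = [1.0, 5.0, 5.0]

acc = anomaly_correlation(predictions, targets, climatology)
```

A prediction identical to the targets yields an ACC of 1 under this definition.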

class weatherbenchX.metrics.deterministic.PredictionActivity(climatology: Dataset)[source]¶

Activity in predictions defined as the std dev of the prediction anomalies.

This is used e.g. by ECMWF: https://arxiv.org/abs/2307.10128

Probabilistic¶

Statistics¶

class weatherbenchX.metrics.probabilistic.CRPSSkill(ensemble_dim: str = 'number', skipna_ensemble: bool = False)[source]¶: The skill measure associated with CRPS, E|X - Y|.

class weatherbenchX.metrics.probabilistic.CRPSSpread(ensemble_dim: str = 'number', skipna_ensemble: bool = False)[source]¶: The spread measure associated with CRPS, E|X - X`|.

class weatherbenchX.metrics.probabilistic.EnsembleVariance(ensemble_dim: str = 'number', skipna_ensemble: bool = False)[source]¶: Computes the variance in the ensemble dimension.

class weatherbenchX.metrics.probabilistic.UnbiasedEnsembleMeanSquaredError(ensemble_dim: str = 'number', skipna_ensemble: bool = False)[source]¶

Computes the unbiased ensemble mean squared error.

This class estimates E(X - Y)² with no bias. This is done by subtracting the sample variance divided by n. As such, you must have n > 1 or the result will be NaN.
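The bias correction described above can be illustrated with a few lines of plain Python. The sketch assumes the sample variance uses the n - 1 denominator (which is why n > 1 is required); the ensemble values are made up.

```python
import statistics

# Hypothetical ensemble of n = 3 members and the verifying target value.
ensemble = [1.0, 2.0, 3.0]
target = 1.0
n = len(ensemble)

ensemble_mean = statistics.mean(ensemble)

# Plain squared error of the ensemble mean (biased high for finite n).
biased_squared_error = (ensemble_mean - target) ** 2

# Subtract the sample variance (n - 1 denominator) divided by n.
unbiased_squared_error = biased_squared_error - statistics.variance(ensemble) / n
```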

Metrics¶

class weatherbenchX.metrics.probabilistic.CRPSEnsemble(ensemble_dim: str = 'number', skipna_ensemble: bool = False)[source]¶

Continuous ranked probability score for an ensemble prediction.

Given ground truth scalar random variable Y, and two iid predictions X, X`, the Continuous Ranked Probability Score is defined as

CRPS = E|X - Y| - 0.5 * E|X - X`|

where E is mathematical expectation, and | ⋅ | is the absolute value. CRPS has a unique minimum when X is distributed the same as Y.

If N ensemble members are available, the ensemble mean is taken using the PWM method from [Zamo & Naveau, 2018].

So long as 2 or more ensemble members are used, the estimates of spread, skill and CRPS are unbiased at each time.

References:

[Gneiting & Raftery, 2012], Strictly Proper Scoring Rules, Prediction, and Estimation
[Zamo & Naveau, 2018], Estimation of the Continuous Ranked Probability Score with Limited Information and Applications to Ensemble Weather Forecasts.

Init.

Parameters:

ensemble_dim – Name of the ensemble dimension. Default: ‘number’.
skipna_ensemble – If True, ensemble members with NaN values will be ignored in the ensemble mean computations. Default: False.
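For a single forecast, the skill and spread terms can be estimated directly from the ensemble members. The sketch below uses the pairwise-difference form of the unbiased estimator (dividing the spread sum by n * (n - 1)), which is an assumption about how the estimate is written out rather than a copy of the library's PWM-based code; the function name and values are hypothetical.

```python
def crps_ensemble(ensemble, target):
    """Unbiased CRPS estimate for one forecast; requires >= 2 members."""
    n = len(ensemble)
    if n < 2:
        raise ValueError("At least two ensemble members are required.")
    # Skill term: E|X - Y|, estimated by the mean absolute error of members.
    skill = sum(abs(x - target) for x in ensemble) / n
    # Spread term: E|X - X`|, estimated over all ordered pairs with i != j.
    pair_sum = sum(abs(xi - xj) for xi in ensemble for xj in ensemble)
    spread = pair_sum / (n * (n - 1))
    return skill - 0.5 * spread


# Hypothetical three-member ensemble verified against a target of 4.0.
crps = crps_ensemble([1.0, 2.0, 3.0], 4.0)
```

The i = j pairs contribute zero to the sum, so including them in the double loop does not change the result.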

class weatherbenchX.metrics.probabilistic.UnbiasedEnsembleMeanRMSE[source]¶: Unbiased estimate of the ensemble mean RMSE.

class weatherbenchX.metrics.probabilistic.SpreadSkillRatio(ensemble_dim: str = 'number', skipna_ensemble: bool = False)[source]¶

Computes the (biased) spread-skill ratio.

The spread skill ratio is defined as the ensemble standard deviation divided by the RMSE of the ensemble mean.

Init.

Parameters:

ensemble_dim – Name of the ensemble dimension. Default: ‘number’.
skipna_ensemble – If True, ensemble members with NaN values will be ignored in the ensemble mean computations. Default: False.
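A minimal sketch of the definition above, in plain Python. It assumes that both the ensemble variance and the squared error of the ensemble mean are averaged over forecasts before the square roots are taken, and that the variance uses the n - 1 denominator; both are assumptions for illustration, and the data is invented.

```python
import math
import statistics

# Hypothetical forecasts: (ensemble members, target) for two valid times.
forecasts = [
    ([1.0, 3.0], 1.0),
    ([2.0, 4.0], 5.0),
]

# Averaged statistics: ensemble variance and squared error of the ensemble mean.
mean_variance = statistics.mean(statistics.variance(ens) for ens, _ in forecasts)
mean_squared_error = statistics.mean(
    (statistics.mean(ens) - target) ** 2 for ens, target in forecasts
)

# Spread-skill ratio: ensemble standard deviation over ensemble-mean RMSE.
spread_skill_ratio = math.sqrt(mean_variance) / math.sqrt(mean_squared_error)
```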

class weatherbenchX.metrics.probabilistic.UnbiasedSpreadSkillRatio(ensemble_dim: str = 'number', skipna_ensemble: bool = False)[source]¶

Computes the spread-skill ratio based on the unbiased skill estimator.

This is analogous to the regular spread skill ratio but using the unbiased estimator of the ensemble mean squared error. This is useful for estimating the spread skill ratio for differing ensemble sizes.

Note that the ratio and square root are still biased; however, this bias is negligible if the number of time points is large.

Init.

Parameters:

ensemble_dim – Name of the ensemble dimension. Default: ‘number’.
skipna_ensemble – If True, ensemble members with NaN values will be ignored in the ensemble mean computations. Default: False.

Categorical¶

Statistics¶

class weatherbenchX.metrics.categorical.TruePositives[source]¶: True positives from binary predictions and targets.

class weatherbenchX.metrics.categorical.TrueNegatives[source]¶: True negatives from binary predictions and targets.

class weatherbenchX.metrics.categorical.FalsePositives[source]¶: False positives from binary predictions and targets.

class weatherbenchX.metrics.categorical.FalseNegatives[source]¶: False negatives from binary predictions and targets.

class weatherbenchX.metrics.categorical.SEEPSStatistic(variables: Sequence[str], climatology: Dataset, dry_threshold_mm: float | Sequence[float] = 0.25, min_p1: float | Sequence[float] = 0.1, max_p1: float | Sequence[float] = 0.85)[source]¶: Computes SEEPS statistic. See metric class for details.

Metrics¶

class weatherbenchX.metrics.categorical.CSI[source]¶

Critical Success Index.

Also called Threat Score (TS).

CSI = (TP / (TP + FP + FN)).

class weatherbenchX.metrics.categorical.Accuracy[source]¶

Accuracy.

ACC = (TP + TN) / (TP + FP + FN + TN).

class weatherbenchX.metrics.categorical.Recall[source]¶

Also called True Positive Rate (TPR) or Sensitivity.

Recall = TP / (TP + FN).

class weatherbenchX.metrics.categorical.Precision[source]¶

Also called Positive Predictive Value (PPV).

Precision = TP / (TP + FP).

class weatherbenchX.metrics.categorical.F1Score[source]¶

F1 score.

F1 = 2 * Precision * Recall / (Precision + Recall) = 2 * TP / (2 * TP + FP + FN).

class weatherbenchX.metrics.categorical.FrequencyBias[source]¶

Frequency bias.

FB = PP / P = (TP + FP) / (TP + FN)
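The formulas in this section can be checked against a small contingency table. The following plain-Python sketch counts the four outcomes from made-up binary predictions and targets and evaluates each formula as written above; it does not use the library classes.

```python
# Hypothetical binary predictions and targets for six samples.
predictions = [1, 1, 0, 0, 1, 0]
targets = [1, 0, 0, 1, 1, 0]

pairs = list(zip(predictions, targets))
tp = sum(1 for p, t in pairs if p == 1 and t == 1)  # true positives
tn = sum(1 for p, t in pairs if p == 0 and t == 0)  # true negatives
fp = sum(1 for p, t in pairs if p == 1 and t == 0)  # false positives
fn = sum(1 for p, t in pairs if p == 0 and t == 1)  # false negatives

csi = tp / (tp + fp + fn)
accuracy = (tp + tn) / (tp + fp + fn + tn)
recall = tp / (tp + fn)
precision = tp / (tp + fp)
f1 = 2 * tp / (2 * tp + fp + fn)
frequency_bias = (tp + fp) / (tp + fn)
```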

class weatherbenchX.metrics.categorical.SEEPS(variables: Sequence[str], climatology: Dataset, dry_threshold_mm: float | Sequence[float] = 0.25, min_p1: float | Sequence[float] = 0.1, max_p1: float | Sequence[float] = 0.85)[source]¶

Computes Stable Equitable Error in Probability Space.

Definition in Rodwell et al. (2010): https://www.ecmwf.int/en/elibrary/76205-new-equitable-score-suitable-verifying-precipitation-nwp

Important: In most cases, the statistic will contain NaNs because of the masking of high and low p1 values. For this reason, a mask coordinate will be added to the resulting statistic to be used in combination with masked=True in the aggregator. If a mask already exists in either the predictions or targets, it will be combined with the p1 mask.

Init.

Parameters:

variables – List of precipitation variables to compute SEEPS for.
climatology – Climatology containing *_seeps_dry_fraction and *_seeps_threshold for each of the precipitation variables with dimensions dayofyear and hour, as well as latitude and longitude corresponding to the predictions/targets coordinates, see example below.
dry_threshold_mm – Values smaller than or equal to this threshold are considered dry. Unit: mm. Can be a list with one value per variable, in which case it must have the same length as variables. Default: 0.25
min_p1 – Mask out p1 values below this threshold. Can be list for each variable. Default: 0.1
max_p1 – Mask out p1 values above this threshold. Can be list for each variable. Default: 0.85

Example

>>> climatology
<xarray.Dataset> Size: 24MB
Dimensions:                                     (hour: 4, dayofyear: 366,
                                                longitude: 64, latitude: 32)
Coordinates:
  * dayofyear                                   (dayofyear) int64 3kB 1 ... 366
  * hour                                        (hour) int64 32B 0 6 12 18
  * latitude                                    (latitude) float64 256B -87.1...
  * longitude                                   (longitude) float64 512B 0.0 ...
Data variables:
    total_precipitation_6hr_seeps_dry_fraction  (hour, dayofyear, longitude, latitude) ...
    total_precipitation_6hr_seeps_threshold     (hour, dayofyear, longitude, latitude) ...

Spatial¶

Statistics¶

class weatherbenchX.metrics.spatial.SquaredFractionsError(neighborhood_size_in_pixels: int | Iterable[int], wrap_longitude: bool = False)[source]¶: Numerator of the FSS.

class weatherbenchX.metrics.spatial.SquaredPredictionFraction(neighborhood_size_in_pixels: int | Iterable[int], wrap_longitude: bool = False)[source]¶: One part of the denominator of the FSS.

class weatherbenchX.metrics.spatial.SquaredTargetFraction(neighborhood_size_in_pixels: int | Iterable[int], wrap_longitude: bool = False)[source]¶: One part of the denominator of the FSS.

Metrics¶

class weatherbenchX.metrics.spatial.FSS(neighborhood_size_in_pixels: int | Iterable[int], wrap_longitude: bool = False)[source]¶

Implementation of the Fractions Skill Score (FSS).

Assumes the input data is already binary. The FSS is defined by a square neighborhood size in pixels. On a lat-lon grid this can lead to distorted neighborhoods towards the poles.

Original paper: Roberts and Lean, 2008. https://doi.org/10.1175/2007MWR2123.1

More recent overview paper, including discussion of how to compute the FSS over multiple forecasts: https://journals.ametsoc.org/view/journals/mwre/149/10/MWR-D-18-0106.1.xml

Note that if there is no rain in the aggregated targets and predictions, the FSS is undefined (NaN).

neighborhood_size_in_pixels¶

The size of the neighborhood to use for averaging in pixels. Must be odd. Can be an integer or a list, in which case the result will have an additional dimension ‘neighborhood_size’.

Type:: int | Iterable[int]

wrap_longitude¶

If True, averaging operation wraps around longitude. Default: False.

Type:: bool
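To show how the three spatial statistics combine, here is a plain-Python sketch of the FSS for a single pair of binary fields, using FSS = 1 - sum(SquaredFractionsError) / (sum(SquaredPredictionFraction) + sum(SquaredTargetFraction)). The helper names are hypothetical, edges are zero-padded, and longitude wrapping is not implemented; how the library treats edges is not stated here, so treat that choice as an assumption.

```python
def neighborhood_fraction(field, size):
    """Fraction of ones in a size x size window around each pixel (zero-padded)."""
    if size % 2 == 0:
        raise ValueError("Neighborhood size must be odd.")
    half = size // 2
    rows, cols = len(field), len(field[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0
            for di in range(-half, half + 1):
                for dj in range(-half, half + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < rows and 0 <= jj < cols:
                        total += field[ii][jj]
            out[i][j] = total / (size * size)
    return out


def fss(predictions, targets, size):
    """Fractions Skill Score of two binary 2D fields; NaN if both are empty."""
    pred_frac = neighborhood_fraction(predictions, size)
    targ_frac = neighborhood_fraction(targets, size)
    pairs = [(p, t) for rp, rt in zip(pred_frac, targ_frac) for p, t in zip(rp, rt)]
    numerator = sum((p - t) ** 2 for p, t in pairs)
    denominator = sum(p ** 2 for p, _ in pairs) + sum(t ** 2 for _, t in pairs)
    if denominator == 0:
        return float("nan")
    return 1.0 - numerator / denominator


# Hypothetical 3 x 3 fields with one rainy pixel each, displaced by one row.
predictions = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
targets = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]

fss_pixel = fss(predictions, targets, 1)
fss_neighborhood = fss(predictions, targets, 3)
```

With a neighborhood of one pixel the displaced feature scores zero, while the 3 x 3 neighborhood rewards the near miss.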

Wrappers¶

class weatherbenchX.metrics.wrappers.InputTransform(which)[source]¶

Base class for input transformations.

Init.

Parameters:: which – Which input to apply the wrapper to. Must be one of ‘predictions’, ‘targets’, or ‘both’.

class weatherbenchX.metrics.wrappers.EnsembleMean(which, ensemble_dim='number', skipna=False)[source]¶

Compute ensemble mean.

Init.

Parameters:

which – Which input to apply the wrapper to. Must be one of ‘predictions’, ‘targets’, or ‘both’.
ensemble_dim – Name of ensemble dimension. Default: ‘number’.
skipna – If True, skip NaNs in the ensemble mean. Default: False.

class weatherbenchX.metrics.wrappers.ContinuousToBinary(which: str, threshold_value: float | Iterable[float], threshold_dim: str)[source]¶

Converts a continuous input to a binary one.

Applies x > threshold for all thresholds and concatenates along a new dimension of name threshold_dim.

Init.

Parameters:

which – Which input to apply the wrapper to. Must be one of ‘predictions’, ‘targets’, or ‘both’.
threshold_value – Threshold value or list of values.
threshold_dim – Name of dimension to use for threshold values.
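The thresholding step can be pictured with a plain-Python stand-in, where a dictionary keyed by threshold plays the role of the new threshold dimension. The function name and values are hypothetical; only the x > threshold rule comes from the description above.

```python
def continuous_to_binary(values, thresholds):
    """Apply x > threshold for every threshold; one binary list per threshold."""
    return {thr: [int(x > thr) for x in values] for thr in thresholds}


# Hypothetical precipitation amounts (mm) and two thresholds.
binary = continuous_to_binary([0.5, 2.0, 7.0], [1.0, 5.0])
```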

class weatherbenchX.metrics.wrappers.WrappedStatistic(statistic: Statistic, transform: InputTransform)[source]¶

Wraps a statistic with an input transform.

Also adds suffix to unique name.

Init.

Parameters:

statistic – Statistic object to wrap.
transform – Transform to apply to inputs.

class weatherbenchX.metrics.wrappers.WrappedMetric(metric: Metric, transforms: list[InputTransform])[source]¶

Wraps all statistics of a metric with input transforms.

Init.

Parameters:

metric – Metric to wrap.
transforms – List of input transforms to apply. The transforms will be applied in the order they are listed, i.e. the first transform in the list will be applied first.
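The order of the transforms matters because they generally do not commute. The sketch below uses two hypothetical plain-Python functions as stand-ins for EnsembleMean and ContinuousToBinary, and a small helper that applies a list of transforms in order, to show that taking the ensemble mean before thresholding differs from thresholding before averaging.

```python
def ensemble_mean(members):
    """Stand-in for an ensemble-mean transform: reduce members to one value."""
    return [sum(members) / len(members)]


def to_binary(values, threshold=0.5):
    """Stand-in for a thresholding transform: apply x > threshold."""
    return [float(x > threshold) for x in values]


def apply_transforms(values, transforms):
    """Apply transforms in list order: the first entry is applied first."""
    for transform in transforms:
        values = transform(values)
    return values


# Hypothetical three-member ensemble at a single point.
members = [0.0, 0.0, 3.0]

mean_then_binary = apply_transforms(members, [ensemble_mean, to_binary])
binary_then_mean = apply_transforms(members, [to_binary, ensemble_mean])
```

The first ordering yields a binary forecast of the ensemble mean, while the second yields the fraction of members exceeding the threshold, which is an event probability.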