probly.evaluation.metrics¶
Collection of performance metrics to evaluate predictions.
Functions
- brier_score: Compute the Brier score of the predicted probabilities.
- coverage: Compute the coverage of set-valued predictions described in [AB21].
- coverage_convex_hull: Compute credal set coverage via convex hull [NZD25].
- covered_efficiency: Compute the efficiency of the set-valued predictions for which the ground truth is covered.
- efficiency: Compute the efficiency of set-valued predictions described in [AB21].
- expected_calibration_error: Compute the expected calibration error (ECE) of the predicted probabilities [GPSW17b].
- expected_calibration_error_binary: Expected Calibration Error (ECE) for binary classifiers.
- log_loss: Compute the log loss of the predicted probabilities.
- spherical_score: Compute the spherical score of the predicted probabilities.
- zero_one_loss: Compute the zero-one loss of the predicted probabilities.
- probly.evaluation.metrics.brier_score(probs, targets)[source]¶
Compute the Brier score of the predicted probabilities.
We assume the score to be negatively oriented, i.e., lower is better.
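The computation can be sketched with NumPy under the standard one-hot convention (an illustrative sketch, not the library's implementation; `brier_score_sketch` is a hypothetical name):

```python
import numpy as np

def brier_score_sketch(probs, targets):
    """Mean squared difference between predicted probabilities and
    one-hot encoded targets (negatively oriented: lower is better)."""
    probs = np.asarray(probs, dtype=float)
    one_hot = np.eye(probs.shape[1])[np.asarray(targets)]
    return np.mean(np.sum((probs - one_hot) ** 2, axis=1))

# A perfect prediction scores 0.0; the uniform prediction scores 0.5.
perfect = brier_score_sketch([[1.0, 0.0], [0.0, 1.0]], [0, 1])  # 0.0
uniform = brier_score_sketch([[0.5, 0.5], [0.5, 0.5]], [0, 1])  # 0.5
```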
- probly.evaluation.metrics.coverage(preds, targets)[source]¶
Compute the coverage of set-valued predictions described in [AB21].
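In the boolean-set case, coverage reduces to the fraction of instances whose true class is contained in the predicted set. A minimal sketch under that assumption (hypothetical `coverage_sketch`; the library may also handle credal-set inputs):

```python
import numpy as np

def coverage_sketch(preds, targets):
    """Fraction of instances whose true class is in the predicted set
    (boolean indicator sets of shape (n_instances, n_classes))."""
    preds = np.asarray(preds, dtype=bool)
    targets = np.asarray(targets)
    return preds[np.arange(len(preds)), targets].mean()

sets = np.array([[1, 1, 0], [0, 0, 1]], dtype=bool)
coverage_sketch(sets, [0, 0])  # 0.5: the first set covers class 0, the second does not
```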
- probly.evaluation.metrics.coverage_convex_hull(probs, targets, **kwargs)[source]¶
Compute credal set coverage via convex hull [NZD25].
The coverage is defined as the proportion of instances whose true distribution is contained in the convex hull. This is computed using linear programming by checking whether the target distribution can be expressed as a convex combination of the predicted distributions.
- Parameters:
- Returns:
The coverage.
- Return type:
cov
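The linear-programming feasibility check described above can be sketched for a single instance with `scipy.optimize.linprog` (an illustrative sketch, not the library's implementation; `in_convex_hull` is a hypothetical name):

```python
import numpy as np
from scipy.optimize import linprog

def in_convex_hull(points, target):
    """Feasibility LP: is `target` a convex combination of the rows of
    `points`? Decision variables are weights w >= 0 with sum(w) = 1
    and points.T @ w = target."""
    points = np.asarray(points, dtype=float)
    target = np.asarray(target, dtype=float)
    n = points.shape[0]
    A_eq = np.vstack([points.T, np.ones((1, n))])   # convex-combination constraints
    b_eq = np.concatenate([target, [1.0]])
    res = linprog(c=np.zeros(n), A_eq=A_eq, b_eq=b_eq, bounds=[(0, 1)] * n)
    return res.status == 0  # status 0: a feasible (optimal) point was found

# Distributions predicted by an ensemble for one instance:
preds = np.array([[0.8, 0.2], [0.2, 0.8]])
in_convex_hull(preds, [0.5, 0.5])  # lies between the two rows
in_convex_hull(preds, [0.9, 0.1])  # outside the segment
```

Averaging this indicator over all instances yields the coverage.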
- probly.evaluation.metrics.covered_efficiency(preds, targets)[source]¶
Compute the efficiency of the set-valued predictions for which the ground truth is covered.
In the case of a set over classes, this is the mean number of classes per predicted set. In the case of a credal set, it is the mean difference between the upper and lower probabilities.
- Parameters:
- Returns:
The efficiency of the set-valued predictions for which the ground truth is covered.
- Return type:
ceff
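For the boolean-set case, this restricts the efficiency computation to covered instances. A minimal sketch under that assumption (hypothetical `covered_efficiency_sketch`):

```python
import numpy as np

def covered_efficiency_sketch(preds, targets):
    """Mean set size, computed only over the instances whose true class
    is inside the predicted set (boolean-set case)."""
    preds = np.asarray(preds, dtype=bool)
    covered = preds[np.arange(len(preds)), np.asarray(targets)]
    return preds[covered].sum(axis=1).mean()

sets = np.array([[1, 1, 0], [0, 0, 1]], dtype=bool)
covered_efficiency_sketch(sets, [0, 0])  # 2.0: only the first set covers its target
```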
- probly.evaluation.metrics.efficiency(preds)[source]¶
Compute the efficiency of set-valued predictions described in [AB21].
In the case of a set over classes, this is the mean number of classes per predicted set. In the case of a credal set, it is the mean difference between the upper and lower probabilities.
- Parameters:
preds (ndarray) – Predictions of shape (n_instances, n_classes) or (n_instances, n_samples, n_classes).
- Returns:
The efficiency of the set-valued predictions.
- Return type:
eff
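The two cases described above can be sketched as follows (an illustrative sketch, not the library's implementation; `efficiency_sketch` is a hypothetical name):

```python
import numpy as np

def efficiency_sketch(preds):
    """Efficiency of set-valued predictions (lower = more informative).

    Boolean sets of shape (n_instances, n_classes): mean set size.
    Credal sets of shape (n_instances, n_samples, n_classes): mean
    difference between per-class upper and lower probabilities.
    """
    preds = np.asarray(preds)
    if preds.ndim == 2:               # set over classes
        return preds.sum(axis=1).mean()
    upper = preds.max(axis=1)         # per-class upper probability
    lower = preds.min(axis=1)         # per-class lower probability
    return (upper - lower).mean()

sets = np.array([[1, 0, 1], [1, 1, 1]])  # set sizes 2 and 3
efficiency_sketch(sets)                   # 2.5
```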
- probly.evaluation.metrics.expected_calibration_error(probs, labels, num_bins=10)[source]¶
Compute the expected calibration error (ECE) of the predicted probabilities [GPSW17b].
- Parameters:
- Returns:
The expected calibration error.
- Return type:
ece
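The usual binned ECE estimator can be sketched as follows (an illustrative sketch of the standard construction from [GPSW17b], not necessarily the library's exact binning; `ece_sketch` is a hypothetical name):

```python
import numpy as np

def ece_sketch(probs, labels, num_bins=10):
    """Bin instances by confidence (max predicted probability) and
    average the |accuracy - confidence| gap, weighted by bin size."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    conf = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, num_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - conf[mask].mean())
            ece += mask.mean() * gap
    return ece
```

A perfectly calibrated model has accuracy equal to confidence in every bin, so its ECE is zero.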
- probly.evaluation.metrics.expected_calibration_error_binary(probs, labels, num_bins=10)[source]¶
Expected Calibration Error (ECE) for binary classifiers.
This function works with sigmoid outputs.
probs: shape (N,) or (N, 1), sigmoid probabilities.
labels: shape (N,), binary labels in {0, 1}.
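One common binary ECE convention bins the sigmoid outputs p(y=1) directly and compares the mean predicted probability to the observed positive rate per bin; the library's binning and confidence definition may differ (hypothetical `binary_ece_sketch`):

```python
import numpy as np

def binary_ece_sketch(probs, labels, num_bins=10):
    """Bin sigmoid outputs p(y=1); in each bin, take the gap between
    the mean predicted probability and the observed positive rate,
    weighted by the fraction of instances in the bin."""
    probs = np.asarray(probs, dtype=float).reshape(-1)
    labels = np.asarray(labels, dtype=float)
    edges = np.linspace(0.0, 1.0, num_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (probs > lo) & (probs <= hi)
        if mask.any():
            ece += mask.mean() * abs(labels[mask].mean() - probs[mask].mean())
    return ece
```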
- probly.evaluation.metrics.log_loss(probs, targets)[source]¶
Compute the log loss of the predicted probabilities.
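A minimal sketch of the computation (not the library's implementation; `log_loss_sketch` is a hypothetical name, and the clipping constant is an assumption to avoid log(0)):

```python
import numpy as np

def log_loss_sketch(probs, targets, eps=1e-12):
    """Mean negative log-probability assigned to the true class
    (negatively oriented: lower is better)."""
    probs = np.asarray(probs, dtype=float)
    p_true = probs[np.arange(len(probs)), np.asarray(targets)]
    return -np.mean(np.log(np.clip(p_true, eps, 1.0)))

log_loss_sketch([[1.0, 0.0]], [0])  # 0.0: the true class gets probability 1
log_loss_sketch([[0.5, 0.5]], [0])  # log(2): the uniform prediction
```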
- probly.evaluation.metrics.spherical_score(probs, targets)[source]¶
Compute the spherical score of the predicted probabilities.
We assume the score to be negatively oriented, i.e., lower is better.
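The spherical score of a prediction is the probability assigned to the true class divided by the L2 norm of the probability vector; negating it makes lower better. A sketch under that sign convention (the library may use a different offset or normalization; `spherical_score_sketch` is a hypothetical name):

```python
import numpy as np

def spherical_score_sketch(probs, targets):
    """Negated mean spherical score p_y / ||p||_2, so lower is better."""
    probs = np.asarray(probs, dtype=float)
    p_true = probs[np.arange(len(probs)), np.asarray(targets)]
    norms = np.linalg.norm(probs, axis=1)
    return -np.mean(p_true / norms)

spherical_score_sketch([[1.0, 0.0]], [0])  # -1.0, the best attainable value
```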