probly.evaluation.ood¶
Unified OOD evaluation API for probly.
Functions

- evaluate_ood: Unified OOD evaluation API.
- out_of_distribution_detection_aupr: Perform out-of-distribution detection using AUPR (Area Under the Precision-Recall Curve).
- out_of_distribution_detection_auroc: Perform out-of-distribution detection using prediction functionals from ID and OOD data.
- out_of_distribution_detection_fnr_at_x_tpr: Perform out-of-distribution detection using the false negative rate at a user-given true positive rate.
- out_of_distribution_detection_fpr_at_x_tpr: Perform out-of-distribution detection using the false positive rate (FPR) at a given true positive rate.
- parse_dynamic_metric: Parse a dynamic metric specification.
- visualize_ood: Generate visualization plots from OOD scores.
- probly.evaluation.ood.evaluate_ood(in_distribution, out_distribution, metrics=None)[source]¶
Unified OOD evaluation API.
Provides backward compatibility while supporting multiple metrics.
Parameters: in_distribution :
Scores for in-distribution samples.
- out_distribution :
Scores for out-of-distribution samples.
- metrics : str, list of str, or None
None or “auroc”: returns a single AUROC value (backward compatible).
“all”: returns a dict with all available metrics.
list of str: returns a dict with the specified metrics.
Returns: dict[str, float]
Dictionary mapping metric names to values. If metrics is None or “auroc”, the dict contains only the “auroc” entry.
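The dispatch described by the metrics argument can be sketched as follows. This is a hypothetical stand-in, not probly's implementation; the registry here holds only a toy pairwise-ranking AUROC, where a full version would also register AUPR and the rate-based metrics:

```python
def _auroc_stub(id_scores, ood_scores):
    # Toy metric: fraction of (ood, id) pairs ranked correctly (ood higher).
    pairs = [(o, i) for o in ood_scores for i in id_scores]
    return sum(o > i for o, i in pairs) / len(pairs)

AVAILABLE = {"auroc": _auroc_stub}  # a full registry would also hold aupr, fpr@tpr, ...

def evaluate_ood_sketch(in_distribution, out_distribution, metrics=None):
    """Mimic the documented dispatch: None/'auroc' -> {'auroc': ...},
    'all' -> every registered metric, a list -> the named subset."""
    if metrics is None or metrics == "auroc":
        names = ["auroc"]
    elif metrics == "all":
        names = list(AVAILABLE)
    else:
        names = list(metrics)
    return {name: AVAILABLE[name](in_distribution, out_distribution) for name in names}
```

Note that even in the backward-compatible case the sketch returns a dict with a single “auroc” entry, matching the documented return shape.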
- probly.evaluation.ood.out_of_distribution_detection_aupr(in_distribution, out_distribution)[source]¶
Perform out-of-distribution detection using AUPR (Area Under the Precision-Recall Curve).
This metric evaluates how well the model distinguishes between in- and out-of-distribution samples, focusing more on positive class (OOD) precision and recall.
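With OOD as the positive class, AUPR can be estimated as average precision: rank all samples by score and average the precision at each true positive. A minimal pure-Python sketch (hypothetical, assuming larger scores mean "more OOD"; ties are broken arbitrarily):

```python
def aupr(id_scores, ood_scores):
    """Average precision with OOD as the positive class."""
    scored = [(s, 1) for s in ood_scores] + [(s, 0) for s in id_scores]
    scored.sort(key=lambda t: t[0], reverse=True)  # highest score first
    tp = 0
    precisions = []
    for rank, (_, label) in enumerate(scored, start=1):
        tp += label
        if label == 1:
            precisions.append(tp / rank)  # precision at each recall step
    return sum(precisions) / len(precisions)
```

Perfect separation yields 1.0; the worse the ranking of OOD samples, the lower the average precision.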
- probly.evaluation.ood.out_of_distribution_detection_auroc(in_distribution, out_distribution)[source]¶
Perform out-of-distribution detection using AUROC (Area Under the ROC Curve) computed on prediction functionals from in-distribution (ID) and out-of-distribution (OOD) data.
The functional can be epistemic uncertainty, as is common, but also, e.g., softmax confidence.
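Whatever functional is used, the AUROC over two score samples equals the probability that a randomly drawn OOD score outranks a randomly drawn ID score (the Mann-Whitney U statistic). A minimal sketch of that estimator, assuming larger scores mean "more OOD":

```python
def auroc(id_scores, ood_scores):
    # Mann-Whitney U estimate of AUROC: P(ood > id) + 0.5 * P(ood == id).
    wins = sum(
        1.0 if o > i else 0.5 if o == i else 0.0
        for o in ood_scores
        for i in id_scores
    )
    return wins / (len(ood_scores) * len(id_scores))
```

A value of 0.5 corresponds to chance-level separation, 1.0 to perfect separation.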
- probly.evaluation.ood.out_of_distribution_detection_fnr_at_x_tpr(in_distribution, out_distribution, tpr_target=0.95)[source]¶
Perform out-of-distribution detection using the false negative rate (FNR) at a user-given true positive rate (TPR).
If no tpr_target is specified, the default is 0.95.
- probly.evaluation.ood.out_of_distribution_detection_fpr_at_x_tpr(in_distribution, out_distribution, tpr_target=0.95)[source]¶
Perform out-of-distribution detection using the false positive rate (FPR) at a given true positive rate (TPR).
If no tpr_target is specified, the default is 0.95.
The functional can be epistemic uncertainty, as is common, but also, e.g., softmax confidence.
- Parameters:
- in_distribution :
Scores for in-distribution samples.
- out_distribution :
Scores for out-of-distribution samples.
- tpr_target : float, optional
Target true positive rate (default 0.95).
- Returns:
fpr_at_target
FPR at the first threshold where TPR >= tpr_target.
- Return type:
float
Notes
Assumes that larger scores correspond to the positive class (out-of-distribution).
If tpr_target cannot be reached, a ValueError is raised.
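Under the stated convention (larger score = positive = OOD), the metric can be computed by sweeping thresholds from high to low and reporting the FPR at the first threshold where the TPR reaches the target. A hypothetical sketch, not probly's implementation:

```python
def fpr_at_x_tpr(id_scores, ood_scores, tpr_target=0.95):
    """FPR at the first (highest) threshold where TPR >= tpr_target.

    Larger scores are treated as positive (OOD); raises ValueError if
    the target cannot be reached, mirroring the documented behaviour.
    """
    # Candidate thresholds: every observed score, highest first.
    thresholds = sorted(set(id_scores) | set(ood_scores), reverse=True)
    for t in thresholds:
        tpr = sum(s >= t for s in ood_scores) / len(ood_scores)
        if tpr >= tpr_target:
            return sum(s >= t for s in id_scores) / len(id_scores)
    raise ValueError(f"tpr_target={tpr_target} cannot be reached")
```

With perfectly separated scores the FPR at any achievable target is 0.0; overlapping scores force the threshold down and admit ID false positives.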
- probly.evaluation.ood.parse_dynamic_metric(spec)[source]¶
Parse dynamic metric specification.
Examples
fpr@0.8
fnr@95%
fpr -> default threshold is 0.95
fnr -> default threshold is 0.95
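A parser consistent with the examples above ("fpr@0.8", "fnr@95%", and bare "fpr"/"fnr" defaulting to 0.95) might look like the sketch below; the exact grammar probly accepts may differ:

```python
import re

def parse_dynamic_metric_sketch(spec, default=0.95):
    """Parse 'fpr@0.8', 'fnr@95%', or bare 'fpr'/'fnr' into (name, threshold)."""
    m = re.fullmatch(r"(fpr|fnr)(?:@(\d+(?:\.\d+)?)(%?))?", spec.strip())
    if m is None:
        raise ValueError(f"unrecognised metric spec: {spec!r}")
    name, value, pct = m.groups()
    if value is None:
        return name, default  # bare 'fpr' / 'fnr' -> default threshold
    threshold = float(value) / 100 if pct else float(value)
    return name, threshold
```

The percent form is normalised to a fraction, so "fnr@95%" and "fnr@0.95" are equivalent.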
- probly.evaluation.ood.visualize_ood(in_distribution, out_distribution, plot_types=None, invert_scores=True)[source]¶
Generate visualization plots from OOD scores.
Parameters: in_distribution :
Scores for in-distribution samples.
- out_distribution :
Scores for out-of-distribution samples.
- plot_types : list[str], optional
List of specific plots to return (e.g. [‘roc’, ‘hist’, ‘pr’]). If None, all plots are generated.
- invert_scores : bool
If True (default), assumes scores are ‘Confidence’ (High = ID). They will be inverted (1.0 - score) for metrics where OOD is the positive class. If False, assumes scores are ‘Anomaly Scores’ (High = OOD).
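The invert_scores convention can be illustrated directly. The helper below is a hypothetical sketch of the documented behaviour: confidence-style scores (high = ID) are flipped so that OOD becomes the positive class, while anomaly-style scores pass through unchanged:

```python
def to_anomaly_scores(scores, invert_scores=True):
    # invert_scores=True: inputs are confidences (high = ID) -> flip to 1.0 - s
    # invert_scores=False: inputs are already anomaly scores (high = OOD)
    return [1.0 - s for s in scores] if invert_scores else list(scores)
```

After this normalisation, larger values always indicate "more OOD", matching the assumption the rate-based metrics make.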