probly.evaluation.ood.evaluate_ood

probly.evaluation.ood.evaluate_ood(in_distribution: ndarray | list[float], out_distribution: ndarray | list[float], metrics: str | list[str] | None = None) → dict[str, float]

Unified OOD evaluation API.

With the default arguments only the AUROC is computed, which preserves backward compatibility; additional metrics are requested through the metrics argument.

Parameters:
  • in_distribution – Scores for in-distribution samples.

  • out_distribution – Scores for out-of-distribution samples.

  • metrics – Metrics to compute. Can be:

      - None or “auroc”: returns a dict containing only the AUROC value (backward compatible).
      - “all”: returns a dict with all available metrics.
      - list: returns a dict with the specified metrics.

Returns:

A dictionary mapping metric names to values. If metrics is None or “auroc”, the dict contains only the “auroc” entry.
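
Example:

The following is a minimal sketch of the documented metrics dispatch and return shape, not probly's actual implementation. The names evaluate_ood_sketch and auroc_sketch are hypothetical, the stand-in registers only an AUROC metric, and it assumes that higher scores indicate out-of-distribution samples (the score convention is not stated above). Treating any other single string as a one-item list is also an assumption.

```python
from __future__ import annotations

from collections.abc import Sequence


def auroc_sketch(in_scores: Sequence[float], out_scores: Sequence[float]) -> float:
    """Pairwise AUROC: fraction of (OOD, in-distribution) pairs ranked correctly.

    Assumption: a higher score means "more likely OOD". Ties count as 0.5.
    """
    if len(in_scores) == 0 or len(out_scores) == 0:
        raise ValueError("both score sequences must be non-empty")
    wins = 0.0
    for o in out_scores:
        for i in in_scores:
            if o > i:
                wins += 1.0
            elif o == i:
                wins += 0.5
    return wins / (len(in_scores) * len(out_scores))


def evaluate_ood_sketch(
    in_distribution: Sequence[float],
    out_distribution: Sequence[float],
    metrics: str | list[str] | None = None,
) -> dict[str, float]:
    """Hypothetical stand-in that mirrors the documented `metrics` options."""
    # Only AUROC is registered here; the real library may offer more metrics.
    available = {"auroc": auroc_sketch}

    if metrics is None or metrics == "auroc":
        names = ["auroc"]            # backward-compatible default
    elif metrics == "all":
        names = list(available)      # every registered metric
    elif isinstance(metrics, str):
        names = [metrics]            # assumption: a single name acts like a one-item list
    else:
        names = list(metrics)        # explicit list of metric names

    unknown = [name for name in names if name not in available]
    if unknown:
        raise ValueError(f"unknown metrics: {unknown}")

    return {name: available[name](in_distribution, out_distribution) for name in names}


in_scores = [0.1, 0.2, 0.3, 0.4]
out_scores = [0.35, 0.6, 0.7, 0.9]

print(evaluate_ood_sketch(in_scores, out_scores))  # {'auroc': 0.9375}
```

In this toy data, 15 of the 16 (OOD, in-distribution) score pairs are ordered correctly, so the AUROC is 15/16 = 0.9375. With the library itself, the corresponding call would follow the signature above, for example evaluate_ood(in_scores, out_scores, metrics="all").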