probly.train.evidential.torch¶
Unified Evidential Train Function.
Functions
- der_loss – Deep Evidential Regression loss for uncertainty-aware regression.
- dirichlet_entropy – Dirichlet entropy for predictive uncertainty estimation.
- evidential_ce_loss – Evidential Cross Entropy Loss for classification uncertainty estimation.
- evidential_kl_divergence – Evidential KL divergence loss for classification uncertainty estimation.
- evidential_log_loss – Evidential Log Loss for classification uncertainty estimation.
- evidential_mse_loss – Evidential Mean Squared Error loss for classification uncertainty estimation.
- evidential_nignll_loss – Evidence-based Normal-Inverse-Gamma (NIG) regression loss.
- evidential_regression_regularization – Regularization term for evidential regression.
- ird_loss – Information Robust Dirichlet (IRD) loss for predictive uncertainty estimation.
- kl_dirichlet – Compute KL(Dir(alpha_p) || Dir(alpha_q)) for each batch item.
- lp_fn – Lp calibration loss for predictive uncertainty estimation.
- make_in_domain_target_alpha – Construct target Dirichlet distribution for in-distribution samples.
- make_ood_target_alpha – Construct flat Dirichlet target distribution for out-of-distribution samples.
- natpn_loss – Natural Posterior Network (NatPN) classification loss.
- normal_wishart_log_prob – Compute simplified univariate Normal-Wishart log-likelihood.
- pn_loss – Paired ID/OOD training loss for Dirichlet Prior Networks.
- postnet_loss – Posterior Networks (PostNet) classification loss.
- predictive_probs – Expected categorical probabilities under Dirichlet.
- regularization_fn – Regularization term for Information Robust Dirichlet Networks.
- rpn_distillation_loss – Compute the distillation loss for Regression Prior Networks (RPN).
- rpn_loss – Paired in-distribution and out-of-distribution loss for Regression Prior Networks.
- rpn_ng_kl – KL divergence between two Normal-Gamma distributions.
- rpn_prior – Normal-Gamma prior with zero evidence for Regression Prior Networks.
- unified_evidential_train – Trains a given neural network using the evidential learning approach of a selected paper.
- probly.train.evidential.torch.der_loss(y, mu, kappa, alpha, beta, lam=0.01)[source]¶
Deep Evidential Regression loss for uncertainty-aware regression.
Combines a Student-t negative log-likelihood with an evidence regularization term as proposed by Amini et al. (2020).
- Reference:
Amini et al., “Deep Evidential Regression”, NeurIPS 2020. https://arxiv.org/abs/1910.02600
- Parameters:
y (Tensor) – Ground-truth regression targets, shape (B,) or (B, 1).
mu (Tensor) – Predicted mean of the Normal-Inverse-Gamma distribution, shape (B,).
kappa (Tensor) – Predicted scaling parameter, shape (B,).
alpha (Tensor) – Predicted shape parameter, shape (B,).
beta (Tensor) – Predicted scale parameter, shape (B,).
lam (float) – Weight of the evidence regularization term.
- Returns:
Scalar Deep Evidential Regression loss averaged over the batch.
- Return type:
Tensor
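For illustration, the DER objective can be written as a scalar sketch in plain Python, following the formulas published by Amini et al. (2020): the Student-t negative log-likelihood induced by the Normal-Inverse-Gamma parameters plus the evidence regularizer. This is a hedged reference sketch, not the library's batched torch implementation; the name `der_loss_scalar` is illustrative only.

```python
import math

def der_loss_scalar(y, mu, kappa, alpha, beta, lam=0.01):
    """Scalar sketch of the Deep Evidential Regression loss (Amini et al., 2020)."""
    # Student-t NLL of the NIG predictive distribution.
    omega = 2.0 * beta * (1.0 + kappa)
    nll = (0.5 * math.log(math.pi / kappa)
           - alpha * math.log(omega)
           + (alpha + 0.5) * math.log(kappa * (y - mu) ** 2 + omega)
           + math.lgamma(alpha)
           - math.lgamma(alpha + 0.5))
    # Evidence regularizer: penalize total evidence (2*kappa + alpha)
    # in proportion to the prediction error.
    reg = abs(y - mu) * (2.0 * kappa + alpha)
    return nll + lam * reg
```

Note how the loss grows with the residual |y − mu| whenever the predicted evidence is large, which is exactly the "confident but wrong" behaviour the regularizer penalizes.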
- probly.train.evidential.torch.dirichlet_entropy(alpha)[source]¶
Dirichlet entropy for predictive uncertainty estimation.
Used in Information Robust Dirichlet Networks to encourage uncertainty on adversarial or out-of-distribution inputs by maximizing the entropy of the Dirichlet distribution.
- Reference:
Tsiligkaridis, “Information Robust Dirichlet Networks for Predictive Uncertainty Estimation”, 2019. https://arxiv.org/abs/1910.04819
- The entropy is given by:
H(alpha) = log B(alpha) + (alpha_0 - K) * ψ(alpha_0) - Σ_k (alpha_k - 1) * ψ(alpha_k)
where alpha_0 = Σ_k alpha_k and ψ is the digamma function.
- Parameters:
alpha (Tensor) – Dirichlet concentration parameters, shape (B_a, K), must be > 0.
- Returns:
Scalar Dirichlet entropy summed over the batch.
- Raises:
ValueError – If alpha contains non-positive values.
- Return type:
Tensor
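As a concrete reference, the entropy formula can be evaluated for a single concentration vector in plain Python. This is a minimal sketch, not the library's torch implementation; the digamma helper below is a standard recurrence-plus-asymptotic-series approximation introduced here only to keep the example self-contained.

```python
import math

def digamma(x):
    """Approximate psi(x) via the recurrence psi(x) = psi(x+1) - 1/x,
    then an asymptotic series once x >= 6."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return (result + math.log(x) - 0.5 / x
            - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252)))

def dirichlet_entropy_scalar(alpha):
    """Entropy H(alpha) of Dir(alpha) for a single concentration vector."""
    if any(a <= 0 for a in alpha):
        raise ValueError("alpha must contain only positive values")
    a0 = sum(alpha)
    k = len(alpha)
    log_b = sum(math.lgamma(a) for a in alpha) - math.lgamma(a0)  # log B(alpha)
    return (log_b
            + (a0 - k) * digamma(a0)
            - sum((a - 1.0) * digamma(a) for a in alpha))
```

Sanity checks: Dir(1, 1) is the uniform distribution on [0, 1], so its entropy is 0, and Dir(1, 1, 1) has entropy log B(1, 1, 1) = -log 2.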
- probly.train.evidential.torch.evidential_ce_loss(alphas, targets)[source]¶
Evidential Cross Entropy Loss for classification uncertainty estimation.
Implements the evidential cross-entropy loss proposed by Sensoy et al. (2018) for Evidential Deep Learning.
- Reference:
Sensoy et al., “Evidential Deep Learning to Quantify Classification Uncertainty”, NeurIPS 2018. https://arxiv.org/abs/1806.01768
- probly.train.evidential.torch.evidential_kl_divergence(alphas, targets)[source]¶
Evidential KL divergence loss for classification uncertainty estimation.
Implements the KL divergence regularization term proposed by Sensoy et al. (2018) for Evidential Deep Learning.
- Reference:
Sensoy et al., “Evidential Deep Learning to Quantify Classification Uncertainty”, NeurIPS 2018. https://arxiv.org/abs/1806.01768
- probly.train.evidential.torch.evidential_log_loss(alphas, targets)[source]¶
Evidential Log Loss for classification uncertainty estimation.
Implements the evidential log loss proposed by Sensoy et al. (2018) for Evidential Deep Learning.
- Reference:
Sensoy et al., “Evidential Deep Learning to Quantify Classification Uncertainty”, NeurIPS 2018. https://arxiv.org/abs/1806.01768
- probly.train.evidential.torch.evidential_mse_loss(alphas, targets)[source]¶
Evidential Mean Squared Error loss for classification uncertainty estimation.
Implements the evidential MSE loss proposed by Sensoy et al. (2018), combining prediction error and predictive variance under a Dirichlet distribution.
- Reference:
Sensoy et al., “Evidential Deep Learning to Quantify Classification Uncertainty”, NeurIPS 2018. https://arxiv.org/abs/1806.01768
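Sensoy et al.'s MSE loss has a closed form under the Dirichlet: with p_k = alpha_k / S and S = Σ_k alpha_k, the per-sample loss is Σ_k (y_k - p_k)^2 + p_k (1 - p_k) / (S + 1), i.e. squared error of the expected probabilities plus the predictive variance. A scalar sketch (illustrative only; `evidential_mse_scalar` is not the library's batched function):

```python
def evidential_mse_scalar(alpha, y):
    """Per-sample evidential MSE (Sensoy et al., 2018) for one-hot target y."""
    s = sum(alpha)
    loss = 0.0
    for a_k, y_k in zip(alpha, y):
        p_k = a_k / s  # expected probability of class k under Dir(alpha)
        # squared error term + Dirichlet predictive variance term
        loss += (y_k - p_k) ** 2 + p_k * (1.0 - p_k) / (s + 1.0)
    return loss
```

Evidence concentrated on the correct class drives both terms toward zero, while a flat Dirichlet keeps both the error and the variance contributions large.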
- probly.train.evidential.torch.evidential_nignll_loss(inputs, targets)[source]¶
Evidence-based Normal-Inverse-Gamma (NIG) regression loss.
Implements the negative log-likelihood term used in Deep Evidential Regression as proposed by Amini et al. (2020).
- Reference:
Amini et al., “Deep Evidential Regression”, NeurIPS 2020. https://arxiv.org/abs/1910.02600
- Parameters:
- Returns:
Scalar NIG negative log-likelihood loss averaged over the batch.
- Return type:
Tensor
- probly.train.evidential.torch.evidential_regression_regularization(inputs, targets)[source]¶
Regularization term for evidential regression.
Implements the evidence regularization component proposed by Amini et al. (2020) to penalize confident but inaccurate predictions in Deep Evidential Regression.
- Reference:
Amini et al., “Deep Evidential Regression”, NeurIPS 2020. https://arxiv.org/abs/1910.02600
- Parameters:
- Returns:
Scalar evidential regression regularization loss averaged over the batch.
- Return type:
Tensor
- probly.train.evidential.torch.ird_loss(alpha, y, adversarial_alpha=None, p=2.0, lam=1.0, gamma=1.0, normalize=True)[source]¶
Information Robust Dirichlet (IRD) loss for predictive uncertainty estimation.
Implements the loss proposed by Tsiligkaridis (2019), combining an Lp calibration term, a trigamma-based regularization term, and an optional entropy-based adversarial regularizer.
- Reference:
Tsiligkaridis, “Information Robust Dirichlet Networks for Predictive Uncertainty Estimation”, 2019. https://arxiv.org/abs/1910.04819
- Parameters:
alpha (Tensor) – Dirichlet concentration parameters, shape (B, K).
y (Tensor) – One-hot encoded class labels, shape (B, K).
adversarial_alpha (Tensor | None) – Dirichlet concentration parameters for adversarial inputs, shape (B_a, K).
p (float) – Lp norm exponent controlling calibration strength.
lam (float) – Weight of the regularization term.
gamma (float) – Weight of the entropy regularization term.
normalize (bool) – Whether to normalize loss terms by batch size.
- Returns:
Scalar IRD loss summed over all input examples.
- Return type:
Tensor
- probly.train.evidential.torch.kl_dirichlet(prior_alpha, posterior_alpha)[source]¶
Compute KL(Dir(alpha_p) || Dir(alpha_q)) for each batch item.
Used by Posterior Networks, Dirichlet Prior Networks, and PN-style in-distribution / out-of-distribution losses to compare Dirichlet distributions.
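The KL divergence between two Dirichlet distributions has the closed form KL(Dir(a) || Dir(b)) = log Γ(a_0) - Σ_k log Γ(a_k) - log Γ(b_0) + Σ_k log Γ(b_k) + Σ_k (a_k - b_k)(ψ(a_k) - ψ(a_0)), with a_0 = Σ_k a_k. A single-vector sketch (illustrative; the digamma helper is a standard approximation added only to keep the example self-contained):

```python
import math

def _digamma(x):
    # psi(x) via recurrence to x >= 6, then an asymptotic series.
    r = 0.0
    while x < 6.0:
        r -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return (r + math.log(x) - 0.5 / x
            - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252)))

def kl_dirichlet_scalar(alpha_p, alpha_q):
    """KL(Dir(alpha_p) || Dir(alpha_q)) for single concentration vectors."""
    a0 = sum(alpha_p)
    b0 = sum(alpha_q)
    kl = math.lgamma(a0) - math.lgamma(b0)
    for a, b in zip(alpha_p, alpha_q):
        kl += math.lgamma(b) - math.lgamma(a)
        kl += (a - b) * (_digamma(a) - _digamma(a0))
    return kl
```

As expected of a KL divergence, it is zero for identical parameters and positive otherwise.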
- probly.train.evidential.torch.lp_fn(alpha, y, p=2.0)[source]¶
Lp calibration loss for predictive uncertainty estimation.
Implements the Lp calibration loss proposed by Tsiligkaridis (2019) for Information Robust Dirichlet Networks.
- Reference:
Tsiligkaridis, “Information Robust Dirichlet Networks for Predictive Uncertainty Estimation”, 2019. https://arxiv.org/abs/1910.04819
- The loss is computed using the expectation-based formulation:
F_i = ( E[(1 - p_c)^p] + Σ_{j≠c} E[p_j^p] )^(1/p)
- Parameters:
- Returns:
Scalar Lp calibration loss summed over the batch.
- Raises:
ValueError – If alpha contains non-positive values or if shapes do not match.
- Return type:
Tensor
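Under a Dirichlet, the moments in F_i are available in closed form: E[p_j^p] = Γ(α_j + p) Γ(α_0) / (Γ(α_j) Γ(α_0 + p)), and since 1 - p_c ~ Beta(α_0 - α_c, α_c), the term E[(1 - p_c)^p] has the same form. A scalar sketch of the resulting calibration term (illustrative, not the library's implementation; `c` denotes the index of the true class):

```python
import math

def _beta_moment(a, a0, p):
    # E[x^p] for x ~ Beta(a, a0 - a): Gamma(a+p)Gamma(a0) / (Gamma(a)Gamma(a0+p))
    return math.exp(math.lgamma(a + p) + math.lgamma(a0)
                    - math.lgamma(a) - math.lgamma(a0 + p))

def lp_calibration_scalar(alpha, c, p=2.0):
    """Lp calibration term F_i (Tsiligkaridis, 2019) for one sample, true class c."""
    a0 = sum(alpha)
    # E[(1 - p_c)^p]: 1 - p_c is Beta(a0 - alpha[c], alpha[c]) distributed.
    total = _beta_moment(a0 - alpha[c], a0, p)
    # Add E[p_j^p] for every wrong class j.
    for j, a_j in enumerate(alpha):
        if j != c:
            total += _beta_moment(a_j, a0, p)
    return total ** (1.0 / p)
```

For p = 1 the moments reduce to α_j / α_0, so F_i = 2 (1 - α_c / α_0), which gives an easy hand check.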
- probly.train.evidential.torch.make_in_domain_target_alpha(y)[source]¶
Construct target Dirichlet distribution for in-distribution samples.
Used by Dirichlet Prior Networks, Posterior Networks, and PN-style paired losses to create a sharp (peaked) Dirichlet target for supervised in-distribution training.
- probly.train.evidential.torch.make_ood_target_alpha(batch_size, num_classes=10, alpha0=10)[source]¶
Construct flat Dirichlet target distribution for out-of-distribution samples.
Used by Dirichlet Prior Networks, Posterior Networks, and PN-style paired losses to encourage high uncertainty on out-of-distribution inputs by assigning uniform Dirichlet concentration parameters.
- probly.train.evidential.torch.natpn_loss(alpha, y, entropy_weight=0.0001)[source]¶
Natural Posterior Network (NatPN) classification loss.
Implements the Dirichlet-Categorical Bayesian loss with an entropy regularizer as proposed by Charpentier et al. (2022).
- Reference:
Charpentier et al., “Natural Posterior Network”, ICLR 2022. https://arxiv.org/abs/2105.04471
- Parameters:
- Returns:
Scalar NatPN loss averaged over the batch.
- Return type:
Tensor
- probly.train.evidential.torch.normal_wishart_log_prob(m, l_precision, kappa, nu, mu_k, sigma2_k)[source]¶
Compute simplified univariate Normal-Wishart log-likelihood.
- Parameters:
m (Tensor) – Prior mean parameter.
l_precision (Tensor) – Precision (> 0), formerly L.
kappa (Tensor) – Strength parameter (> 0).
nu (Tensor) – Degrees of freedom (> 2).
mu_k (Tensor) – Sample mean from ensemble.
sigma2_k (Tensor) – Sample variance from ensemble.
- Returns:
Log-likelihood under the Normal-Wishart model.
- Return type:
Tensor
- probly.train.evidential.torch.pn_loss(model, x_in, y_in, x_ood)[source]¶
Paired ID/OOD training loss for Dirichlet Prior Networks.
Combines KL divergence to sharp in-distribution targets and flat out-of-distribution targets, with an additional cross-entropy term for classification stability.
- Reference:
Malinin and Gales, “Predictive Uncertainty Estimation via Prior Networks”, NeurIPS 2018. https://arxiv.org/abs/1802.10501
- Parameters:
- Returns:
Scalar paired ID+OOD Prior Networks loss.
- Return type:
Tensor
- probly.train.evidential.torch.postnet_loss(alpha, y, entropy_weight=1e-05)[source]¶
Posterior Networks (PostNet) classification loss.
Implements the expected cross-entropy loss with an entropy regularizer as proposed by Charpentier et al. (2020) for Posterior Networks.
- Reference:
Charpentier et al., “Posterior Networks: Uncertainty Estimation without OOD Samples via Density-Based Pseudo-Counts”, NeurIPS 2020. https://arxiv.org/abs/2006.09239
- probly.train.evidential.torch.predictive_probs(alpha)[source]¶
Expected categorical probabilities under Dirichlet.
Used by Posterior Networks, Dirichlet Prior Networks, and other Dirichlet-based classification models to obtain predictive class probabilities.
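The expected categorical distribution under Dir(alpha) is simply alpha normalized by its sum, p_k = alpha_k / Σ_j alpha_j. A one-line sketch (illustrative scalar version, not the library's tensor implementation):

```python
def predictive_probs_scalar(alpha):
    """Mean of Dir(alpha): p_k = alpha_k / sum(alpha)."""
    s = sum(alpha)
    return [a / s for a in alpha]
```

For example, concentrations [2, 1, 1] yield predictive probabilities [0.5, 0.25, 0.25].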
- probly.train.evidential.torch.regularization_fn(alpha, y)[source]¶
Regularization term for Information Robust Dirichlet Networks.
Penalizes high Dirichlet concentration values for incorrect classes to encourage confident but well-calibrated predictions.
- Reference:
Tsiligkaridis, “Information Robust Dirichlet Networks for Predictive Uncertainty Estimation”, 2019. https://arxiv.org/abs/1910.04819
- Parameters:
- Returns:
Scalar regularization loss summed over classes and batch.
- Raises:
ValueError – If alpha and y shapes do not match.
- Return type:
Tensor
- probly.train.evidential.torch.rpn_distillation_loss(rpn_params, mus, variances)[source]¶
Compute the distillation loss for Regression Prior Networks (RPN).
This loss measures how well the RPN’s Normal-Wishart distribution matches the empirical ensemble distributions (mu_k, var_k).
- probly.train.evidential.torch.rpn_loss(model, x_id, y_id, x_ood, lam_der=0.01, lam_rpn=50.0)[source]¶
Paired in-distribution and out-of-distribution loss for Regression Prior Networks.
Computes the Regression Prior Network (RPN) training objective using paired in-distribution (ID) and out-of-distribution (OOD) mini-batches. The loss combines a supervised Deep Evidential Regression (DER) term on ID data with a KL regularization term that pushes OOD predictions back toward the Normal-Gamma prior.
- Reference:
Malinin et al., “Regression Prior Networks”, NeurIPS 2020. https://arxiv.org/abs/2006.11590
- Parameters:
model (Module) – Regression model returning (mu, kappa, alpha, beta) for each input.
x_id (Tensor) – In-distribution inputs, shape (B_id, …).
y_id (Tensor) – In-distribution regression targets, shape (B_id,) or compatible.
x_ood (Tensor) – Out-of-distribution inputs, shape (B_ood, …).
lam_der (float) – Weight of the DER evidence regularization term.
lam_rpn (float) – Weight of the RPN prior-matching KL term.
- Returns:
Scalar paired ID+OOD Regression Prior Network loss.
- Return type:
Tensor
- probly.train.evidential.torch.rpn_ng_kl(mu, kappa, alpha, beta, mu0, kappa0, alpha0, beta0)[source]¶
KL divergence between two Normal-Gamma distributions.
Computes the KL divergence between a predicted Normal-Gamma distribution and a prior Normal-Gamma distribution, as used in Regression Prior Networks to regularize out-of-distribution predictions.
- Reference:
Malinin et al., “Regression Prior Networks”, NeurIPS 2020. https://arxiv.org/abs/2006.11590
- Parameters:
mu (Tensor) – Predicted mean parameter, shape (B,).
kappa (Tensor) – Predicted scaling parameter, shape (B,).
alpha (Tensor) – Predicted shape parameter, shape (B,).
beta (Tensor) – Predicted scale parameter, shape (B,).
mu0 (Tensor) – Prior mean parameter, shape (B,).
kappa0 (Tensor) – Prior scaling parameter, shape (B,).
alpha0 (Tensor) – Prior shape parameter, shape (B,).
beta0 (Tensor) – Prior scale parameter, shape (B,).
- Returns:
Scalar KL divergence between predicted and prior Normal-Gamma distributions, averaged over the batch.
- Return type:
Tensor
- probly.train.evidential.torch.rpn_prior(shape, device)[source]¶
Normal-Gamma prior with zero evidence for Regression Prior Networks.
Constructs an uninformative Normal-Gamma prior used in Regression Prior Networks to regularize out-of-distribution predictions via KL divergence, as proposed by Malinin et al. (2020).
- Reference:
Malinin et al., “Regression Prior Networks”, NeurIPS 2020. https://arxiv.org/abs/2006.11590
- Parameters:
- Returns:
Tuple (mu0, kappa0, alpha0, beta0) of Normal-Gamma prior parameters, each with the specified shape.
- Return type:
tuple
- probly.train.evidential.torch.unified_evidential_train(mode, model, dataloader, loss_fn=None, oodloader=None, class_count=None, epochs=5, lr=0.001, device='cpu')[source]¶
Trains a given neural network using one of several evidential learning approaches, selected by the paper-specific mode.
- Parameters:
mode (Literal['PostNet', 'NatPostNet', 'EDL', 'PrNet', 'IRD', 'DER', 'RPN']) – Identifier of the paper-based training approach to be used. Must be one of: “PostNet”, “NatPostNet”, “EDL”, “PrNet”, “IRD”, “DER” or “RPN”.
model (nn.Module) – The neural network to be trained.
dataloader (DataLoader) – PyTorch DataLoader providing the in-distribution training samples and corresponding labels.
loss_fn (Callable[..., torch.Tensor] | None) – Loss function used for training. The expected inputs of the loss function depend on the selected mode.
oodloader (DataLoader | None) – PyTorch DataLoader providing the out-of-distribution training samples and corresponding labels. Only required for certain modes such as “PrNet”.
class_count (torch.Tensor | None) – Tensor containing the number of samples per class.
epochs (int) – Number of training epochs.
lr (float) – Learning rate used by the optimizer.
device (str) – Device on which the model is trained (e.g. “cpu” or “cuda”).
- Returns:
None. The function trains the provided model in place and does not return a value, but prints the total loss per epoch.
- Return type:
None