probly.evaluation.active_learning.loop.active_learning_loop

probly.evaluation.active_learning.loop.active_learning_loop(model: Estimator, x_train: ndarray, y_train: ndarray, x_test: ndarray, y_test: ndarray, *, query_fn: QueryFn | None = None, metric: MetricFn | str | None = None, pool_size: int = 10, num_samples: int = 1, n_iterations: int = 20, seed: int | None = None) → tuple[ndarray, ndarray, list[float], float]

Run a pool-based active learning loop and evaluate on a held-out test set.

The initial labeled set is drawn randomly (pool_size samples) from x_train / y_train; the remaining training samples form the unlabeled pool. At each iteration the model is retrained on the growing labeled set, uncertainty is scored on the pool, and the pool_size most uncertain samples are queried and moved to the labeled set. Performance is measured on the fixed x_test / y_test split.
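The bookkeeping described above can be sketched as follows. This is a simplified illustration, not the library's implementation: `fit_predict` and `uncertainty` are placeholder callables standing in for the model and query function, and the metric is hard-coded to negative MSE.

```python
import numpy as np

def al_loop_sketch(x_train, y_train, x_test, y_test,
                   fit_predict, uncertainty,
                   pool_size=10, n_iterations=20, seed=None):
    """Simplified pool-based active learning loop (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n = len(x_train)
    # Initial labeled set: pool_size random indices; the rest form the pool.
    labeled = list(rng.choice(n, size=pool_size, replace=False))
    pool = [i for i in range(n) if i not in set(labeled)]
    scores = []
    for _ in range(n_iterations):
        if not pool:
            break
        # Retrain on the growing labeled set; evaluate on the fixed test split.
        y_pred = fit_predict(x_train[labeled], y_train[labeled], x_test)
        scores.append(-float(np.mean((y_test - y_pred) ** 2)))  # negative MSE
        # Score the pool and move the pool_size most uncertain samples.
        u = uncertainty(x_train[pool])
        top = np.argsort(u)[::-1][:pool_size]
        queried = [pool[i] for i in top]
        labeled.extend(queried)
        pool = [i for i in pool if i not in set(queried)]
    return x_train[labeled], y_train[labeled], scores
```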

Parameters:
  • model – A sklearn-compatible estimator with fit and predict. Models that also expose predict_proba automatically use classification uncertainty measures.

  • x_train – Training pool features, shape (n_train, n_features). Accepts numpy arrays or torch tensors.

  • y_train – Training pool targets, shape (n_train,).

  • x_test – Held-out test features used to evaluate performance each iteration.

  • y_test – Held-out test targets.

  • query_fn – Uncertainty scoring function with signature (outputs: np.ndarray) -> np.ndarray where outputs has shape (n_instances, n_samples, n_outputs) and the return value has shape (n_instances,). Defaults to margin_sampling() for classifiers (models with predict_proba) and variance_conditional_expectation() for regressors.

  • metric – Performance metric evaluated on the test set each iteration. Accepts a string ("mse", "mae", "accuracy", "auc") or any callable (y_true, y_pred) -> float. Error metrics ("mse", "mae") are negated so that a higher score always indicates better performance. Defaults to negative MSE.

  • pool_size – Number of samples in the initial labeled set and the number of samples queried from the pool per iteration.

  • num_samples – Number of stochastic forward passes for models that support MC sampling via Sampler. Use 1 for deterministic models.

  • n_iterations – Maximum number of active learning iterations.

  • seed – Optional random seed for reproducible initial set selection and tie-breaking.
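Custom query_fn and metric callables matching the signatures documented above might look like this. Both functions are assumption-level stand-ins for illustration, not probly built-ins; note that string error metrics are negated for you, but a callable metric is presumably used as-is, so negate it yourself if you want higher-is-better.

```python
import numpy as np

def predictive_variance(outputs: np.ndarray) -> np.ndarray:
    """Query function: outputs has shape (n_instances, n_samples, n_outputs);
    returns one uncertainty score per instance (higher = more uncertain)."""
    # Variance over the MC-sample axis, averaged across outputs.
    return outputs.var(axis=1).mean(axis=-1)

def neg_rmse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Metric callable with the (y_true, y_pred) -> float signature,
    negated by hand so that higher means better."""
    return -float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
```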

Returns:

  • x_labeled – Final labeled features, shape (pool_size + n_iterations * pool_size, n_features).

  • y_labeled – Corresponding labels for the labeled set.

  • scores – Per-iteration test-set performance in higher-is-better convention (error metrics are negated).

  • normalized_auc – Normalized AUC of scores; 1.0 = best performance throughout, lower = slower to improve.

Return type:

tuple[ndarray, ndarray, list[float], float]
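One plausible reading of normalized_auc can be sketched as below. This is an assumption about the normalization, not probly's exact formula: the score curve is rescaled so the worst observed score maps to 0 and the best to 1, and the mean of the rescaled curve is taken, so a model at its best from the first iteration scores 1.0 and slow improvement pulls the value down.

```python
import numpy as np

def normalized_auc_sketch(scores):
    """Sketch of a normalized learning-curve AUC (assumed normalization)."""
    s = np.asarray(scores, dtype=float)
    lo, hi = s.min(), s.max()
    if np.isclose(hi, lo):
        # Flat curve: the model was at its best throughout.
        return 1.0
    # Rescale to [0, 1] and average; lower means slower to improve.
    return float(np.mean((s - lo) / (hi - lo)))
```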