Measuring Machine Learning Model Uncertainty with Applications to Aerial Segmentation

Abstract

Machine learning model performance on both validation data and new data can be better measured and understood by leveraging uncertainty metrics at the time of prediction. These metrics can improve the model training process by indicating which training data need to be corrected and which parts of the domain need further annotation. The methods described have yet to reach mainstream adoption but show great potential. Here, we survey the field of uncertainty metrics and provide a robust framework for their application to aerial segmentation. Uncertainty is divided into two types: aleatoric and epistemic. Aleatoric uncertainty arises from variations in training data and can be the result of poor training data or an inherently stochastic observation. Epistemic uncertainty arises from predicting on inputs that lie outside the distribution of the training data. Both measures inform the machine learning engineer about which parts of the data require better or additional training, and also help downstream processes quantify the usefulness of a prediction. We survey the current tools for measuring uncertainty, including the autoencoder for measuring epistemic uncertainty and the Bayesian neural network. The latter replaces each trained weight with a random variable, approximating the true, unknown distribution of each weight with a two-parameter (mean and variance) normal distribution. Bayes by Backprop trains these parameters by minimizing the Kullback–Leibler divergence between the approximating normal distribution and the unknown distribution. Our contribution is a novel application of the Bayesian neural network with Gaussian weights applied to the U-Net model for aerial segmentation. Using the DroneDeploy dataset, we build and train our Bayesian U-Net model and gather epistemic and aleatoric uncertainty metrics. Experimentally, we find that these metrics are correlated with model performance on unseen data and thus provide immediate value to a modeling and prediction workflow. We show the usefulness of these metrics for both per-pixel uncertainty estimation and per-image uncertainty estimation.
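
For reference, the Bayes by Backprop objective summarized above is commonly written as the variational free energy; the notation below is a standard formulation under the usual assumptions (Gaussian variational posterior with parameters θ = (μ, σ)), not necessarily the exact notation used in the paper:

\[
\mathcal{F}(\mathcal{D}, \theta) \;=\; \mathrm{KL}\!\left[\, q(\mathbf{w} \mid \theta) \,\|\, p(\mathbf{w}) \,\right] \;-\; \mathbb{E}_{q(\mathbf{w} \mid \theta)}\!\left[\, \log p(\mathcal{D} \mid \mathbf{w}) \,\right]
\]

Here \(q(\mathbf{w} \mid \theta)\) is the approximating normal distribution over the weights, \(p(\mathbf{w})\) is the prior, and \(p(\mathcal{D} \mid \mathbf{w})\) is the likelihood of the data; minimizing \(\mathcal{F}\) trades off closeness to the prior against fit to the training data.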
