
    Epistemic uncertainty quantification in deep learning classification by the Delta method

    The Delta method is a classical procedure for quantifying epistemic uncertainty in statistical models, but its direct application to deep neural networks is prevented by the large number of parameters P. We propose a low-cost approximation of the Delta method applicable to L2-regularized deep neural networks, based on the top K eigenpairs of the Fisher information matrix. We address efficient computation of full-rank approximate eigendecompositions in terms of the exact inverse Hessian, the inverse outer-products-of-gradients approximation, and the so-called Sandwich estimator. Moreover, we provide bounds on the approximation error for the uncertainty of the predictive class probabilities. We show that when the smallest computed eigenvalue of the Fisher information matrix is near the L2-regularization rate, the approximation error will be close to zero even when K ≪ P. A demonstration of the methodology is presented using a TensorFlow implementation, and we show that meaningful rankings of images based on predictive uncertainty can be obtained for two LeNet- and ResNet-based neural networks using the MNIST and CIFAR-10 datasets. Further, we observe that false positives have, on average, a higher predictive epistemic uncertainty than true positives. This suggests that the uncertainty measure contains supplementary information not captured by the classification alone.
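The low-rank construction described above can be sketched numerically. The snippet below is an illustrative numpy toy (the spiked spectrum, the dimensions, and all variable names are assumptions, not the paper's setup): it keeps the top K eigenpairs of a synthetic Fisher matrix exactly and replaces the tail of the spectrum by the L2-regularization rate, which is the regime where the abstract says the approximation error vanishes.

```python
import numpy as np

rng = np.random.default_rng(0)
P, K, lam = 500, 20, 1e-2   # parameters, eigenpairs kept, L2-regularization rate

# Hypothetical spiked spectrum: K informative directions, and a tail that the
# L2 penalty pins near lam (the regime where the approximation error vanishes).
Q, _ = np.linalg.qr(rng.standard_normal((P, P)))
spec = np.concatenate([np.geomspace(10.0, 1.0, K),
                       lam * (1.0 + 1e-3 * rng.random(P - K))])
F = (Q * spec) @ Q.T                      # synthetic Fisher information matrix

# Exact Delta-method variance of a prediction with gradient g: g^T F^{-1} g.
g = rng.standard_normal(P)
exact = g @ np.linalg.solve(F, g)

# Low-rank approximation: keep the top-K eigenpairs and replace the rest of
# the spectrum by lam, i.e. F^{-1} ~= V L^{-1} V^T + (I - V V^T) / lam.
vals, vecs = np.linalg.eigh(F)            # eigenvalues in ascending order
top_vals, top_vecs = vals[-K:], vecs[:, -K:]
proj = top_vecs.T @ g
approx = proj @ (proj / top_vals) + (g @ g - proj @ proj) / lam

print(exact, approx)
```

When the tail eigenvalues drift away from lam, the second term misprices them and the error grows, consistent with the stated condition on the smallest computed eigenvalue.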

    Epistemic Uncertainty Quantification in Deep Learning by the Delta Method

    This thesis explores the Delta method and its application to deep learning image classification. The Delta method is a classical procedure for quantifying uncertainty in statistical models, but its direct application to deep neural networks is prevented by the large number of parameters P. We recognize the Delta method as a measure of epistemic, as opposed to aleatoric, uncertainty and break it into two components: the eigenvalue spectrum of the inverse Fisher information (i.e. the inverse Hessian) of the cost function, and the per-example sensitivities (i.e. gradients) of the model function. We focus mainly on the computational aspects and show how to efficiently compute low- and full-rank approximations of the inverse Fisher information matrix, which in turn reduces the computational complexity of the naïve Delta method from O(P²) space and O(P³) time to O(P) space and time. We provide bounds on the approximation error by a novel error-propagation technique, and validate the developed methodology with a released TensorFlow implementation. By comparison with the classical Bootstrap, we show that there is a strong linear relationship between the quantified predictive epistemic uncertainty levels obtained from the two methods when applied to a few well-known architectures using the MNIST and CIFAR-10 datasets.
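The reported linear relationship between Delta-method and Bootstrap uncertainties can be illustrated on a model small enough to refit many times. This is a hedged toy with ordinary least squares standing in for a neural network (all names, sizes, and the data-generating process are made up): it compares the closed-form Delta variance of each prediction with a pairs-bootstrap variance.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, B = 200, 5, 500                      # examples, parameters, bootstrap refits
X = rng.standard_normal((n, p))
beta = rng.standard_normal(p)
y = X @ beta + rng.standard_normal(n)

# Delta-method epistemic variance of each training prediction:
# Var[x^T beta_hat] = sigma^2 * x^T (X^T X)^{-1} x.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat
sigma2 = resid @ resid / (n - p)
XtX_inv = np.linalg.inv(X.T @ X)
delta_var = sigma2 * np.einsum("ij,jk,ik->i", X, XtX_inv, X)

# Pairs bootstrap: refit on resampled rows, take the variance of predictions.
preds = np.empty((B, n))
for b in range(B):
    idx = rng.integers(0, n, n)
    bb, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
    preds[b] = X @ bb
boot_var = preds.var(axis=0)

r = np.corrcoef(delta_var, boot_var)[0, 1]
print(r)
```

For this linear-Gaussian toy the two uncertainty measures line up nearly perfectly; the thesis's point is that the same linear relationship persists for deep networks, where the bootstrap is far more expensive.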

    A universal approximate cross-validation criterion and its asymptotic distribution

    A general framework is considered in which the estimators of a distribution are obtained by minimizing one function (the estimating function) and are assessed through another function (the assessment function). The estimating and assessment functions generally estimate risks. A classical case is that both functions estimate an information risk (specifically, cross-entropy); in that case the Akaike information criterion (AIC) is relevant. In more general cases, the assessment risk can be estimated by leave-one-out cross-validation. Since leave-one-out cross-validation is computationally very demanding, an approximation formula can be very useful. A universal approximate cross-validation criterion (UACV) for leave-one-out cross-validation is given. This criterion can be adapted to different types of estimators, including penalized likelihood and maximum a posteriori estimators, and to different assessment risk functions, including information risk functions and the continuous ranked probability score (CRPS). The formula reduces to the Takeuchi information criterion (TIC) when cross-entropy is the risk for both estimation and assessment. The asymptotic distribution of UACV, and of a difference of UACVs, is given. UACV can be used for comparing estimators of the distributions of ordered categorical data derived from threshold models and from models based on continuous approximations. A simulation study and an analysis of real psychometric data are presented.
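As a concrete toy instance (not the UACV formula from the paper), a one-parameter Gaussian-mean model makes the agreement between exact leave-one-out cross-validation and a TIC-style corrected in-sample risk easy to check: the correction is tr(H⁻¹J)/n, with H the Hessian of the per-example loss and J the second moment of its gradient. Everything below is an assumption-laden sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.normal(loc=3.0, scale=1.5, size=n)

# Model: Gaussian with unknown mean; per-example estimating/assessment loss
# l_i(theta) = (x_i - theta)^2 / 2, minimized by the sample mean.
theta = x.mean()

# Exact leave-one-out cross-validation (cheap here via a closed form:
# the estimate with example i held out is (n*theta - x_i)/(n-1)).
theta_loo = (n * theta - x) / (n - 1)
loo = np.mean((x - theta_loo) ** 2) / 2

# TIC-style approximation: in-sample risk plus the trace correction
# tr(H^{-1} J) / n, with H the Hessian and J the gradient second moment.
grads = -(x - theta)                   # d l_i / d theta at theta
H = 1.0                                # d^2 l_i / d theta^2
J = np.mean(grads ** 2)
uacv = np.mean((x - theta) ** 2) / 2 + J / H / n

print(loo, uacv)
```

The two quantities agree to O(1/n²) here, which is the kind of accuracy that makes such criteria a practical substitute for refitting the model n times.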

    Iterative and Recursive Estimation in Structural Non-Adaptive Models

    An inference method, called latent backfitting, is proposed. It is well suited to econometric models in which the structural relationships of interest define the observed endogenous variables as a known function of unobserved state variables and unknown parameters. This nonlinear state-space specification paves the way for iterative or recursive EM-like strategies. In the E-steps the state variables are forecasted given the observations and a value of the parameters. In the M-steps these forecasts are used to deduce estimators of the unknown parameters from the statistical model of the latent variables. The proposed iterative/recursive estimation is particularly useful for latent regression models and for dynamic equilibrium models involving latent state variables. Practical implementation issues are discussed through the example of term-structure models of interest rates.
    Keywords: asset pricing models, latent variables, estimation, iterative or recursive algorithms.
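A minimal sketch of the E-step/M-step alternation, assuming an invented linear-Gaussian toy model (nothing below is from the paper): a latent state enters the observed endogenous variable through a single unknown parameter, the E-step forecasts the state given the data and the current parameter value, and the M-step re-estimates the parameter from those forecasts.

```python
import numpy as np

rng = np.random.default_rng(2)
n, theta_true = 2000, 1.7

# Toy structural model (illustrative only): latent state x_t ~ N(0, 1)
# drives the observed endogenous variable y_t = theta * x_t + e_t.
x = rng.standard_normal(n)
y = theta_true * x + rng.standard_normal(n)

theta = 0.5                                # initial parameter guess
for _ in range(200):
    # E-step: forecast the latent states given y and the current theta
    # (posterior mean and variance under the linear-Gaussian model).
    post_var = 1.0 / (1.0 + theta ** 2)
    x_hat = theta * y * post_var
    # M-step: re-estimate theta from the forecasted latent states,
    # accounting for their posterior variance.
    theta = np.sum(x_hat * y) / np.sum(x_hat ** 2 + post_var)

print(theta)
```

Even in this caricature the alternation recovers the structural parameter from observations of y alone; the paper's term-structure applications replace the linear map by a known nonlinear pricing function.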