
    Antithetic and Monte Carlo kernel estimators for partial rankings

    In the modern age, rankings data is ubiquitous and useful for a variety of applications such as recommender systems, multi-object tracking and preference learning. However, most rankings data encountered in the real world is incomplete, which prevents the direct application of existing modelling tools for complete rankings. Our contribution is a novel way to extend kernel methods for complete rankings to partial rankings, via consistent Monte Carlo estimators for Gram matrices: matrices of kernel values between pairs of observations. We also present a novel variance reduction scheme based on an antithetic variate construction between permutations to obtain an improved estimator for the Mallows kernel. The corresponding antithetic kernel estimator has lower variance, and we demonstrate empirically that it performs better in a variety of machine learning tasks. Both kernel estimators are based on extending kernel mean embeddings to the embedding of the set of full rankings consistent with an observed partial ranking. They form a computationally tractable alternative to previous approaches for partial rankings data. An overview of the existing kernels and metrics for permutations is also provided.
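    A minimal sketch of the Monte Carlo idea described above, assuming top-k partial rankings: each partial ranking is completed uniformly at random, and the Mallows kernel (based on the Kendall tau distance) is averaged over the sampled completions. The function names and the uniform-completion sampler are illustrative, and the antithetic variance-reduction construction from the paper is not shown.

```python
import itertools
import math
import random

def kendall_tau(sigma, tau):
    """Number of discordant item pairs between two full rankings (item orderings)."""
    pos_s = {item: i for i, item in enumerate(sigma)}
    pos_t = {item: i for i, item in enumerate(tau)}
    d = 0
    for a, b in itertools.combinations(list(pos_s), 2):
        if (pos_s[a] - pos_s[b]) * (pos_t[a] - pos_t[b]) < 0:
            d += 1
    return d

def mallows_kernel(sigma, tau, lam=0.1):
    """Mallows kernel between two full rankings."""
    return math.exp(-lam * kendall_tau(sigma, tau))

def sample_completion(partial, items, rng):
    """Uniform completion of a top-k partial ranking: keep the observed prefix,
    shuffle the unobserved items behind it."""
    rest = [x for x in items if x not in partial]
    rng.shuffle(rest)
    return list(partial) + rest

def mc_kernel_estimate(partial1, partial2, items, lam=0.1, n_samples=200, seed=0):
    """Plain Monte Carlo estimate of the kernel between two partial rankings:
    average the Mallows kernel over independently sampled completions."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        s1 = sample_completion(partial1, items, rng)
        s2 = sample_completion(partial2, items, rng)
        total += mallows_kernel(s1, s2, lam)
    return total / n_samples

items = list(range(6))
print(mc_kernel_estimate([0, 2], [2, 0], items))
```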

    Efficient Bayesian inference via Monte Carlo and machine learning algorithms

    In many fields of science and engineering, we are faced with an inverse problem in which we aim to recover an unobserved parameter or variable of interest from a set of observed variables. Bayesian inference is a probabilistic approach for inferring this unknown parameter that has become extremely popular, finding application in myriad problems in fields such as machine learning, signal processing, remote sensing and astronomy. In Bayesian inference, all the information about the parameter is summarized by the posterior distribution. Unfortunately, the study of the posterior distribution requires the computation of complicated integrals that are analytically intractable and need to be approximated. Monte Carlo is a large family of sampling algorithms for performing optimization and numerical integration that has become the main workhorse for carrying out Bayesian inference. The main idea of Monte Carlo is that we can approximate the posterior distribution by a set of samples, obtained by an iterative process that involves sampling from a known distribution. Markov chain Monte Carlo (MCMC) and importance sampling (IS) are two important groups of Monte Carlo algorithms.
    This thesis focuses on developing and analyzing Monte Carlo algorithms (either MCMC, IS, or a combination of both) under the different challenging scenarios presented below. In summary, in this thesis we address several important points, enumerated (a)–(f), that currently represent a challenge in Bayesian inference via Monte Carlo. A first challenge that we address is the problematic exploration of the parameter space by off-the-shelf MCMC algorithms when there is (a) multimodality or (b) a highly concentrated posterior. Another challenge that we address is (c) the proposal construction in IS. Furthermore, in recent applications we need to deal with (d) expensive posteriors, and/or we need to handle (e) noisy posteriors. Finally, the Bayesian framework also offers a way of comparing competing hypotheses (models) in a principled way by means of marginal likelihoods. Hence, a task of fundamental importance is (f) marginal likelihood computation.
    Chapters 2 and 3 deal with (a), (b), and (c). In Chapter 2, we propose a novel population MCMC algorithm called the Parallel Metropolis-Hastings Coupler (PMHC). PMHC is well suited to multimodal scenarios since it works with a population of states, instead of a single one, hence allowing information to be shared. PMHC combines independent exploration, through parallel Metropolis-Hastings algorithms, with cooperative exploration, through a population MCMC technique called the Normal Kernel Coupler. In Chapter 3, population MCMC is combined with IS within the layered adaptive IS (LAIS) framework. The combination of MCMC and IS serves two purposes: first, automatic proposal construction; second, increased robustness, since the MCMC samples are not used directly to form the sample approximation of the posterior. The use of minibatches of data is proposed to deal with highly concentrated posteriors. Other extensions for reducing the cost with respect to the vanilla LAIS framework, based on recycling and clustering, are discussed and analyzed.
    Chapters 4, 5 and 6 deal with (c), (d) and (e). The use of nonparametric approximations of the posterior plays an important role in the design of efficient Monte Carlo algorithms. Nonparametric approximations of the posterior can be obtained using machine learning algorithms for nonparametric regression, such as Gaussian processes and nearest neighbors. They can then serve as cheap surrogate models, or for building efficient proposal distributions. In Chapter 4, in the context of expensive posteriors, we propose adaptive quadratures of posterior expectations and the marginal likelihood using a sequential algorithm that builds and refines a nonparametric approximation of the posterior. In Chapter 5, we propose Regression-based Adaptive Deep Importance Sampling (RADIS), an adaptive IS algorithm that uses a nonparametric approximation of the posterior as the proposal distribution. We illustrate the proposed algorithms in applications from astronomy and remote sensing. Chapters 4 and 5 consider noiseless posterior evaluations for building the nonparametric approximations. More generally, in Chapter 6 we give an overview and classification of MCMC and IS schemes using surrogates built with noisy evaluations. The motivation here is the study of posteriors that are both costly and noisy. The classification reveals a connection between algorithms that use the posterior approximation as a cheap surrogate and algorithms that use it for building an efficient proposal. We illustrate specific instances of the classified schemes in a reinforcement learning application. Finally, in Chapter 7 we study noisy IS, namely IS when the posterior evaluations are noisy, and derive optimal proposal distributions for the different estimators in this setting.
    Chapter 8 deals with (f): we provide an exhaustive review of methods for marginal likelihood computation, with special focus on those based on Monte Carlo. We derive many connections among the methods and compare them in several simulation setups. Finally, in Chapter 9 we summarize the contributions of this thesis and discuss some potential avenues of future research.
    Mención Internacional en el título de doctor. Programa de Doctorado en Ingeniería Matemática por la Universidad Carlos III de Madrid. Presidente: Valero Laparra Pérez-Muelas. Secretario: Michael Peter Wiper. Vocal: Omer Deniz Akyildi
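    As a generic illustration of the importance sampling machinery that underpins several of the chapters above (not any specific algorithm from the thesis, such as PMHC, LAIS or RADIS), the following sketch uses a wide Gaussian proposal to form a self-normalized estimate of a posterior expectation and a plain Monte Carlo estimate of the normalizing constant of a toy unnormalized posterior.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_unnorm_post(theta):
    """Toy unnormalized log-posterior: Gaussian likelihood times Gaussian prior
    (normalizing constants of both are deliberately omitted)."""
    log_lik = -0.5 * np.sum((theta - 1.0) ** 2, axis=-1)
    log_prior = -0.5 * np.sum(theta ** 2, axis=-1)
    return log_lik + log_prior

def importance_sampling(n=10_000, dim=2, scale=2.0):
    # Proposal: zero-mean isotropic Gaussian with standard deviation `scale`.
    theta = rng.normal(0.0, scale, size=(n, dim))
    log_q = (-0.5 * np.sum((theta / scale) ** 2, axis=1)
             - dim * np.log(scale * np.sqrt(2 * np.pi)))
    log_w = log_unnorm_post(theta) - log_q
    w = np.exp(log_w - log_w.max())                     # numerically stabilized weights
    Z_hat = np.mean(w) * np.exp(log_w.max())            # estimate of the normalizing constant
    post_mean = (w[:, None] * theta).sum(0) / w.sum()   # self-normalized estimate of E[theta | data]
    return Z_hat, post_mean

Z_hat, post_mean = importance_sampling()
print(Z_hat, post_mean)
```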

    Efficient XAI Techniques: A Taxonomic Survey

    Recently, there has been a growing demand for the deployment of Explainable Artificial Intelligence (XAI) algorithms in real-world applications. However, traditional XAI methods typically suffer from high computational complexity, which hinders their deployment in real-time systems that must meet the timing requirements of real-world scenarios. Although many approaches have been proposed to improve the efficiency of XAI methods, a comprehensive understanding of the achievements and challenges is still needed. To this end, in this paper we provide a review of efficient XAI. Specifically, we categorize existing techniques of XAI acceleration into efficient non-amortized and efficient amortized methods. Efficient non-amortized methods focus on data-centric or model-centric acceleration for each individual instance. In contrast, amortized methods focus on learning a unified distribution of model explanations, following predictive, generative, or reinforcement frameworks, to rapidly derive multiple model explanations. We also analyze the limitations of an efficient XAI pipeline from the perspectives of the training phase, the deployment phase, and the use scenarios. Finally, we summarize the open challenges: deploying XAI acceleration methods in real-world scenarios, overcoming the trade-off between faithfulness and efficiency, and selecting among the different acceleration methods.
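    To make the non-amortized/amortized distinction concrete, here is a toy sketch (not any specific method from the survey): a deliberately slow per-instance attribution is computed offline on training data, and a small regression model is trained to map inputs directly to attributions, so that explanations at deployment time require only a single forward pass. All model and function names are illustrative.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = 3 * X[:, 0] - 2 * X[:, 3] + rng.normal(scale=0.1, size=500)

model = Ridge().fit(X, y)                       # black-box model to explain

def slow_attribution(x):
    """Non-amortized explainer: effect of zeroing each feature (one model call per feature)."""
    base = model.predict(x[None])[0]
    attrs = []
    for j in range(len(x)):
        x_masked = x.copy()
        x_masked[j] = 0.0
        attrs.append(base - model.predict(x_masked[None])[0])
    return np.array(attrs)

# Offline: run the slow explainer once per training instance to build targets.
E = np.array([slow_attribution(x) for x in X])

# Amortized explainer: learn a direct input -> attribution mapping.
explainer = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
explainer.fit(X, E)

x_new = rng.normal(size=5)
print("slow:", slow_attribution(x_new))
print("amortized:", explainer.predict(x_new[None])[0])
```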

    BaCO: A Fast and Portable Bayesian Compiler Optimization Framework

    We introduce the Bayesian Compiler Optimization framework (BaCO), a general purpose autotuner for modern compilers targeting CPUs, GPUs, and FPGAs. BaCO provides the flexibility needed to handle the requirements of modern autotuning tasks. In particular, it deals with permutation, ordered, and continuous parameter types along with both known and unknown parameter constraints. To reason about these parameter types and efficiently deliver high-quality code, BaCO uses Bayesian optimization algorithms specialized for the autotuning domain. We demonstrate BaCO's effectiveness on three modern compiler systems: TACO, RISE & ELEVATE, and HPVM2FPGA, for CPUs, GPUs, and FPGAs respectively. For these domains, BaCO outperforms current state-of-the-art autotuners, delivering on average 1.36x-1.56x faster code with a tiny search budget, and BaCO is able to reach expert-level performance 2.9x-3.9x faster.
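    The following sketch illustrates the general surrogate-based autotuning loop that such a framework builds on; it is not BaCO's actual interface. A toy mixed configuration space (a tile size, an unroll flag and an ordered schedule choice) is searched with a Gaussian-process surrogate and an expected-improvement criterion over randomly generated candidates that satisfy an example known constraint. The cost function stands in for compiling and benchmarking a configuration.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)

def runtime(cfg):
    """Toy stand-in for compiling and benchmarking a configuration."""
    tile, unroll, sched = cfg
    return (np.log2(tile) - 5) ** 2 + (0.0 if unroll else 0.5) + 0.3 * sched

def sample_candidates(n):
    """Random feasible configurations: (power-of-two tile, unroll flag, ordered schedule id)."""
    cands = []
    while len(cands) < n:
        cfg = (2 ** rng.integers(2, 10), bool(rng.integers(2)), int(rng.integers(3)))
        if cfg[0] <= 256 or cfg[1]:              # example "known constraint"
            cands.append(cfg)
    return cands

def encode(cfg):
    return [np.log2(cfg[0]), float(cfg[1]), float(cfg[2])]

history = sample_candidates(5)
costs = [runtime(c) for c in history]

for _ in range(20):
    gp = GaussianProcessRegressor(normalize_y=True).fit([encode(c) for c in history], costs)
    cands = sample_candidates(200)
    mu, sd = gp.predict([encode(c) for c in cands], return_std=True)
    best = min(costs)
    z = (best - mu) / np.maximum(sd, 1e-9)
    ei = (best - mu) * norm.cdf(z) + sd * norm.pdf(z)   # expected improvement (minimization)
    nxt = cands[int(np.argmax(ei))]
    history.append(nxt)
    costs.append(runtime(nxt))

print("best configuration:", history[int(np.argmin(costs))], min(costs))
```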

    Stochastic Methods for Fine-Grained Image Segmentation and Uncertainty Estimation in Computer Vision

    In this dissertation, we exploit concepts from probability theory, stochastic methods and machine learning to address three existing limitations of deep learning-based models for image understanding. First, although convolutional neural networks (CNNs) have substantially improved the state of the art in image understanding, conventional CNNs provide segmentation masks that poorly adhere to object boundaries, a critical limitation for many potential applications. Second, training deep learning models requires large amounts of carefully selected and annotated data, but large-scale annotation of image segmentation datasets is often prohibitively expensive. And third, conventional deep learning models also lack the capability of uncertainty estimation, which compromises both decision making and model interpretability. To address these limitations, we introduce the Region Growing Refinement (RGR) algorithm, an unsupervised post-processing algorithm that exploits Monte Carlo sampling and pixel similarities to propagate high-confidence labels into regions of low-confidence classification. The probabilistic Region Growing Refinement (pRGR) provides RGR with a rigorous mathematical foundation that exploits concepts from Bayesian estimation and variance reduction techniques. Experiments demonstrate both the effectiveness of (p)RGR for refining segmentation predictions and its suitability for uncertainty estimation, since the variance estimates obtained in its Monte Carlo iterations are highly correlated with segmentation accuracy. We also introduce FreeLabel, an intuitive open-source web interface that exploits RGR to allow users to obtain high-quality segmentation masks with just a few freehand scribbles, in a matter of seconds. Designed to benefit the computer vision community, FreeLabel can be used for both crowdsourced and private annotation, and it has a modular structure that can be easily adapted to any image dataset. The practical relevance of the methods developed in this dissertation is illustrated through applications in agricultural and healthcare-related domains. We have combined RGR and modern CNNs for fine segmentation of fruit flowers, motivated by the importance of automated bloom intensity estimation for optimizing fruit orchard management and, possibly, automating procedures such as flower thinning and pollination. We also exploited an early version of FreeLabel to annotate novel datasets for segmentation of fruit flowers, which are now publicly available. Finally, this dissertation also describes work on fine segmentation and gaze estimation for images collected in assisted living environments, with the ultimate goal of assisting geriatricians in evaluating the health status of patients in such facilities.
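    A loose sketch of the Monte Carlo label-propagation idea behind RGR, not the exact algorithm: in each iteration, confident foreground and background pixels are subsampled as seeds, every pixel adopts the label of its nearest seed in a joint color-plus-position feature space, and the per-pixel mean and variance across iterations give a refined mask and an uncertainty estimate. All names and parameters are illustrative, and the inputs below are random toy data.

```python
import numpy as np

rng = np.random.default_rng(0)

def refine(image, fg_prob, hi=0.9, lo=0.1, iters=30, seed_frac=0.3, spatial_w=0.5):
    """Monte Carlo nearest-seed label propagation over a coarse foreground map."""
    h, w, _ = image.shape
    yy, xx = np.mgrid[0:h, 0:w]
    coords = spatial_w * np.stack([yy, xx], -1).reshape(-1, 2) / max(h, w)
    feats = np.concatenate([image.reshape(-1, 3), coords], axis=1)
    fg_idx = np.flatnonzero(fg_prob.ravel() > hi)      # confident foreground pixels
    bg_idx = np.flatnonzero(fg_prob.ravel() < lo)      # confident background pixels
    votes = np.zeros((iters, h * w))
    for t in range(iters):
        f = rng.choice(fg_idx, max(1, int(seed_frac * len(fg_idx))), replace=False)
        b = rng.choice(bg_idx, max(1, int(seed_frac * len(bg_idx))), replace=False)
        seeds = np.concatenate([f, b])
        labels = np.concatenate([np.ones(len(f)), np.zeros(len(b))])
        d = ((feats[:, None, :] - feats[seeds][None, :, :]) ** 2).sum(-1)
        votes[t] = labels[d.argmin(1)]                 # nearest-seed label for every pixel
    return votes.mean(0).reshape(h, w), votes.var(0).reshape(h, w)

image = rng.random((24, 24, 3))
coarse = np.zeros((24, 24))
coarse[6:18, 6:18] = 0.95                              # toy coarse segmentation
refined, uncertainty = refine(image, coarse)
print(refined.shape, uncertainty.max())
```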

    Essays in Likelihood-Based Computational Econometrics

    "The theory of probabilities is basically only common sense reduced to a calculus." (Pierre Simon Laplace, 1812)
    The quote above is from Pierre Simon Laplace's introduction to his seminal work Théorie analytique des probabilités, in which he lays the groundwork for what is currently known as Bayesian analysis. He proceeds to describe probability theory, and statistical inference, as a method that makes one "estimate accurately what right-minded people feel by a sort of instinct, often without being able to give a reason for it" (translation from French: Dale, 1995). This statement contains a profound truth and insight: probability theory offers a clean and simple recipe for reasoning under uncertainty, which I found eye-opening when I first learned about it. As my knowledge of probability theory increased, however, I also realized that in isolation this quote makes things appear much simpler than they actually are: reducing common sense to a calculus is extremely difficult to do well in practice. Translating our common sense into the language of probabilities takes a lot of practice, and, if done accurately, it often leads to a calculus without any exact solutions. It is therefore the task of statisticians and econometricians to find practical ways of reducing our common sense to calculus, and to devise smart new methods for efficiently carrying out the resulting calculations. This work represents my contribution towards these goals.

    Dependence: From classical copula modeling to neural networks

    The development of tools to measure and to model dependence in high-dimensional data is of great interest in a wide range of applications including finance, risk management, bioinformatics and environmental sciences. The copula framework, which allows us to extricate the underlying dependence structure of any multivariate distribution from its univariate marginals, has garnered growing popularity over the past few decades. Within the broader context of this framework, we develop several novel statistical methods and tools for analyzing, interpreting and modeling dependence. In the first half of this thesis, we advance classical copula modeling by introducing new dependence measures and parametric dependence models. To that end, we propose a framework for quantifying dependence between random vectors. Using the notion of a collapsing function, we summarize random vectors by single random variables, referred to as collapsed random variables. In the context of this collapsing function framework, we develop various tools to characterize the dependence between random vectors including new measures of association computed from the collapsed random variables, asymptotic results required to construct confidence intervals for these measures, collapsed copulas to analytically summarize the dependence for certain collapsing functions and a graphical assessment of independence between groups of random variables. We explore several suitable collapsing functions in theoretical and empirical settings. To showcase tools derived from our framework, we present data applications in bioinformatics and finance. Furthermore, we contribute to the growing literature on parametric copula modeling by generalizing the class of Archimax copulas (AXCs) to hierarchical Archimax copulas (HAXCs). AXCs are typically used to model the dependence at non-extreme levels while accounting for any asymptotic dependence between extremes. HAXCs then enhance the flexibility of AXCs by their ability to model partial asymmetries. We explore two ways of inducing hierarchies. Furthermore, we present various examples of HAXCs along with their stochastic representations, which are used to establish corresponding sampling algorithms. While the burgeoning research on the construction of parametric copulas has yielded some powerful tools for modeling dependence, the flexibility of these models is already limited in moderately high dimensions and they can often fail to adequately characterize complex dependence structures that arise in real datasets. In the second half of this thesis, we explore utilizing generative neural networks instead of parametric dependence models. In particular, we investigate the use of a type of generative neural network known as the generative moment matching network (GMMN) for two critical dependence modeling tasks. First, we demonstrate how GMMNs can be utilized to generate quasi-random samples from a large variety of multivariate distributions. These GMMN quasi-random samples can then be used to obtain low-variance estimates of quantities of interest. Compared to classical parametric copula methods for multivariate quasi-random sampling, GMMNs provide a more flexible and universal approach. Moreover, we theoretically and numerically corroborate the variance reduction capabilities of GMMN randomized quasi-Monte Carlo estimators. 
Second, we propose a GMMN-GARCH approach for modeling dependent multivariate time series, where ARMA-GARCH models are utilized to capture the temporal dependence within each univariate marginal time series and GMMNs are used to model the underlying cross-sectional dependence. If the number of marginal time series is large, we embed an intermediate dimension reduction step within our framework. The primary objective of our proposed approach is to produce empirical predictive distributions (EPDs), also known as probabilistic forecasts. In turn, these EPDs are also used to forecast certain risk measures, such as value-at-risk. Furthermore, in the context of modeling yield curves and foreign exchange rate returns, we show that the flexibility of our GMMN-GARCH models leads to better EPDs and risk-measure forecasts, compared to classical copula-GARCH models.
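    For context, the classical-copula baseline that the GMMN approach generalizes can be sketched as follows (this is not the GMMN itself): scrambled Sobol points are pushed through a Gaussian copula with correlation matrix R, and the resulting low-discrepancy dependent sample is used for a randomized quasi-Monte Carlo estimate of a joint tail probability. The correlation matrix and the tail event are illustrative choices.

```python
import numpy as np
from scipy.stats import norm, qmc

R = np.array([[1.0, 0.7, 0.4],
              [0.7, 1.0, 0.5],
              [0.4, 0.5, 1.0]])
L = np.linalg.cholesky(R)

def gaussian_copula_qmc(n, seed):
    """Quasi-random sample from a Gaussian copula via scrambled Sobol points."""
    u = qmc.Sobol(d=R.shape[0], scramble=True, seed=seed).random(n)
    z = norm.ppf(np.clip(u, 1e-12, 1 - 1e-12))   # quasi-random standard normals
    x = z @ L.T                                   # impose the correlation structure
    return norm.cdf(x)                            # dependent uniforms (copula sample)

# Randomized QMC estimate (and spread over scramblings) of P(all margins exceed 0.95).
estimates = [np.mean(np.all(gaussian_copula_qmc(2**12, seed=s) > 0.95, axis=1))
             for s in range(20)]
print(np.mean(estimates), np.std(estimates))
```

    Averaging over several independently scrambled Sobol sequences, as above, is what makes the quasi-Monte Carlo estimator randomized, so its variability across scramblings gives a simple error estimate.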