338 research outputs found

    Contributions to MCMC Methods in Constrained Domains with Applications to Neuroimaging

    Markov chain Monte Carlo (MCMC) methods form a rich class of computational techniques that help users draw samples from target distributions when direct sampling is not possible or when closed forms are intractable. Over the years, MCMC methods have been used in innumerable situations due to their flexibility and generalizability, even in situations involving nonlinear and/or highly parametrized models. In this dissertation, two major works relating to MCMC methods are presented. The first involves the development of a method to identify the number and directions of nerve fibers using diffusion-weighted MRI measurements. For this, the biological problem is first formulated as a model selection and estimation problem. Using the framework of reversible jump MCMC, a novel Bayesian scheme is proposed that performs both tasks simultaneously using customizable priors and proposal distributions. The proposed method allows users to set a prior level of spatial separation between the nerve fibers, so that more crossing paths can be detected when desired, or fewer when only robust nerve tracts are of interest. Hence, estimation that is specific to a given region of interest within the brain can be performed. In simulated examples, the method has been shown to resolve up to four fibers even in instances of highly noisy data. Comparative analysis with other state-of-the-art methods on in-vivo data showed the method's ability to detect more crossing nerve fibers. The second work involves the construction of an MCMC algorithm that efficiently performs (Bayesian) sampling of parameters with support constraints. The method works by embedding a transformation called inversion in a sphere within the Metropolis-Hastings sampler. This creates an image of the constrained support that is amenable to sampling using standard proposals such as Gaussian. The proposed strategy is tested on three domains: the standard simplex, a sector of an n-sphere, and hypercubes. In each domain, a comparison is made with existing sampling techniques.
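For readers unfamiliar with the transformation named in this abstract, the following is a minimal sketch of the standard geometric inversion in a sphere. It only illustrates the map itself; how the dissertation embeds it in the Metropolis-Hastings sampler is not described in the abstract, and the function name `invert_in_sphere` and its parameters are assumptions made for illustration.

```python
import numpy as np

def invert_in_sphere(x, center, radius):
    """Map x to its inverse point with respect to the sphere (center, radius).

    The image y lies on the ray from the center through x and satisfies
    ||x - center|| * ||y - center|| = radius**2. The center itself has no image.
    """
    d = np.asarray(x, dtype=float) - np.asarray(center, dtype=float)
    return np.asarray(center, dtype=float) + (radius ** 2) * d / np.dot(d, d)

# A point inside the unit circle is sent outside it, and applying the map
# twice returns the original point (the map is an involution).
inside = np.array([0.5, 0.0])
outside = invert_in_sphere(inside, center=np.zeros(2), radius=1.0)  # [2.0, 0.0]
back = invert_in_sphere(outside, center=np.zeros(2), radius=1.0)    # [0.5, 0.0]
```

Because the map is its own inverse, moving between the original support and its image is cheap in both directions, which is one reason such a transformation can be convenient inside a sampler.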

    Advances in scalable learning and sampling of unnormalised models

    We study probabilistic models that are known incompletely, up to an intractable normalising constant. To reap the full benefit of such models, two tasks must be solved: learning and sampling. These two tasks have been subject to decades of research, and yet significant challenges still persist. Traditional approaches often suffer from poor scalability with respect to dimensionality and model complexity, generally rendering them inapplicable to models parameterised by deep neural networks. In this thesis, we contribute a new set of methods for addressing this scalability problem. We first explore the problem of learning unnormalised models. Our investigation begins with a well-known learning principle, Noise-contrastive Estimation, whose underlying mechanism is that of density-ratio estimation. By examining why existing density-ratio estimators scale poorly, we identify a new framework, telescoping density-ratio estimation (TRE), that can learn ratios between highly dissimilar densities in high-dimensional spaces. Our experiments demonstrate that TRE not only yields substantial improvements for the learning of deep unnormalised models, but can do the same for a broader set of tasks including mutual information estimation and representation learning. Subsequently, we explore the problem of sampling unnormalised models. A large literature on Markov chain Monte Carlo (MCMC) can be leveraged here, and in continuous domains, gradient-based samplers such as the Metropolis-adjusted Langevin algorithm (MALA) and Hamiltonian Monte Carlo are excellent options. However, there has been substantially less progress in MCMC for discrete domains. To advance this subfield, we introduce several discrete Metropolis-Hastings samplers that are conceptually inspired by MALA, and demonstrate their strong empirical performance across a range of challenging sampling tasks.
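As background for the samplers mentioned in this abstract, here is a minimal sketch of one step of the standard continuous-domain MALA, which needs only an unnormalised log density and its gradient. It is not the thesis's discrete sampler; the function name `mala_step`, the step-size parameter `h`, and the standard normal target in the usage lines are assumptions chosen for illustration.

```python
import numpy as np

def mala_step(x, log_p, grad_log_p, h, rng):
    """One Metropolis-adjusted Langevin step for an unnormalised log density.

    Proposal: x' ~ N(x + h * grad_log_p(x), 2h * I), followed by a
    Metropolis-Hastings accept/reject correction.
    """
    def log_q(to, frm):
        # Log density of the Gaussian proposal, up to a constant that
        # cancels in the acceptance ratio.
        diff = to - frm - h * grad_log_p(frm)
        return -np.dot(diff, diff) / (4.0 * h)

    noise = rng.standard_normal(x.shape)
    proposal = x + h * grad_log_p(x) + np.sqrt(2.0 * h) * noise
    log_alpha = (log_p(proposal) + log_q(x, proposal)
                 - log_p(x) - log_q(proposal, x))
    if np.log(rng.uniform()) < log_alpha:
        return proposal, True
    return x, False

# Usage: sample a 2-D standard normal known only up to its normalising constant.
rng = np.random.default_rng(0)
state = np.zeros(2)
chain = []
for _ in range(2000):
    state, accepted = mala_step(state, lambda v: -0.5 * np.dot(v, v),
                                lambda v: -v, h=0.5, rng=rng)
    chain.append(state)
```

The normalising constant never appears because only differences of log densities enter the acceptance ratio, which is what makes this family of samplers applicable to unnormalised models.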

    What does explainable AI explain?

    Machine Learning (ML) models are increasingly used in industry, as well as in scientific research and social contexts. Unfortunately, ML models provide only partial solutions to real-world problems, focusing on predictive performance in static environments. Problem aspects beyond prediction, such as robustness in employment, knowledge generation in science, or providing recourse recommendations to end-users, cannot be directly tackled with ML models. Explainable Artificial Intelligence (XAI) aims to solve, or at least highlight, problem aspects beyond predictive performance through explanations. However, the field is still in its infancy, as fundamental questions such as “What are explanations?”, “What constitutes a good explanation?”, or “How do explanation and understanding relate?” remain open. In this dissertation, I combine philosophical conceptual analysis and mathematical formalization to clarify a prerequisite of these difficult questions, namely what XAI explains: I point out that XAI explanations are either associative or causal, and aim to explain either the ML model or the modeled phenomenon. The thesis is a collection of five individual research papers that all aim to clarify how different problems in XAI are related to these different “whats”. In Paper I, my co-authors and I illustrate how to construct XAI methods for inferring associational phenomenon relationships. Paper II directly relates to the first; we formally show how to quantify the uncertainty of such scientific inferences for two XAI methods: partial dependence plots (PDP) and permutation feature importance (PFI). Paper III discusses the relationship between counterfactual explanations and adversarial examples; I argue that adversarial examples can be described as counterfactual explanations that alter the prediction but not the underlying target variable. In Paper IV, my co-authors and I argue that algorithmic recourse recommendations should help data-subjects improve their qualification rather than game the predictor. In Paper V, we address general problems with model-agnostic XAI methods and identify possible solutions.
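To make one of the XAI methods named in this abstract concrete, the following is a minimal sketch of permutation feature importance (PFI) in its common textbook form: the increase in a model's loss when one feature column is shuffled. It does not reproduce the uncertainty quantification developed in the thesis; the function name, the mean squared error loss, and the `n_repeats` and `seed` parameters are assumptions made for illustration.

```python
import numpy as np

def permutation_feature_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in squared-error loss when each feature is permuted."""
    rng = np.random.default_rng(seed)

    def mse(features):
        return float(np.mean((predict(features) - y) ** 2))

    baseline = mse(X)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling column j breaks its association with y and with
            # the other features while keeping its marginal distribution.
            X_perm[:, j] = rng.permutation(X[:, j])
            increases.append(mse(X_perm) - baseline)
        importances[j] = np.mean(increases)
    return importances

# Usage: a model that only uses the first of two features.
data_rng = np.random.default_rng(42)
X = data_rng.standard_normal((500, 2))
y = 3.0 * X[:, 0]
scores = permutation_feature_importance(lambda F: 3.0 * F[:, 0], X, y)
```

In this toy setup the second feature receives an importance of exactly zero because the model ignores it, while the first feature's score reflects how much the loss degrades once its values are scrambled.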

    Hierarchische Modelle für das visuelle Erkennen und Lernen von Objekten, Szenen und Aktivitäten (Hierarchical Models for the Visual Recognition and Learning of Objects, Scenes, and Activities)

    In many computer vision applications, objects have to be learned and recognized in images or image sequences. Most of these objects have a hierarchical structure. For example, 3D objects can be decomposed into object parts, and object parts, in turn, into geometric primitives. Furthermore, scenes are composed of objects. Activities or behaviors can likewise be divided hierarchically into actions, and these into individual movements, and so on. Hierarchical models are therefore ideally suited for the representation of a wide range of objects used in applications such as object recognition, human pose estimation, or activity recognition. In this work, new probabilistic hierarchical models are presented that allow an efficient representation of multiple objects of different categories, scales, rotations, and views. The idea is to exploit similarities between objects, object parts, or actions and movements in order to share calculations and avoid redundant information. We introduce online and offline learning methods that make it possible to build efficient hierarchies from small or large training datasets in which poses or articulated structures are given by examples. Furthermore, we present inference approaches for fast and robust detection. These new approaches combine the ideas of compositional and similarity hierarchies and overcome limitations of previous methods. They are used in a unified hierarchical framework, spatially for object recognition and spatiotemporally for activity recognition. The unified generic hierarchical framework allows us to apply the proposed models in different projects. Besides classical object recognition, it is used for the detection of human poses in a project for gait analysis. The activity detection is used in a project for the design of environments for ageing, to identify activities and behavior patterns in smart homes.
In a project for parking spot detection using an intelligent vehicle, the proposed approaches are used to hierarchically model the environment of the vehicle for an efficient and robust interpretation of the scene in real time.