
    Machine Learning As Tool And Theory For Computational Neuroscience

    Computational neuroscience is in the midst of constructing a new framework for understanding the brain based on the ideas and methods of machine learning. This effort has been encouraged, in part, by recent advances in neural network models. It is also driven by a recognition of the complexity of neural computation and the challenges that this poses for neuroscience’s methods. In this dissertation, I first describe the problems of complexity that have prompted this shift in focus. In particular, I develop machine learning tools for neurophysiology that help test whether tuning curves and other statistical models in fact capture the meaning of neural activity. Then, taking up a machine learning framework for understanding, I consider theories about how neural computation emerges from experience. Specifically, I develop hypotheses about the potential learning objectives of sensory plasticity, the potential learning algorithms in the brain, and finally the consequences for sensory representations of learning with such algorithms. These hypotheses pull from advances in several areas of machine learning, including optimization, representation learning, and deep learning theory. Each of these subfields has insights for neuroscience, offering up links for a chain of knowledge about how we learn and think. Together, these contributions further an understanding of the brain through the lens of machine learning.
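    As a toy illustration of the kind of statistical model the abstract refers to (not the dissertation's actual tooling), the sketch below fits a Gaussian tuning curve to simulated Poisson spike counts; all parameter names and values are hypothetical.

```python
# Generic Gaussian tuning-curve fit -- an illustration of the kind of
# statistical model scrutinized in the dissertation, not its actual method.
import numpy as np
from scipy.optimize import curve_fit

def gaussian_tuning(theta, r_max, theta_pref, width, baseline):
    """Mean firing rate as a function of stimulus orientation (degrees)."""
    return baseline + r_max * np.exp(-0.5 * ((theta - theta_pref) / width) ** 2)

rng = np.random.default_rng(0)
stimuli = np.linspace(-90, 90, 19)                       # stimulus orientations
true_rate = gaussian_tuning(stimuli, 30.0, 10.0, 20.0, 2.0)
counts = rng.poisson(true_rate)                          # simulated spike counts

params, _ = curve_fit(gaussian_tuning, stimuli, counts,
                      p0=[20.0, 0.0, 30.0, 1.0])
print("fitted (r_max, theta_pref, width, baseline):", params)
```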

    Curve sampling and geometric conditional simulation

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Includes bibliographical references (p. 195-203).

    The main objective of this thesis is the development and exploitation of techniques to generate geometric samples for the purpose of image segmentation. A sampling-based approach provides a number of benefits over existing optimization-based methods, such as robustness to noise and model error, characterization of segmentation uncertainty, natural handling of multi-modal distributions, and incorporation of partial segmentation information. This is important for applications which suffer from, e.g., low signal-to-noise ratio (SNR) or ill-posedness. We create a curve sampling algorithm using the Metropolis-Hastings Markov chain Monte Carlo (MCMC) framework. With this method, samples from a target distribution π (which can be evaluated but not sampled from directly) are generated by creating a Markov chain whose stationary distribution is π and sampling many times from a proposal distribution q. We define a proposal distribution using random Gaussian curve perturbations, and show how to ensure detailed balance and ergodicity of the chain so that iterates of the Markov chain asymptotically converge to samples from π. We visualize the resulting samples using techniques such as confidence bounds and principal modes of variation, and demonstrate the algorithm on examples such as prostate magnetic resonance (MR) images, brain MR images, and geological structure estimation using surface gravity observations. We generalize our basic curve sampling framework to perform conditional simulation: a portion of the solution space is specified, and the remainder is sampled conditioned on that information. For difficult segmentation problems which are currently done manually by human experts, reliable semi-automatic segmentation approaches can significantly reduce the amount of time and effort expended on a problem. We also extend our framework to 3D by creating a hybrid 2D/3D Markov chain surface model. For this approach, the nodes on the chain represent entire curves on parallel planes, and the slices combine to form a complete surface. Interaction among the curves is described by an undirected Markov chain, and we describe methods to sample from this model using both local Metropolis-Hastings methods and the embedded hidden Markov model (HMM) algorithm.

    by Ayres C. Fan, Ph.D.
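    For readers unfamiliar with the framework, a minimal Metropolis-Hastings sampler on a one-dimensional target is sketched below; the thesis applies the same accept/reject machinery to Gaussian perturbations of curves, which this scalar toy does not attempt. All names and constants are illustrative.

```python
# Minimal Metropolis-Hastings sampler -- the generic framework the thesis
# specializes to curves; a 1-D bimodal target stands in for the curve space.
import numpy as np

rng = np.random.default_rng(1)

def log_target(x):
    """Unnormalized log-density of a bimodal target: evaluable pointwise,
    but not directly sampleable -- the setting the abstract describes."""
    return np.logaddexp(-0.5 * (x - 2.0) ** 2, -0.5 * (x + 2.0) ** 2)

def metropolis_hastings(n_samples, step=1.0, x0=0.0):
    x, samples = x0, []
    for _ in range(n_samples):
        proposal = x + step * rng.normal()       # symmetric Gaussian proposal q
        log_alpha = log_target(proposal) - log_target(x)
        if np.log(rng.uniform()) < log_alpha:    # accept/reject step preserves detailed balance
            x = proposal
        samples.append(x)
    return np.array(samples)

chain = metropolis_hastings(10_000)
print("sample mean over both modes ~", chain.mean())
```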

    Neural density estimation and likelihood-free inference

    I consider two problems in machine learning and statistics: the problem of estimating the joint probability density of a collection of random variables, known as density estimation, and the problem of inferring model parameters when their likelihood is intractable, known as likelihood-free inference. The contribution of the thesis is a set of new methods for addressing these problems that are based on recent advances in neural networks and deep learning.

    The first part of the thesis is about density estimation. The joint probability density of a collection of random variables is a useful mathematical description of their statistical properties, but can be hard to estimate from data, especially when the number of random variables is large. Traditional density-estimation methods such as histograms or kernel density estimators are effective for a small number of random variables, but scale badly as the number increases. In contrast, models for density estimation based on neural networks scale better with the number of random variables, and can incorporate domain knowledge in their design. My main contribution is Masked Autoregressive Flow, a new model for density estimation based on a bijective neural network that transforms random noise to data. At the time of its introduction, Masked Autoregressive Flow achieved state-of-the-art results in general-purpose density estimation. Since its publication, Masked Autoregressive Flow has contributed to the broader understanding of neural density estimation, and has influenced subsequent developments in the field.

    The second part of the thesis is about likelihood-free inference. Typically, a statistical model can be specified either as a likelihood function that describes the statistical relationship between model parameters and data, or as a simulator that can be run forward to generate data. Specifying a statistical model as a simulator can offer greater modelling flexibility and can produce more interpretable models, but can also make inference of model parameters harder, as the likelihood of the parameters may no longer be tractable. Traditional techniques for likelihood-free inference such as approximate Bayesian computation rely on simulating data from the model, but often require a large number of simulations to produce accurate results. In this thesis, I cast the problem of likelihood-free inference as a density-estimation problem, and address it with neural density models. My main contribution is the introduction of two new methods for likelihood-free inference: Sequential Neural Posterior Estimation (Type A), which estimates the posterior, and Sequential Neural Likelihood, which estimates the likelihood. Both methods use a neural density model to estimate the posterior/likelihood, and a sequential training procedure to guide simulations. My experiments show that the proposed methods produce accurate results, and are often orders of magnitude faster than alternative methods based on approximate Bayesian computation.
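    The sketch below shows the autoregressive change-of-variables computation that underlies Masked Autoregressive Flow: each dimension is shifted and scaled conditioned on the preceding dimensions, and the log-density is the base log-density of the recovered noise plus the log-determinant of the inverse transform. The real model parameterizes the conditioners with a masked neural network (MADE); the linear conditioner here is a hypothetical stand-in.

```python
# Density computation behind a Masked Autoregressive Flow, with a toy
# conditioner in place of the masked neural network used in practice.
import numpy as np

def conditioner(x_prev):
    """Toy conditioner: returns (mu, log_scale) from preceding dimensions.
    Hypothetical linear dependence, standing in for a MADE network."""
    s = x_prev.sum()
    return 0.5 * s, 0.1 * s

def maf_log_density(x):
    u, log_det = np.empty_like(x), 0.0
    for i in range(len(x)):
        mu, alpha = conditioner(x[:i])
        u[i] = (x[i] - mu) * np.exp(-alpha)   # invert the flow: data -> noise
        log_det -= alpha                      # log |det Jacobian| of the inverse
    # base density: standard normal on the noise variables
    log_base = -0.5 * (u @ u) - 0.5 * len(x) * np.log(2 * np.pi)
    return log_base + log_det

print(maf_log_density(np.array([0.3, -1.2, 0.7])))
```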

    Robust Learning from Multiple Information Sources

    In the big data era, the ability to handle high-volume, high-velocity and high-variety information assets has become a basic requirement for data analysts. Traditional learning models, which focus on medium size, single source data, often fail to achieve reliable performance if data come from multiple heterogeneous sources (views). As a result, robust multi-view data processing methods that are insensitive to corruptions and anomalies in the data set are needed. This thesis develops robust learning methods for three problems that arise from real-world applications: robust training on a noisy training set, multi-view learning in the presence of between-view inconsistency, and network topology inference using partially observed data. The central theme behind all these methods is the use of information-theoretic measures, including entropies and information divergences, as parsimonious representations of uncertainties in the data, as robust optimization surrogates that allow for efficient learning, and as flexible and reliable discrepancy measures for data fusion. More specifically, the thesis makes the following contributions:

    1. We propose a maximum entropy-based discriminative learning model that incorporates the minimal entropy (ME) set anomaly detection technique. The resulting probabilistic model can perform both nonparametric classification and anomaly detection simultaneously. An efficient algorithm is then introduced to estimate the posterior distribution of the model parameters while selecting anomalies in the training data.

    2. We consider a multi-view classification problem on a statistical manifold where class labels are provided by probability density functions (p.d.f.s) and may not be consistent among different views due to the existence of noise corruption. A stochastic consensus-based multi-view learning model is proposed to fuse predictive information for multiple views together. By exploring the non-Euclidean structure of the statistical manifold, a joint consensus view is constructed that is robust to single-view noise corruption and between-view inconsistency.

    3. We present a method for estimating the parameters (partial correlations) of a Gaussian graphical model that learns a sparse sub-network topology from partially observed relational data. This model is applicable to the situation where the partial correlations between pairs of variables on a measured sub-network (internal data) are to be estimated when only summary information about the partial correlations between variables outside of the sub-network (external data) is available. The proposed model is able to incorporate the dependence structure between latent variables from external sources and perform latent feature selection efficiently. From a multi-view learning perspective, it can be seen as a two-view learning system given asymmetric information flow from both the internal view and the external view.

    Ph.D., Electrical & Computer Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/138599/1/tianpei_1.pd
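    As background for contribution 3, the snippet below uses the standard identity linking a Gaussian graphical model's precision matrix to partial correlations. The naive sample-precision estimate shown is only a baseline illustration, not the thesis's latent-variable estimator; the data are synthetic.

```python
# Partial correlations from a precision matrix: the standard identity
# rho_ij = -omega_ij / sqrt(omega_ii * omega_jj) behind Gaussian graphical
# models. Naive baseline, not the thesis's latent-variable method.
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 4))                    # toy data: 500 samples, 4 variables
X[:, 1] += 0.8 * X[:, 0]                        # induce a direct dependence

omega = np.linalg.inv(np.cov(X, rowvar=False))   # naive precision estimate
d = np.sqrt(np.diag(omega))
partial_corr = -omega / np.outer(d, d)           # off-diagonal entries are partial correlations
np.fill_diagonal(partial_corr, 1.0)
print(np.round(partial_corr, 2))
```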

    Exploring Practical Methodologies for the Characterization and Control of Small Quantum Systems

    We explore methodologies for characterizing and controlling small quantum systems. We are interested in starting with a description of a quantum system, designing estimators for parameters of the system, developing robust and high-fidelity gates for the system using knowledge of these parameters, and experimentally verifying the performance of these gates. A strong emphasis is placed on using rigorous statistical methods, especially Bayesian ones, to analyze quantum system data. Throughout this thesis, the nitrogen-vacancy (NV) system is used as an experimental testbed. Characterization of system parameters is done using quantum Hamiltonian learning, where we explore the use of adaptive experiment design to speed up learning rates. Gates for the full three-level system are designed with numerical optimal control methods that take into account imperfections of the control hardware. Gate quality is assessed using randomized benchmarking protocols, including standard randomized benchmarking, unitarity benchmarking, and leakage/loss benchmarking.
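    As a textbook illustration of the standard randomized benchmarking protocol named above (not the thesis's data or code), the sketch below fits the usual decay model F(m) = A·p^m + B to simulated survival probabilities and converts the decay parameter to an average gate error for a single qubit.

```python
# Fit the standard randomized-benchmarking decay F(m) = A * p**m + B to
# simulated survival probabilities; illustrative values throughout.
import numpy as np
from scipy.optimize import curve_fit

def rb_decay(m, A, p, B):
    return A * p ** m + B

rng = np.random.default_rng(3)
lengths = np.arange(1, 201, 10)                         # Clifford sequence lengths m
true = rb_decay(lengths, 0.45, 0.985, 0.5)
survival = true + 0.01 * rng.normal(size=lengths.size)  # noisy measurements

(A, p, B), _ = curve_fit(rb_decay, lengths, survival, p0=[0.5, 0.99, 0.5])
r = (1 - p) / 2       # average error per Clifford: (1 - p)(d - 1)/d with d = 2
print(f"decay p = {p:.4f}, average gate error r = {r:.2e}")
```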