A trust-region method for stochastic variational inference with applications to streaming data
Stochastic variational inference allows for fast posterior inference in
complex Bayesian models. However, the algorithm is prone to local optima which
can make the quality of the posterior approximation sensitive to the choice of
hyperparameters and initialization. We address this problem by replacing the
natural gradient step of stochastic variational inference with a trust-region
update. We show that this leads to generally better results and reduced
sensitivity to hyperparameters. We also describe a new strategy for variational
inference on streaming data and show that here our trust-region method is
crucial for getting good performance.
Comment: in Proceedings of the 32nd International Conference on Machine
Learning, 201
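For orientation, the natural gradient step that the abstract above proposes to replace can be sketched for a toy conjugate model. This is a minimal illustration only, not the paper's trust-region method: the model (x_i ~ N(mu, 1) with prior mu ~ N(0, 1)), the function name svi_gaussian_mean, and the step-size schedule rho_t = (t + tau)^(-kappa) are all assumptions chosen for the example.

```python
import random


def svi_gaussian_mean(data, batch_size=10, steps=2000, tau=1.0, kappa=0.7, seed=0):
    """Baseline SVI for q(mu), assuming x_i ~ N(mu, 1) and prior mu ~ N(0, 1).

    Hypothetical toy example: it shows the plain natural gradient step,
    not the trust-region update described in the abstract.
    """
    rng = random.Random(seed)
    n = len(data)
    # Natural parameters of q(mu): [precision * mean, precision].
    # Initialise at the prior N(0, 1).
    lam = [0.0, 1.0]
    for t in range(1, steps + 1):
        batch = rng.sample(data, batch_size)
        # Pretend the minibatch is the whole data set, replicated n / batch_size times.
        scale = n / batch_size
        lam_hat = [scale * sum(batch), 1.0 + n]
        # For conjugate models the natural gradient step is a convex combination.
        rho = (t + tau) ** (-kappa)
        lam = [(1.0 - rho) * old + rho * new for old, new in zip(lam, lam_hat)]
    mean = lam[0] / lam[1]
    variance = 1.0 / lam[1]
    return mean, variance
```

In this toy setting the exact posterior is available in closed form (mean sum(x) / (n + 1), variance 1 / (n + 1)), so the stochastic estimate can be checked directly. The sensitivity to the schedule parameters tau and kappa is the kind of hyperparameter dependence the abstract refers to.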
DIVA: A Dirichlet Process Based Incremental Deep Clustering Algorithm via Variational Auto-Encoder
Generative model-based deep clustering frameworks excel in classifying
complex data, but are limited in handling dynamic and complex features because
they require prior knowledge of the number of clusters. In this paper, we
propose a nonparametric deep clustering framework that employs an infinite
mixture of Gaussians as a prior. Our framework utilizes a memoized online
variational inference method that enables the "birth" and "merge" moves of
clusters, allowing our framework to cluster data in a "dynamic-adaptive"
manner, without requiring prior knowledge of the number of features. We name
the framework DIVA, a Dirichlet Process-based Incremental deep clustering
framework via Variational Auto-Encoder. Our framework, which outperforms
state-of-the-art baselines, exhibits superior performance in classifying
complex data with dynamically changing features, particularly in the case of
incremental features. We released our source code implementation at:
https://github.com/Ghiara/diva
Comment: update supplementary material
Streaming, Distributed Variational Inference for Bayesian Nonparametrics
This paper presents a methodology for creating streaming, distributed
inference algorithms for Bayesian nonparametric (BNP) models. In the proposed
framework, processing nodes receive a sequence of data minibatches, compute a
variational posterior for each, and make asynchronous streaming updates to a
central model. In contrast to previous algorithms, the proposed framework is
truly streaming, distributed, asynchronous, learning-rate-free, and
truncation-free. The key challenge in developing the framework, arising from
the fact that BNP models do not impose an inherent ordering on their
components, is finding the correspondence between minibatch and central BNP
posterior components before performing each update. To address this, the paper
develops a combinatorial optimization problem over component correspondences,
and provides an efficient solution technique. The paper concludes with an
application of the methodology to the DP mixture model, with experimental
results demonstrating its practical scalability and performance.
Comment: This paper was presented at NIPS 2015. Please use the following
BibTeX citation: @inproceedings{Campbell15_NIPS, Author = {Trevor Campbell
and Julian Straub and John W. {Fisher III} and Jonathan P. How}, Title =
{Streaming, Distributed Variational Inference for Bayesian Nonparametrics},
Booktitle = {Advances in Neural Information Processing Systems (NIPS)}, Year
= {2015}
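The correspondence problem described in the abstract above can be illustrated with a deliberately simplified stand-in. The sketch below is not the paper's optimization problem or its solution technique: it assumes each component is summarised by a mean vector, uses squared distance as the matching cost, charges a fixed hypothetical penalty new_cost for opening a new central component, and searches by brute force, which is only feasible for a handful of components.

```python
from itertools import product


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def match_components(central, batch, new_cost):
    """Match minibatch component means to central component means.

    Returns (cost, assignment), where assignment[j] is the index of the
    central component matched to batch component j, or None if batch
    component j should become a new central component.
    Illustrative brute force; the cost function is an assumption.
    """
    options = list(range(len(central))) + [None]
    best_cost, best_assignment = float("inf"), None
    for assignment in product(options, repeat=len(batch)):
        used = [a for a in assignment if a is not None]
        if len(used) != len(set(used)):
            continue  # a central component may absorb at most one batch component
        cost = sum(
            new_cost if a is None else sq_dist(central[a], batch[j])
            for j, a in enumerate(assignment)
        )
        if cost < best_cost:
            best_cost, best_assignment = cost, assignment
    return best_cost, best_assignment
```

The need for such a step comes from the lack of component ordering: the minibatch posterior may list the same clusters in a different order than the central posterior, or contain clusters the central model has not yet seen.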
Demand for Medical Care by the Elderly: A Nonparametric Variational Bayesian Mixture Approach
Outpatient care is a large share of total health care spending, making analysis of data on outpatient utilization an important part of understanding patterns and drivers of health care spending growth. Common features of outpatient utilization measures include zero-inflation, over-dispersion, and skewness, all of which complicate statistical modeling. Mixture modeling is a popular approach because it can accommodate these features of health care utilization data. In this work, we add a nonparametric clustering component to such models. Our fully Bayesian model framework allows for an unknown number of mixing components, so that the data, rather than the researcher, determine the number of mixture components. We apply the modeling framework to data on visits to physicians by elderly individuals and show that each subgroup has different characteristics that allow easy interpretation and new insights.
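To make the terms zero-inflation and over-dispersion concrete, here is a small sketch of a zero-inflated Poisson distribution, one common two-component mixture for count data. It is an assumed illustration, not the model used in the work above; the function names and the parameters pi (probability of a structural zero) and lam (Poisson rate) are hypothetical.

```python
import math


def zip_pmf(k, pi, lam):
    """P(X = k) for a zero-inflated Poisson: a point mass at zero with
    weight pi mixed with a Poisson(lam) with weight 1 - pi."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    structural_zero = pi if k == 0 else 0.0
    return structural_zero + (1.0 - pi) * poisson


def zip_mean_var(pi, lam):
    """Closed-form mean and variance of the zero-inflated Poisson."""
    mean = (1.0 - pi) * lam
    variance = mean * (1.0 + pi * lam)
    return mean, variance
```

Whenever pi > 0 the variance exceeds the mean, which is the over-dispersion that a plain Poisson model cannot capture.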
MULTIVARIATE MODELING OF COGNITIVE PERFORMANCE AND CATEGORICAL PERCEPTION FROM NEUROIMAGING DATA
State-of-the-art cognitive neuroscience mainly uses hypothesis-driven statistical testing to characterize and model neural disorders and diseases. While such techniques have proven to be powerful in understanding diseases and disorders, they are inadequate in explaining causal relationships as well as individuality and variations. In this study, we proposed multivariate data-driven approaches for predictive modeling of cognitive events and disorders. We developed network descriptions of both structural and functional connectivities that are critical in multivariate modeling of cognitive performance (i.e., fluency, attention, and working memory) and categorical perceptions (i.e., emotion, speech perception). We also performed dynamic network analysis on brain connectivity measures to determine the role of different functional areas in relation to categorical perceptions and cognitive events. Our empirical studies of structural connectivity were performed using Diffusion Tensor Imaging (DTI). The main objective was to discover the role of structural connectivity in selecting clinically interpretable features that are consistent over a large range of model parameters in classifying cognitive performances in relation to Acute Lymphoblastic Leukemia (ALL). The proposed approach substantially improved accuracy (13% - 26%) over existing models and also selected a relevant, small subset of features that were verified by domain experts. In summary, the proposed approach produced interpretable models with better generalization. Functional connectivity is related to similar patterns of activation in different brain regions regardless of the apparent physical connectedness of the regions. The proposed data-driven approach to the source-localized electroencephalogram (EEG) data includes an array of tools such as graph mining, feature selection, and multivariate analysis to determine the functional connectivity in categorical perceptions.
We used the network description to correctly classify listeners' behavioral responses with an accuracy over 92% on 35 participants. State-of-the-art network descriptions of the human brain assume static connectivities. However, brain networks in relation to perception and cognition are complex and dynamic. Analysis of transient functional networks with spatiotemporal variations to understand cognitive functions remains challenging. One of the critical missing links is the lack of sophisticated methodologies for understanding dynamic neural activity patterns. We proposed a clustering-based complex dynamic network analysis on source-localized EEG data to understand the commonality and differences in gender-specific emotion processing. We also adopted a Bayesian nonparametric framework for segmenting neural activity with a finite number of microstates. This approach enabled us to find the default network and transient pattern of the underlying neural mechanism in relation to categorical perception. In summary, multivariate and dynamic network analysis methods developed in this dissertation to analyze structural and functional connectivities will have a far-reaching impact on computational neuroscience to identify meaningful changes in spatiotemporal brain activities.
Simple approximate MAP inference for Dirichlet processes mixtures
The Dirichlet process mixture model (DPMM) is a ubiquitous, flexible Bayesian nonparametric statistical model. However, full probabilistic inference in this model is analytically intractable, so that computationally intensive techniques such as Gibbs sampling are required. As a result, DPMM-based methods, which have considerable potential, are restricted to applications in which computational resources and time for inference are plentiful. For example, they would not be practical for digital signal processing on embedded hardware, where computational resources are at a serious premium. Here, we develop a simplified yet statistically rigorous approximate maximum a posteriori (MAP) inference algorithm for DPMMs. This algorithm is as simple as DP-means clustering and solves the MAP problem as well as Gibbs sampling, while requiring only a fraction of the computational effort. (For freely available code that implements the MAP-DP algorithm for Gaussian mixtures see http://www.maxlittle.net/.) Unlike related small variance asymptotics (SVA), our method is non-degenerate and so inherits the “rich get richer” property of the Dirichlet process. It also retains a non-degenerate closed-form likelihood which enables out-of-sample calculations and the use of standard tools such as cross-validation. We illustrate the benefits of our algorithm on a range of examples and contrast it to variational, SVA and sampling approaches from both a computational complexity perspective as well as in terms of clustering performance. We demonstrate the wide applicability of our approach by presenting an approximate MAP inference method for the infinite hidden Markov model whose performance contrasts favorably with a recently proposed hybrid SVA approach.
Similarly, we show how our algorithm can be applied to a semiparametric mixed-effects regression model where the random effects distribution is modelled using an infinite mixture model, as used in longitudinal progression modelling in population health science. Finally, we propose directions for future research on approximate MAP inference in Bayesian nonparametrics.
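The abstract above uses DP-means clustering as its yardstick for simplicity. As a point of reference, a minimal sketch of DP-means follows. It is the baseline, not the MAP-DP algorithm itself, and the function names, the tuple-based point representation, and the choice to start from the global mean are assumptions made for the example. The parameter lam is the squared-distance threshold above which a point opens a new cluster.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(members):
    """Coordinate-wise mean of a non-empty list of tuples."""
    return tuple(sum(coords) / len(members) for coords in zip(*members))


def dp_means(points, lam, max_iter=100):
    """DP-means sketch: k-means in which a point farther than lam
    (in squared distance) from every center starts a new cluster."""
    centers = [centroid(points)]  # start with one cluster at the global mean
    labels = [0] * len(points)
    for _ in range(max_iter):
        changed = False
        for i, p in enumerate(points):
            dists = [sq_dist(p, c) for c in centers]
            k = min(range(len(centers)), key=dists.__getitem__)
            if dists[k] > lam:
                centers.append(tuple(p))  # open a new cluster at this point
                k = len(centers) - 1
            if labels[i] != k:
                labels[i] = k
                changed = True
        # Recompute centers, dropping clusters that ended up empty.
        members = {}
        for i, k in enumerate(labels):
            members.setdefault(k, []).append(points[i])
        kept = sorted(members)
        centers = [centroid(members[k]) for k in kept]
        remap = {old: new for new, old in enumerate(kept)}
        labels = [remap[k] for k in labels]
        if not changed:
            break
    return centers, labels
```

The number of clusters is controlled only through lam: a small value yields many clusters and a large value yields one. Because every cluster pays the same fixed penalty regardless of its size, this hard-threshold rule has no analogue of the "rich get richer" behaviour that the abstract says the proposed method retains.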