Bayesian nonparametric sparse VAR models
High dimensional vector autoregressive (VAR) models require a large number of
parameters to be estimated and may suffer from inferential problems. We propose a
new Bayesian nonparametric (BNP) Lasso prior (BNP-Lasso) for high-dimensional
VAR models that can improve estimation efficiency and prediction accuracy. Our
hierarchical prior overcomes overparametrization and overfitting issues by
clustering the VAR coefficients into groups and by shrinking the coefficients
of each group toward a common location. Clustering and shrinking effects
induced by the BNP-Lasso prior are well suited for the extraction of causal
networks from time series, since they account for stylized facts observed in
real-world networks: sparsity, community structure and heterogeneity in edge
intensity. In order to fully capture the richness of the data and to achieve a
better understanding of financial and macroeconomic risk, it is therefore
crucial that the model used to extract networks accounts for these stylized
facts.
Comment: Forthcoming in "Journal of Econometrics" ---- Revised Version of the
paper "Bayesian nonparametric Seemingly Unrelated Regression Models" ----
Supplementary Material available on request
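The clustering-plus-shrinkage behaviour the abstract describes can be sketched generatively. The snippet below draws VAR coefficients from a truncated stick-breaking (Dirichlet-process) mixture of Laplace distributions, so coefficients fall into groups and shrink toward a shared group location. The function name and hyperparameters (`alpha`, `scale`, `n_trunc`) are illustrative, not the paper's actual prior specification.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_bnp_lasso_coefficients(n_coef, alpha=1.0, scale=0.1, n_trunc=20):
    """Draw coefficients from a truncated DP mixture of Laplace laws:
    coefficients cluster into groups and shrink toward a shared group
    location, mimicking the clustering/shrinkage effects of a BNP-Lasso
    style prior. Hyperparameter names here are illustrative."""
    # Truncated stick-breaking construction of the DP mixture weights.
    betas = rng.beta(1.0, alpha, size=n_trunc)
    weights = betas * np.concatenate(([1.0], np.cumprod(1.0 - betas)[:-1]))
    weights /= weights.sum()
    # Group locations; pinning one group at zero encourages sparsity.
    locations = rng.normal(0.0, 1.0, size=n_trunc)
    locations[0] = 0.0
    # Assign each coefficient to a group, then add Laplace (Lasso-like) noise.
    groups = rng.choice(n_trunc, size=n_coef, p=weights)
    coefs = locations[groups] + rng.laplace(0.0, scale, size=n_coef)
    return coefs, groups

# Coefficient matrix of a 5-variable VAR(1): 25 entries, reshaped to 5x5.
coefs, groups = sample_bnp_lasso_coefficients(25)
A = coefs.reshape(5, 5)
```

Because many coefficients share a group, the sampled matrix exhibits exactly the repeated, shrunk values that make the induced networks sparse and community-structured.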
Hierarchical Implicit Models and Likelihood-Free Variational Inference
Implicit probabilistic models are a flexible class of models defined by a
simulation process for data. They form the basis for theories which encompass
our understanding of the physical world. Despite this fundamental nature, the
use of implicit models remains limited due to challenges in specifying complex
latent structure in them, and in performing inferences in such models with
large data sets. In this paper, we first introduce hierarchical implicit models
(HIMs). HIMs combine the idea of implicit densities with hierarchical Bayesian
modeling, thereby defining models via simulators of data with rich hidden
structure. Next, we develop likelihood-free variational inference (LFVI), a
scalable variational inference algorithm for HIMs. Key to LFVI is specifying a
variational family that is also implicit. This matches the model's flexibility
and allows for accurate approximation of the posterior. We demonstrate diverse
applications: a large-scale physical simulator for predator-prey populations in
ecology; a Bayesian generative adversarial network for discrete data; and a
deep implicit model for text generation.
Comment: Appears in Neural Information Processing Systems, 2017
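A core ingredient of likelihood-free variational methods such as LFVI is estimating a log density ratio with a classifier, since implicit densities cannot be evaluated. Below is a minimal NumPy sketch of that classifier trick: a logistic regression trained to separate two sample sets yields a logit that approximates their log density ratio. The estimator and its settings are illustrative, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)

def logistic_ratio_estimator(x_p, x_q, n_iter=500, lr=0.1):
    """Estimate log p(x)/q(x) from samples of p and q via logistic
    regression: the fitted classifier's logit approximates the log
    density ratio, the quantity likelihood-free methods need."""
    x = np.concatenate([x_p, x_q])
    y = np.concatenate([np.ones(len(x_p)), np.zeros(len(x_q))])
    w, b = 0.0, 0.0  # 1-D logistic regression, trained by gradient descent
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(w * x + b)))
        w -= lr * np.mean((p - y) * x)
        b -= lr * np.mean(p - y)
    return lambda t: w * t + b  # logit(t) ~ log p(t) - log q(t)

# Two unit-variance Gaussians with means +1 and -1: the true log ratio
# is the linear function 2x, so the estimated logit should be increasing.
x_p = rng.normal(1.0, 1.0, size=2000)
x_q = rng.normal(-1.0, 1.0, size=2000)
log_ratio = logistic_ratio_estimator(x_p, x_q)
```

In LFVI this ratio estimate stands in for intractable likelihood and variational-density terms inside the variational objective.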
Joint estimation of multiple related biological networks
Graphical models are widely used to make inferences concerning interplay in
multivariate systems. In many applications, data are collected from multiple
related but nonidentical units whose underlying networks may differ but are
likely to share features. Here we present a hierarchical Bayesian formulation
for joint estimation of multiple networks in this nonidentically distributed
setting. The approach is general: given a suitable class of graphical models,
it uses an exchangeability assumption on networks to provide a corresponding
joint formulation. Motivated by emerging experimental designs in molecular
biology, we focus on time-course data with interventions, using dynamic
Bayesian networks as the graphical models. We introduce a computationally
efficient, deterministic algorithm for exact joint inference in this setting.
We provide an upper bound on the gains that joint estimation offers relative to
separate estimation for each network and empirical results that support and
extend the theory, including an extensive simulation study and an application
to proteomic data from human cancer cell lines. Finally, we describe
approximations that are still more computationally efficient than the exact
algorithm and that also demonstrate good empirical performance.
Comment: Published at http://dx.doi.org/10.1214/14-AOAS761 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
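The idea of borrowing strength across related but nonidentical units can be illustrated with a deliberately simple stand-in for the hierarchical formulation: compute an edge-score matrix per unit, then shrink each one toward the group average. The `shrinkage` weight below is a crude proxy for the pooling a hierarchical prior would induce; it is not the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(2)

def joint_edge_scores(datasets, shrinkage=0.5):
    """Toy joint estimation: absolute-correlation edge scores per unit,
    shrunk toward the cross-unit average, reflecting the assumption that
    related units share network features. The fixed shrinkage weight is
    an illustrative stand-in for hierarchical pooling."""
    per_unit = [np.abs(np.corrcoef(X, rowvar=False)) for X in datasets]
    pooled = np.mean(per_unit, axis=0)
    return [(1 - shrinkage) * S + shrinkage * pooled for S in per_unit]

def simulate(n, coupling):
    """Simulate one 3-variable unit with a variable 0 -> variable 1 link."""
    z = rng.normal(size=(n, 3))
    z[:, 1] += coupling * z[:, 0]
    return z

# Two related units sharing the same strong 0-1 dependence.
scores = joint_edge_scores([simulate(200, 1.0), simulate(200, 0.8)])
```

Even this caricature shows the intended effect: an edge supported in both units keeps a high score after pooling, while spurious unit-specific fluctuations are damped.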
Probabilistic Latent Tensor Factorization Model for Link Pattern Prediction in Multi-relational Networks
This paper aims at the problem of link pattern prediction in collections of
objects connected by multiple relation types, where each type may play a
distinct role. While common link analysis models are limited to single-type
link prediction, we attempt here to capture the correlations among different
relation types and reveal the impact of various relation types on performance
quality. For that, we define the overall relations between object pairs as a
\textit{link pattern}, which consists of the interaction pattern and connection
structure in the network, and then use tensor formalization to jointly model
and predict the link patterns, which we refer to as \textit{Link Pattern
Prediction} (LPP) problem. To address the issue, we propose a Probabilistic
Latent Tensor Factorization (PLTF) model by introducing another latent factor
for multiple relation types and provide a hierarchical Bayesian treatment of
the proposed probabilistic model to avoid overfitting when solving the LPP
problem. To learn the proposed model, we develop an efficient Markov chain
Monte Carlo sampling method. Extensive experiments are conducted on several
real-world datasets and demonstrate significant improvements over several
existing state-of-the-art methods.
Comment: 19 pages, 5 figures
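To make the tensor formulation concrete, here is a toy CP-style factorization with a shared entity factor and a separate latent factor for relation types, fitted by plain gradient descent on squared error. The gradient fit stands in for the paper's MCMC treatment, and all names and hyperparameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

def fit_pltf_sketch(Y, rank=4, n_iter=500, lr=0.05, reg=0.01):
    """Fit Y[i,j,k] ~ sum_d U[i,d]*U[j,d]*R[k,d]: a shared entity factor
    U plus a latent factor R for relation types, in the spirit of adding
    a relation-type factor to a tensor model. Plain regularized gradient
    descent is an illustrative stand-in for Bayesian sampling."""
    n, _, m = Y.shape
    U = 0.1 * rng.normal(size=(n, rank))
    R = 0.1 * rng.normal(size=(m, rank))
    for _ in range(n_iter):
        E = np.einsum('id,jd,kd->ijk', U, U, R) - Y  # residual tensor
        grad_U = (np.einsum('ijk,jd,kd->id', E, U, R)
                  + np.einsum('ijk,id,kd->jd', E, U, R) + reg * U)
        grad_R = np.einsum('ijk,id,jd->kd', E, U, U) + reg * R
        U -= lr * grad_U
        R -= lr * grad_R
    return U, R

# Toy data: 6 objects, 2 relation types whose link patterns are correlated.
Y = np.zeros((6, 6, 2))
Y[:3, :3, 0] = 1.0  # relation type 0 links a block of objects
Y[:3, :3, 1] = 1.0  # relation type 1 mirrors the same pattern
U, R = fit_pltf_sketch(Y)
pred = np.einsum('id,jd,kd->ijk', U, U, R)
```

Because the relation-type factor `R` is shared across the reconstruction, information flows between the two relation types, which is precisely the correlation the LPP formulation is designed to exploit.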
Nonlinear Models Using Dirichlet Process Mixtures
We introduce a new nonlinear model for classification, in which we model the
joint distribution of response variable, y, and covariates, x,
non-parametrically using Dirichlet process mixtures. We keep the relationship
between y and x linear within each component of the mixture. The overall
relationship becomes nonlinear if the mixture contains more than one component.
We use simulated data to compare the performance of this new approach to a
simple multinomial logit (MNL) model, an MNL model with quadratic terms, and a
decision tree model. We also evaluate our approach on a protein fold
classification problem, and find that our model provides substantial
improvement over previous methods, which were based on Neural Networks (NN) and
Support Vector Machines (SVM). Folding classes of proteins have a hierarchical
structure. We extend our method to classification problems where a class
hierarchy is available. We find that using prior information about the
hierarchical structure of protein folds can result in higher predictive
accuracy.
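The mechanism by which per-component linear relationships produce a globally nonlinear fit can be seen directly: the component responsibilities depend on x, so blending the component lines bends the overall predictive curve. The two-component toy below stands in for the full Dirichlet process mixture, and all parameter values are made up for illustration.

```python
import numpy as np

def mixture_predictive(x, weights, means, sds, slopes, intercepts):
    """Predictive mean of y given x under a mixture where y|x is linear
    within each component: responsibilities p(component | x) follow from
    the Gaussian densities of x, so E[y|x] is nonlinear overall even
    though each component is linear. Two-component toy stand-in for a
    Dirichlet process mixture."""
    dens = np.array([w * np.exp(-0.5 * ((x - m) / s) ** 2) / s
                     for w, m, s in zip(weights, means, sds)])
    resp = dens / dens.sum(axis=0)          # p(component | x)
    lines = np.array([a * x + b for a, b in zip(slopes, intercepts)])
    return (resp * lines).sum(axis=0)       # responsibility-weighted blend

# Component 0 dominates for negative x (slope +1), component 1 for
# positive x (slope -1): the blended curve rises then falls.
xs = np.linspace(-3, 3, 7)
yhat = mixture_predictive(xs, weights=[0.5, 0.5], means=[-1.5, 1.5],
                          sds=[1.0, 1.0], slopes=[1.0, -1.0],
                          intercepts=[0.0, 0.0])
```

The same construction underlies classification: replace each component's linear regression with a linear (multinomial logit) classifier and the blended decision boundary becomes nonlinear.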
Hierarchical relational models for document networks
We develop the relational topic model (RTM), a hierarchical model of both
network structure and node attributes. We focus on document networks, where the
attributes of each document are its words, that is, discrete observations taken
from a fixed vocabulary. For each pair of documents, the RTM models their link
as a binary random variable that is conditioned on their contents. The model
can be used to summarize a network of documents, predict links between them,
and predict words within them. We derive efficient inference and estimation
algorithms based on variational methods that take advantage of sparsity and
scale with the number of links. We evaluate the predictive performance of the
RTM for large networks of scientific abstracts, web documents, and
geographically tagged news.
Comment: Published at http://dx.doi.org/10.1214/09-AOAS309 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
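The RTM's link component can be sketched in a few lines: the probability of a link between two documents is modeled as a logistic function of a weighted inner product of their topic proportions, so documents that concentrate on the same topics link more readily. The weights `eta` and `nu` below are illustrative values, not fitted parameters.

```python
import numpy as np

def link_probability(theta_d, theta_e, eta, nu=0.0):
    """RTM-style link model: logistic function of the weighted inner
    product of two documents' topic proportions. eta (per-topic weights)
    and nu (bias) are illustrative, not fitted values."""
    score = np.dot(eta, theta_d * theta_e) + nu
    return 1.0 / (1.0 + np.exp(-score))

# Documents sharing a dominant topic should link with higher probability.
theta_a = np.array([0.8, 0.1, 0.1])
theta_b = np.array([0.7, 0.2, 0.1])   # similar topic mix to theta_a
theta_c = np.array([0.1, 0.1, 0.8])   # concentrated on a different topic
eta = np.array([5.0, 5.0, 5.0])
p_similar = link_probability(theta_a, theta_b, eta)
p_different = link_probability(theta_a, theta_c, eta)
```

Conditioning the link on the documents' contents in this way is what lets the model predict links from words and, conversely, words from links.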