Bayesian learning of joint distributions of objects
There is increasing interest in broad application areas in defining flexible
joint models for data having a variety of measurement scales, while also
allowing data of complex types, such as functions, images and documents. We
consider a general framework for nonparametric Bayes joint modeling through
mixture models that incorporate dependence across data types through a joint
mixing measure. The mixing measure is assigned a novel infinite tensor
factorization (ITF) prior that allows flexible dependence in cluster allocation
across data types. The ITF prior is formulated as a tensor product of
stick-breaking processes. Focusing on a convenient special case corresponding
to a Parafac factorization, we provide basic theory justifying the flexibility
of the proposed prior and resulting asymptotic properties. Focusing on ITF
mixtures of product kernels, we develop a new Gibbs sampling algorithm for
routine implementation relying on slice sampling. The methods are compared with
alternative joint mixture models based on Dirichlet processes and related
approaches through simulations and real data applications.
Comment: Appearing in Proceedings of the 16th International Conference on
Artificial Intelligence and Statistics (AISTATS) 2013, Scottsdale, AZ, US
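The abstract describes a mixing measure built from stick-breaking processes combined in a Parafac-style factorization. A minimal sketch of that idea for two data types follows, assuming a finite truncation and a common concentration parameter; the function names, the truncation levels and the choice of Beta(1, alpha) stick fractions are illustrative assumptions, not the paper's exact ITF prior.

```python
import random


def stick_breaking(alpha, truncation, rng):
    """Truncated stick-breaking weights; the last atom takes the leftover stick."""
    weights, remaining = [], 1.0
    for _ in range(truncation - 1):
        fraction = rng.betavariate(1.0, alpha)  # share of the remaining stick
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    weights.append(remaining)
    return weights


def joint_allocation_probs(alpha, num_components, num_clusters, rng):
    """Joint cluster-allocation probabilities for two data types.

    Assumed form: probs[a][b] = sum_h lam[h] * psi1[h][a] * psi2[h][b],
    a nonnegative Parafac (CP) decomposition of a probability table.
    """
    lam = stick_breaking(alpha, num_components, rng)
    psi1 = [stick_breaking(alpha, num_clusters, rng) for _ in range(num_components)]
    psi2 = [stick_breaking(alpha, num_clusters, rng) for _ in range(num_components)]
    return [
        [
            sum(lam[h] * psi1[h][a] * psi2[h][b] for h in range(num_components))
            for b in range(num_clusters)
        ]
        for a in range(num_clusters)
    ]


rng = random.Random(0)
probs = joint_allocation_probs(alpha=1.0, num_components=5, num_clusters=4, rng=rng)
total = sum(sum(row) for row in probs)
print(round(total, 6))
```

Because every factor is itself a probability vector, the resulting table sums to one, while the shared index h lets cluster membership in one data type depend on membership in the other.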
ACCAMS: Additive Co-Clustering to Approximate Matrices Succinctly
Matrix completion and approximation are popular tools to capture a user's
preferences for recommendation and to approximate missing data. Instead of
using low-rank factorization we take a drastically different approach, based on
the simple insight that an additive model of co-clusterings allows one to
approximate matrices efficiently. This allows us to build a concise model that,
per bit of model learned, significantly beats all factorization approaches to
matrix approximation. Even more surprisingly, we find that summing over small
co-clusterings is more effective in modeling matrices than classic
co-clustering, which uses just one large partitioning of the matrix.
Occam's razor suggests that the simple structure induced
by our model better captures the latent preferences and decision making
processes present in the real world than classic co-clustering or matrix
factorization. We provide an iterative minimization algorithm, a collapsed
Gibbs sampler, theoretical guarantees for matrix approximation, and excellent
empirical evidence for the efficacy of our approach. We achieve
state-of-the-art results on the Netflix problem with a fraction of the model
complexity.
Comment: 22 pages, under review for conference publication
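The core idea in the abstract, approximating a matrix by a sum of small co-clusterings, can be sketched as follows. This is a rough illustration under strong assumptions: the row and column labels are fixed by hand, each co-clustering is fit by taking block means of the current residual, and all names are hypothetical. It does not reproduce the paper's iterative minimization algorithm or its collapsed Gibbs sampler.

```python
def block_means(residual, row_labels, col_labels, num_row_groups, num_col_groups):
    """Mean of the residual inside each (row group, column group) block."""
    sums = [[0.0] * num_col_groups for _ in range(num_row_groups)]
    counts = [[0] * num_col_groups for _ in range(num_row_groups)]
    for i, row in enumerate(residual):
        for j, value in enumerate(row):
            sums[row_labels[i]][col_labels[j]] += value
            counts[row_labels[i]][col_labels[j]] += 1
    return [
        [sums[a][b] / counts[a][b] if counts[a][b] else 0.0
         for b in range(num_col_groups)]
        for a in range(num_row_groups)
    ]


def subtract_block_model(residual, row_labels, col_labels, means):
    """Remove one block-constant co-clustering from the residual."""
    return [
        [value - means[row_labels[i]][col_labels[j]] for j, value in enumerate(row)]
        for i, row in enumerate(residual)
    ]


def squared_error(residual):
    return sum(value * value for row in residual for value in row)


matrix = [
    [5.0, 4.0, 1.0, 1.0],
    [5.0, 5.0, 1.0, 2.0],
    [2.0, 1.0, 4.0, 5.0],
    [1.0, 1.0, 5.0, 4.0],
]
# Two hand-picked 2x2 co-clusterings (assumed labels, for illustration only).
co_clusterings = [
    ([0, 0, 1, 1], [0, 0, 1, 1]),
    ([0, 1, 0, 1], [0, 1, 0, 1]),
]

residual = matrix
errors = []
for row_labels, col_labels in co_clusterings:
    means = block_means(residual, row_labels, col_labels, 2, 2)
    residual = subtract_block_model(residual, row_labels, col_labels, means)
    errors.append(squared_error(residual))
print(errors)
```

Each added co-clustering is fit to what the previous ones left unexplained, so the squared error cannot increase from one term to the next.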
A new BART prior for flexible modeling with categorical predictors
Default implementations of Bayesian Additive Regression Trees (BART)
represent categorical predictors using several binary indicators, one for each
level of each categorical predictor. Regression trees built with these
indicators partition the levels using a "remove one at a time" strategy.
Unfortunately, the vast majority of partitions of the levels cannot be built
with this strategy, severely limiting BART's ability to "borrow strength"
across groups of levels. We overcome this limitation with a new class of
regression tree and a new decision rule prior that can assign multiple levels
to both the left and right child of a decision node. Motivated by spatial
applications with areal data, we introduce a further decision rule prior that
partitions the areas into spatially contiguous regions by deleting edges from
random spanning trees of a suitably defined network. We implemented our new
regression tree priors in the flexBART package, which, compared to existing
implementations, often yields improved out-of-sample predictive performance
without much additional computational burden. We demonstrate the efficacy of
flexBART using examples from baseball and the spatiotemporal modeling of crime.
Comment: Software available at https://github.com/skdeshpande91/flexBAR
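A small counting sketch may help show why indicator-based rules are restrictive. At a single decision node, a rule on one binary indicator can only separate one level from all the others, whereas a rule that sends arbitrary subsets of levels left and right can realize any two-group split. The helper names below are hypothetical, and the uniform random split is only a stand-in, not the decision rule prior implemented in flexBART.

```python
import random


def count_two_way_splits(num_levels):
    """Number of ways to split the levels into two nonempty groups."""
    return 2 ** (num_levels - 1) - 1 if num_levels >= 2 else 0


def count_single_level_splits(num_levels):
    """Splits reachable by one rule on one indicator (one level versus the rest)."""
    if num_levels < 2:
        return 0
    # With two levels, both indicators describe the same split.
    return 1 if num_levels == 2 else num_levels


def random_two_way_split(levels, rng):
    """Assign each level to the left or right child, retrying until both are nonempty."""
    levels = list(levels)
    if len(levels) < 2:
        raise ValueError("need at least two levels to split")
    while True:
        left = [level for level in levels if rng.random() < 0.5]
        if 0 < len(left) < len(levels):
            right = [level for level in levels if level not in left]
            return left, right


rng = random.Random(0)
left, right = random_two_way_split(["A", "B", "C", "D", "E"], rng)
print(count_single_level_splits(10), count_two_way_splits(10))
print(left, right)
```

With ten levels, a single indicator rule offers 10 candidate splits at a node, against 511 possible two-group splits.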
Multivariate Spatiotemporal Hawkes Processes and Network Reconstruction
There is often latent network structure in spatial and temporal data and the
tools of network analysis can yield fascinating insights into such data. In
this paper, we develop a nonparametric method for network reconstruction from
spatiotemporal data sets using multivariate Hawkes processes. In contrast to
prior work on network reconstruction with point-process models, which has often
focused on exclusively temporal information, our approach uses both temporal
and spatial information and does not assume a specific parametric form of
network dynamics. This leads to an effective way of recovering an underlying
network. We illustrate our approach using both synthetic networks and networks
constructed from real-world data sets (a location-based social media network, a
narrative of crime events, and violent gang crimes). Our results demonstrate
that, in comparison to using only temporal data, our spatiotemporal approach
yields improved network reconstruction, providing a basis for meaningful
subsequent analysis, such as community structure and motif analysis, of
the reconstructed networks.
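To make the modeling object concrete, the sketch below evaluates the conditional intensity of a multivariate spatiotemporal Hawkes process at one point. Note that the abstract describes a nonparametric method; the exponential temporal kernel, the Gaussian spatial kernel, the spatially constant background rate, and every name here are simplifying assumptions chosen only for illustration. The matrix `excitation` plays the role of the network to be reconstructed: entry [u][v] is the assumed strength with which events on node u trigger events on node v.

```python
import math


def intensity(node, t, x, y, events, background, excitation, omega, sigma):
    """Conditional intensity for `node` at time t and location (x, y).

    events: list of (source_node, time, x, y) tuples.
    Assumed kernels: omega * exp(-omega * dt) in time and an isotropic
    Gaussian with standard deviation sigma in space.
    """
    rate = background[node]
    for source, t_i, x_i, y_i in events:
        if t_i >= t:
            continue  # only past events can excite the process
        dt = t - t_i
        temporal = omega * math.exp(-omega * dt)
        dist_sq = (x - x_i) ** 2 + (y - y_i) ** 2
        spatial = math.exp(-dist_sq / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)
        rate += excitation[source][node] * temporal * spatial
    return rate


events = [(0, 1.0, 0.0, 0.0), (1, 5.0, 0.0, 0.0)]
background = [0.1, 0.2]
excitation = [[0.0, 0.5], [0.3, 0.0]]  # hypothetical two-node network

rate_node0 = intensity(0, 2.0, 0.0, 0.0, events, background, excitation, 1.0, 1.0)
rate_node1 = intensity(1, 2.0, 0.0, 0.0, events, background, excitation, 1.0, 1.0)
print(rate_node0, rate_node1)
```

In this toy setup, only the event on node 0 at time 1.0 lies in the past at t = 2.0, so it raises the rate on node 1 through `excitation[0][1]` and leaves node 0 at its background rate.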