576 research outputs found
Bayesian Conditional Tensor Factorizations for High-Dimensional Classification
In many application areas, data are collected on a categorical response and
high-dimensional categorical predictors, with the goals being to build a
parsimonious model for classification while doing inferences on the important
predictors. In settings such as genomics, there can be complex interactions
among the predictors. By using a carefully-structured Tucker factorization, we
define a model that can characterize any conditional probability, while
facilitating variable selection and modeling of higher-order interactions.
Following a Bayesian approach, we propose a Markov chain Monte Carlo algorithm
for posterior computation accommodating uncertainty in the predictors to be
included. Under near sparsity assumptions, the posterior distribution for the
conditional probability is shown to achieve close to the parametric rate of
contraction even in ultra high-dimensional settings. The methods are
illustrated using simulation examples and biomedical applications
Multi-Target Prediction: A Unifying View on Problems and Methods
Multi-target prediction (MTP) is concerned with the simultaneous prediction
of multiple target variables of diverse type. Due to its enormous application
potential, it has developed into an active and rapidly expanding research field
that combines several subfields of machine learning, including multivariate
regression, multi-label classification, multi-task learning, dyadic prediction,
zero-shot learning, network inference, and matrix completion. In this paper, we
present a unifying view on MTP problems and methods. First, we formally discuss
commonalities and differences between existing MTP problems. To this end, we
introduce a general framework that covers the above subfields as special cases.
As a second contribution, we provide a structured overview of MTP methods. This
is accomplished by identifying a number of key properties, which distinguish
such methods and determine their suitability for different types of problems.
Finally, we also discuss a few challenges for future research
Bayesian dynamic financial networks with time-varying predictors
We propose a Bayesian nonparametric model including time-varying predictors
in dynamic network inference. The model is applied to infer the dependence
structure among financial markets during the global financial crisis,
estimating effects of verbal and material cooperation efforts. We interestingly
learn contagion effects, with increasing influence of verbal relations during
the financial crisis and opposite results during the United States housing
bubble
Bayesian Model Selection in Complex Linear Systems, as Illustrated in Genetic Association Studies
Motivated by examples from genetic association studies, this paper considers
the model selection problem in a general complex linear model system and in a
Bayesian framework. We discuss formulating model selection problems and
incorporating context-dependent {\it a priori} information through different
levels of prior specifications. We also derive analytic Bayes factors and their
approximations to facilitate model selection and discuss their theoretical and
computational properties. We demonstrate our Bayesian approach based on an
implemented Markov Chain Monte Carlo (MCMC) algorithm in simulations and a real
data application of mapping tissue-specific eQTLs. Our novel results on Bayes
factors provide a general framework to perform efficient model comparisons in
complex linear model systems
MCMC Methods for Multi-Response Generalized Linear Mixed Models: The MCMCglmm R Package
Generalized linear mixed models provide a flexible framework for modeling a range of data, although with non-Gaussian response variables the likelihood cannot be obtained in closed form. Markov chain Monte Carlo methods solve this problem by sampling from a series of simpler conditional distributions that can be evaluated. The R package MCMCglmm implements such an algorithm for a range of model fitting problems. More than one response variable can be analyzed simultaneously, and these variables are allowed to follow Gaussian, Poisson, multi(bi)nominal, exponential, zero-inflated and censored distributions. A range of variance structures are permitted for the random effects, including interactions with categorical or continuous variables (i.e., random regression), and more complicated variance structures that arise through shared ancestry, either through a pedigree or through a phylogeny. Missing values are permitted in the response variable(s) and data can be known up to some level of measurement error as in meta-analysis. All simu- lation is done in C/ C++ using the CSparse library for sparse linear systems.
- …