Bayesian Compressed Regression
As an alternative to variable selection or shrinkage in high dimensional
regression, we propose to randomly compress the predictors prior to analysis.
This dramatically reduces storage and computational bottlenecks, performing
well when the predictors can be projected to a low dimensional linear subspace
with minimal loss of information about the response. As opposed to existing
Bayesian dimensionality reduction approaches, the exact posterior distribution
conditional on the compressed data is available analytically, speeding up
computation by many orders of magnitude while also bypassing robustness issues
due to convergence and mixing problems with MCMC. Model averaging is used to
reduce sensitivity to the random projection matrix, while accommodating
uncertainty in the subspace dimension. Strong theoretical support is provided
for the approach by showing near parametric convergence rates for the
predictive density in the large p small n asymptotic paradigm. Practical
performance relative to competitors is illustrated in simulations and real data
applications.
Comment: 29 pages, 4 figures
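To illustrate the idea of compressing predictors and then using a closed-form posterior, here is a minimal Python sketch. It is not the authors' implementation: it assumes a dense Gaussian projection matrix, a known noise variance `sigma2`, a simple `N(0, tau2 I)` prior on the compressed coefficients, and a single projection with no model averaging. The function names and parameters are hypothetical.

```python
import numpy as np

def compressed_posterior(X, y, m, sigma2=1.0, tau2=1.0, seed=0):
    """Closed-form Gaussian posterior for regression on compressed predictors.

    Assumed (illustrative) model: y = (X Phi^T) beta + eps,
    eps ~ N(0, sigma2 I), beta ~ N(0, tau2 I), Phi an m x p random matrix.
    """
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    Phi = rng.normal(size=(m, p)) / np.sqrt(p)   # random projection (assumed Gaussian)
    Z = X @ Phi.T                                 # n x m compressed design matrix
    precision = Z.T @ Z / sigma2 + np.eye(m) / tau2
    cov = np.linalg.inv(precision)                # posterior covariance, only m x m
    mean = cov @ Z.T @ y / sigma2                 # posterior mean
    return Phi, mean, cov

def predict(X_new, Phi, mean):
    """Posterior mean prediction: compress new predictors with the same Phi."""
    return X_new @ Phi.T @ mean
```

Because only an m x m matrix is inverted, the cost is driven by the compressed dimension m rather than by p. Averaging predictions over several draws of `Phi` and several values of `m` would be one way to mimic the model averaging described in the abstract.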
Bayesian Approximate Kernel Regression with Variable Selection
Nonlinear kernel regression models are often used in statistics and machine
learning because they are more accurate than linear models. Variable selection
for kernel regression models is a challenge partly because, unlike the linear
regression setting, there is no clear concept of an effect size for regression
coefficients. In this paper, we propose a novel framework that provides an
effect size analog of each explanatory variable for Bayesian kernel regression
models when the kernel is shift-invariant --- for example, the Gaussian kernel.
We use function analytic properties of shift-invariant reproducing kernel
Hilbert spaces (RKHS) to define a linear vector space that: (i) captures
nonlinear structure, and (ii) can be projected onto the original explanatory
variables. The projection onto the original explanatory variables serves as an
analog of effect sizes. The specific function analytic property we use is that
shift-invariant kernel functions can be approximated via random Fourier bases.
Based on the random Fourier expansion we propose a computationally efficient
class of Bayesian approximate kernel regression (BAKR) models for both
nonlinear regression and binary classification for which one can compute an
analog of effect sizes. We illustrate the utility of BAKR by examining two
important problems in statistical genetics: genomic selection (i.e. phenotypic
prediction) and association mapping (i.e. inference of significant variants or
loci). State-of-the-art methods for genomic selection and association mapping
are based on kernel regression and linear models, respectively. BAKR is the
first method that is competitive in both settings.
Comment: 22 pages, 3 figures, 3 tables; theory added; new simulations
presented; references added
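The random Fourier approximation of a shift-invariant kernel mentioned in the abstract can be sketched as follows. This is a generic cosine-feature construction for the Gaussian kernel, not the BAKR code itself; the function names, the `bandwidth` parametrization, and the choice of cosine features with random phases are assumptions made for illustration.

```python
import numpy as np

def random_fourier_features(X, n_features, bandwidth=1.0, seed=0):
    """Map X (n x p) to features whose inner products approximate the
    Gaussian kernel exp(-||x - y||^2 / (2 * bandwidth^2))."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    # Frequencies drawn from the kernel's spectral density (a Gaussian).
    W = rng.normal(scale=1.0 / bandwidth, size=(p, n_features))
    # Random phases make a single cosine per frequency sufficient.
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def gaussian_kernel(X, bandwidth=1.0):
    """Exact Gaussian kernel matrix, for comparison with the approximation."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * bandwidth**2))
```

With `Z = random_fourier_features(X, n_features)`, the product `Z @ Z.T` approximates `gaussian_kernel(X)`, and the approximation tightens as `n_features` grows. A linear model fitted on `Z` then behaves like an approximate kernel regression, which is what makes coefficients in a linear space available in the first place.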
Variable Selection for Nonparametric Gaussian Process Priors: Models and Computational Strategies
This paper presents a unified treatment of Gaussian process models that
extends to data from the exponential dispersion family and to survival data.
Our specific interest is in the analysis of data sets with predictors that have
an a priori unknown form of possibly nonlinear associations to the response.
The modeling approach we describe incorporates Gaussian processes in a
generalized linear model framework to obtain a class of nonparametric
regression models where the covariance matrix depends on the predictors. We
consider, in particular, continuous, categorical and count responses. We also
look into models that account for survival outcomes. We explore alternative
covariance formulations for the Gaussian process prior and demonstrate the
flexibility of the construction. Next, we focus on the important problem of
selecting variables from the set of possible predictors and describe a general
framework that employs mixture priors. We compare alternative MCMC strategies
for posterior inference and achieve a computationally efficient and practical
approach. We demonstrate performance on simulated and benchmark data sets.
Comment: Published at http://dx.doi.org/10.1214/11-STS354 in Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
Functional Regression
Functional data analysis (FDA) involves the analysis of data whose ideal
units of observation are functions defined on some continuous domain, and the
observed data consist of a sample of functions taken from some population,
sampled on a discrete grid. Ramsay and Silverman's 1997 textbook sparked the
development of this field, which has accelerated in the past 10 years to become
one of the fastest growing areas of statistics, fueled by the growing number of
applications yielding this type of data. One unique characteristic of FDA is
the need to combine information both across and within functions, which Ramsay
and Silverman called replication and regularization, respectively. This article
will focus on functional regression, the area of FDA that has received the most
attention in applications and methodological development. First will be an
introduction to basis functions, key building blocks for regularization in
functional regression methods, followed by an overview of functional regression
methods, split into three types: [1] functional predictor regression
(scalar-on-function), [2] functional response regression (function-on-scalar)
and [3] function-on-function regression. For each, the role of replication and
regularization will be discussed and the methodological development described
in a roughly chronological manner, at times deviating from the historical
timeline to group together similar methods. The primary focus is on modeling
and methodology, highlighting the modeling structures that have been developed
and the various regularization approaches employed. At the end is a brief
discussion describing potential areas of future development in this field.
The Degrees of Freedom of Partial Least Squares Regression
The derivation of statistical properties for Partial Least Squares regression
can be a challenging task. The reason is that the construction of latent
components from the predictor variables also depends on the response variable.
While this typically leads to good performance and interpretable models in
practice, it makes the statistical analysis more involved. In this work, we
study the intrinsic complexity of Partial Least Squares Regression. Our
contribution is an unbiased estimate of its Degrees of Freedom. It is defined
as the trace of the first derivative of the fitted values, seen as a function
of the response. We establish two equivalent representations that rely on the
close connection of Partial Least Squares to matrix decompositions and Krylov
subspace techniques. We show that the Degrees of Freedom depend on the
collinearity of the predictor variables: The lower the collinearity is, the
higher the Degrees of Freedom are. In particular, they are typically higher
than the naive approach that defines the Degrees of Freedom as the number of
components. Further, we illustrate how the Degrees of Freedom approach can be
used for the comparison of different regression methods. In the experimental
section, we show that our Degrees of Freedom estimate in combination with
information criteria is useful for model selection.
Comment: to appear in the Journal of the American Statistical Association
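The definition of Degrees of Freedom as the trace of the derivative of the fitted values with respect to the response can be checked numerically. The sketch below is not the unbiased estimator from the paper: it assumes univariate PLS without centering or intercept, uses the Krylov-subspace characterization of the PLS coefficients, and approximates the trace by central finite differences. All function names are hypothetical.

```python
import numpy as np

def pls_fit(X, y, m):
    """Fitted values of univariate PLS with m components (no centering).

    Uses the Krylov characterization: the PLS coefficient vector minimizes
    ||y - X b|| over span{s, A s, ..., A^(m-1) s}, with A = X^T X, s = X^T y.
    """
    A, v = X.T @ X, X.T @ y
    basis = []
    for _ in range(m):
        v = v / np.linalg.norm(v)   # rescaling only; the span is unchanged
        basis.append(v)
        v = A @ v
    K = np.column_stack(basis)
    coef, *_ = np.linalg.lstsq(X @ K, y, rcond=None)
    return X @ K @ coef

def dof_finite_difference(X, y, m, h=1e-5):
    """Approximate trace of d(fitted values)/d(y) by central differences."""
    n = len(y)
    trace = 0.0
    for i in range(n):
        e = np.zeros(n)
        e[i] = h
        trace += (pls_fit(X, y + e, m)[i] - pls_fit(X, y - e, m)[i]) / (2 * h)
    return trace
```

Because the Krylov basis itself depends on `y`, the fit is a nonlinear function of the response when `m` is smaller than the number of predictors, so the trace need not equal `m`. When `m` equals the number of predictors (and `X` has full column rank), the fit reduces to ordinary least squares and the trace equals the number of predictors.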
A Computationally Efficient Projection-Based Approach for Spatial Generalized Linear Mixed Models
Inference for spatial generalized linear mixed models (SGLMMs) for
high-dimensional non-Gaussian spatial data is computationally intensive. The
computational challenge is due to the high-dimensional random effects and
because Markov chain Monte Carlo (MCMC) algorithms for these models tend to be
slow mixing. Moreover, spatial confounding inflates the variance of fixed
effect (regression coefficient) estimates. Our approach addresses both the
computational and confounding issues by replacing the high-dimensional spatial
random effects with a reduced-dimensional representation based on random
projections. Standard MCMC algorithms mix well and the reduced-dimensional
setting speeds up computations per iteration. We show, via simulated examples,
that this reduced-dimensional approach works well for both Bayesian inference
and prediction; our methods also compare favorably to existing "reduced-rank"
approaches. We also apply our methods to two real-world data examples, one on
bird count data and the other on classifying rock types.
Mixtures of g-priors in Generalized Linear Models
Mixtures of Zellner's g-priors have been studied extensively in linear models
and have been shown to have numerous desirable properties for Bayesian variable
selection and model averaging. Several extensions of g-priors to Generalized
Linear Models (GLMs) have been proposed in the literature; however, the choice
of prior distribution of g and resulting properties for inference have received
considerably less attention. In this paper, we unify mixtures of g-priors in
GLMs by assigning the truncated Compound Confluent Hypergeometric (tCCH)
distribution to 1/(1 + g), which encompasses as special cases several mixtures
of g-priors in the literature, such as the hyper-g, Beta-prime, truncated
Gamma, incomplete inverse-Gamma, benchmark, robust, hyper-g/n, and intrinsic
priors. Through an integrated Laplace approximation, the posterior distribution
of 1/(1 + g) is in turn a tCCH distribution, and approximate marginal
likelihoods are thus available analytically, leading to "Compound
Hypergeometric Information Criteria" for model selection. We discuss the local
geometric properties of the g-prior in GLMs and show how the desiderata for
model selection proposed by Bayarri et al., such as asymptotic model selection
consistency, intrinsic consistency, and measurement invariance may be used to
justify the prior and specific choices of the hyperparameters. We illustrate
inference using these priors and contrast them to other approaches via
simulation and real data examples. The methodology is implemented in the R
package BAS and is freely available on CRAN.