A Unified Framework of Constrained Regression
Generalized additive models (GAMs) play an important role in modeling and
understanding complex relationships in modern applied statistics. They allow
for flexible, data-driven estimation of covariate effects. Yet researchers
often have a priori knowledge of certain effects, which might be monotonic or
periodic (cyclic) or should fulfill boundary conditions. We propose a unified
framework to incorporate these constraints for both univariate and bivariate
effect estimates and for varying coefficients. As the framework is based on
component-wise boosting methods, variables can be selected intrinsically, and
effects can be estimated for a wide range of different distributional
assumptions. Bootstrap confidence intervals for the effect estimates are
derived to assess the models. We present three case studies from environmental
sciences to illustrate the proposed seamless modeling framework. All discussed
constrained effect estimates are implemented in the comprehensive R package
mboost for model-based boosting.
Comment: This is a preliminary version of the manuscript. The final
publication is available at
http://link.springer.com/article/10.1007/s11222-014-9520-
Dealing with Label Switching in Mixture Models Under Genuine Multimodality
The fitting of finite mixture models is an ill-defined estimation problem, as completely different parameterizations can induce similar mixture distributions. This leads to multiple modes in the likelihood, which is a problem for frequentist maximum likelihood estimation and complicates statistical inference from Markov chain Monte Carlo draws in Bayesian estimation. For the analysis of the posterior density of these draws, a suitable separation into different modes is desirable. In addition, a unique labelling of the component-specific estimates is necessary to solve the label switching problem. This paper presents and compares two approaches to achieve these goals: relabelling under multimodality and constrained clustering. The algorithmic details are discussed and their application is demonstrated on artificial and real-world data.
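To make the label switching problem concrete, the following minimal sketch shows that permuting the component labels of a two-component Gaussian mixture leaves the density unchanged, and then applies an ordering constraint on the means. The ordering constraint is a common simple baseline used here only for illustration; it is not either of the two approaches compared in the paper, and the function names and parameter values are assumptions.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, weights, means, sds):
    """Density of a finite Gaussian mixture at x."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds))


def relabel_by_mean(weights, means, sds):
    """Relabel components so that the means are in increasing order.

    Illustrative baseline only (an identifiability constraint), not the
    relabelling or constrained clustering procedures of the paper.
    """
    order = sorted(range(len(means)), key=lambda k: means[k])
    return (
        [weights[k] for k in order],
        [means[k] for k in order],
        [sds[k] for k in order],
    )


# Two labellings of the same mixture (hypothetical parameter values).
draw_a = ([0.3, 0.7], [2.0, -1.0], [1.0, 0.5])
draw_b = ([0.7, 0.3], [-1.0, 2.0], [0.5, 1.0])

# Both labellings give the same density, so the likelihood cannot tell them apart.
same_density = abs(mixture_pdf(0.4, *draw_a) - mixture_pdf(0.4, *draw_b)) < 1e-12

# After relabelling, both draws share one labelling.
same_labels = relabel_by_mean(*draw_a) == relabel_by_mean(*draw_b)
```

A constraint on a single parameter can fail when components overlap in that parameter, which is one reason more careful relabelling strategies are studied.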
Sequential Monte Carlo EM for multivariate probit models
Multivariate probit models (MPM) have the appealing feature of capturing some
of the dependence structure between the components of multidimensional binary
responses. The key for the dependence modelling is the covariance matrix of an
underlying latent multivariate Gaussian. Most approaches to MLE in multivariate
probit regression rely on MCEM algorithms to avoid computationally intensive
evaluations of multivariate normal orthant probabilities. As an alternative to
the much-used Gibbs sampler, a new SMC sampler for truncated multivariate
normals is proposed. The algorithm proceeds in two stages where samples are
first drawn from truncated multivariate Student distributions and then
further evolved towards a Gaussian. The sampler is then embedded in an MCEM
algorithm. The sequential nature of SMC methods can be exploited to design a
fully sequential version of the EM, where the samples are simply updated from
one iteration to the next rather than resampled from scratch. Recycling the
samples in this manner significantly reduces the computational cost. An
alternative view of the standard conditional maximisation step provides the
basis for an iterative procedure to fully perform the maximisation needed in
the EM algorithm. The identifiability of MPM is also thoroughly discussed. In
particular, the likelihood invariance can be embedded in the EM algorithm to
ensure that constrained and unconstrained maximisation are equivalent. A simple
iterative procedure is then derived for either maximisation which takes
effectively no computational time. The method is validated by applying it to
the widely analysed Six Cities dataset and on a higher dimensional simulated
example. Previous approaches to the Six Cities overly restrict the parameter
space but, by considering the correct invariance, the maximum likelihood is
quite naturally improved when treating the full unrestricted model.
Comment: 26 pages, 2 figures. In press, Computational Statistics & Data
Analysis
A General Framework for Fast Stagewise Algorithms
Forward stagewise regression follows a very simple strategy for constructing
a sequence of sparse regression estimates: it starts with all coefficients
equal to zero, and iteratively updates the coefficient (by a small amount
$\epsilon$) of the variable that achieves the maximal absolute inner product
with the current residual. This procedure has an interesting connection to the
lasso: under some conditions, it is known that the sequence of forward
stagewise estimates exactly coincides with the lasso path, as the step size
$\epsilon$ goes to zero. Furthermore, essentially the same equivalence holds
outside of least squares regression, with the minimization of a differentiable
convex loss function subject to an $\ell_1$ norm constraint (the stagewise
algorithm now updates the coefficient corresponding to the maximal absolute
component of the gradient).
Even when they do not match their $\ell_1$-constrained analogues, stagewise
estimates provide a useful approximation, and are computationally appealing.
Their success in sparse modeling motivates the question: can a simple,
effective strategy like forward stagewise be applied more broadly in other
regularization settings, beyond the $\ell_1$ norm and sparsity? The current
paper is an attempt to do just this. We present a general framework for
stagewise estimation, which yields fast algorithms for problems such as
group-structured learning, matrix completion, image denoising, and more.
Comment: 56 pages, 15 figures
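The basic forward stagewise procedure for least squares described in this abstract can be sketched in a few lines. This is a minimal illustration written in pure Python, assuming a fixed step size and a fixed iteration budget; the function name, defaults, and toy data are assumptions, and the general framework of the paper is not reproduced here.

```python
def forward_stagewise(X, y, step=0.01, n_iter=2000):
    """Forward stagewise least-squares regression (illustrative sketch).

    X is a list of rows (lists of floats) and y a list of floats.
    Starts from all-zero coefficients and repeatedly nudges the coefficient
    of the column with the largest absolute inner product with the residual.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)
    for _ in range(n_iter):
        # Inner product of every column with the current residual.
        corr = [sum(X[i][j] * residual[i] for i in range(n)) for j in range(p)]
        j = max(range(p), key=lambda k: abs(corr[k]))
        if abs(corr[j]) < 1e-12:
            break  # residual is (numerically) orthogonal to every column
        delta = step if corr[j] > 0 else -step
        beta[j] += delta
        # Keep the residual in sync with the updated coefficient.
        for i in range(n):
            residual[i] -= delta * X[i][j]
    return beta


# Toy data (hypothetical): the response depends only on the first column.
X_demo = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y_demo = [2.0, 0.0, 2.0, 0.0]
beta_demo = forward_stagewise(X_demo, y_demo)
```

Smaller values of `step` trace the coefficient path more finely at the cost of more iterations, which is the trade-off behind the limiting connection to the lasso path mentioned above.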
Distributed Regression in Sensor Networks: Training Distributively with Alternating Projections
Wireless sensor networks (WSNs) have attracted considerable attention in
recent years and motivate a host of new challenges for distributed signal
processing. The problem of distributed or decentralized estimation has often
been considered in the context of parametric models. However, the success of
parametric methods is limited by the appropriateness of the strong statistical
assumptions made by the models. In this paper, a more flexible nonparametric
model for distributed regression is considered that is applicable in a variety
of WSN applications including field estimation. Here, starting with the
standard regularized kernel least-squares estimator, a message-passing
algorithm for distributed estimation in WSNs is derived. The algorithm can be
viewed as an instantiation of the successive orthogonal projection (SOP)
algorithm. Various practical aspects of the algorithm are discussed and several
numerical simulations validate the potential of the approach.
Comment: To appear in the Proceedings of the SPIE Conference on Advanced
Signal Processing Algorithms, Architectures and Implementations XV, San
Diego, CA, July 31 - August 4, 200
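The idea behind successive orthogonal projection can be illustrated generically: cycling through orthogonal projections onto a collection of hyperplanes converges to a point in their intersection when one exists. The sketch below is only that generic scheme, under the assumption that each constraint is a hyperplane given as a pair `(a, b)` meaning `<a, x> = b`; it is not the message-passing algorithm derived in the paper, and all names are hypothetical.

```python
def project_onto_hyperplane(x, a, b):
    """Orthogonal projection of x onto the hyperplane {z : <a, z> = b}."""
    norm_sq = sum(ai * ai for ai in a)
    gap = (b - sum(ai * xi for ai, xi in zip(a, x))) / norm_sq
    return [xi + gap * ai for ai, xi in zip(a, x)]


def successive_projections(constraints, x0, sweeps=200):
    """Cycle through the constraints, projecting onto each one in turn."""
    x = list(x0)
    for _ in range(sweeps):
        for a, b in constraints:
            x = project_onto_hyperplane(x, a, b)
    return x


# Two hypothetical local constraints whose intersection is the point (2, 1).
constraints_demo = [([1.0, 0.0], 2.0), ([1.0, 1.0], 3.0)]
x_demo = successive_projections(constraints_demo, [0.0, 0.0])
```

In a sensor-network reading of this sketch, each projection would use only information local to one node, so the iterate could be passed from node to node instead of gathering all data centrally.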
On the maximum bias functions of MM-estimates and constrained M-estimates of regression
We derive the maximum bias functions of the MM-estimates and the constrained
M-estimates or CM-estimates of regression and compare them to the maximum bias
functions of the S-estimates and the $\tau$-estimates of regression. In these
comparisons, the CM-estimates tend to exhibit the most favorable
bias-robustness properties. Also, under the Gaussian model, it is shown how one
can construct a CM-estimate which has a smaller maximum bias function than a
given S-estimate, that is, the resulting CM-estimate dominates the S-estimate
in terms of maxbias and, at the same time, is considerably more efficient.
Comment: Published at http://dx.doi.org/10.1214/009053606000000975 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
Automatic Differentiation Variational Inference
Probabilistic modeling is iterative. A scientist posits a simple model, fits
it to her data, refines it according to her analysis, and repeats. However,
fitting complex models to large data is a bottleneck in this process. Deriving
algorithms for new models can be both mathematically and computationally
challenging, which makes it difficult to efficiently cycle through the steps.
To this end, we develop automatic differentiation variational inference (ADVI).
Using our method, the scientist only provides a probabilistic model and a
dataset, nothing else. ADVI automatically derives an efficient variational
inference algorithm, freeing the scientist to refine and explore many models.
ADVI supports a broad class of models; no conjugacy assumptions are required. We
study ADVI across ten different models and apply it to a dataset with millions
of observations. ADVI is integrated into Stan, a probabilistic programming
system; it is available for immediate use.