A novel framework for parsimonious multivariate analysis
This paper proposes a framework in which a multivariate analysis (MVA) method guides a selection of input variables that leads to sparse feature extraction. This framework, called parsimonious MVA, is especially suited to high-dimensional data such as gene arrays and digital pictures. The feature selection relies on the analysis of consistency in the behaviour of the input variables across the elements of an ensemble of MVA projection matrices. The ensemble is constructed by a bootstrap that builds on an efficient, generalized MVA formulation covering PCA, CCA and OPLS. Moreover, it allows the relative relevance of each selected input variable to be estimated. Experimental results indicate that the features extracted by the parsimonious MVA have excellent discrimination power, comparing favorably with state-of-the-art methods, and are potentially useful for building interpretable features. In addition, the parsimonious feature extractor is shown to be robust to parameter selection, as well as computationally efficient. This work has been partly funded by the Spanish MINECO grants TEC2014-52289R and TEC2013-48439-C4-1-R. The authors want to thank the action editor and the reviewers for their valuable feedback.
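The bootstrap idea summarized in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes plain PCA as the MVA method, uses only the first projection vector, and scores each variable with a hypothetical stability ratio (absolute mean loading divided by its bootstrap standard deviation). The function names, the toy data and the choice of score are all assumptions made for illustration.

```python
import numpy as np


def bootstrap_loading_stability(X, n_boot=200, seed=0):
    """Score each input variable by how stable its first-PC loading is
    across bootstrap resamples (illustrative score, not the paper's)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape

    # Reference direction from the full data, used only to resolve the
    # sign ambiguity of principal components between resamples.
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    ref = vt[0]

    loadings = np.empty((n_boot, d))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)      # resample rows with replacement
        Xb = X[idx] - X[idx].mean(axis=0)
        _, _, vtb = np.linalg.svd(Xb, full_matrices=False)
        v = vtb[0]
        if v @ ref < 0:                       # align sign with the reference
            v = -v
        loadings[b] = v

    mean = loadings.mean(axis=0)
    std = loadings.std(axis=0) + 1e-12        # guard against division by zero
    return np.abs(mean) / std


def make_toy_data(n=300, d=20, n_relevant=3, seed=1):
    """Toy data (assumption): the first n_relevant columns share one
    strong latent factor, the remaining columns are pure noise."""
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=(n, 1))
    X = rng.normal(scale=0.5, size=(n, d))
    X[:, :n_relevant] += 3.0 * latent
    return X


X = make_toy_data()
scores = bootstrap_loading_stability(X)
# Keep the three variables whose loadings are most consistent.
selected = np.argsort(scores)[::-1][:3]
```

In this toy setting the variables that carry the shared factor receive much higher scores than the noise variables, so thresholding or ranking the score yields a sparse selection. A different MVA method (CCA, OPLS) or a different consistency measure would change the details.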
Restricted Covariance Priors with Applications in Spatial Statistics
We present a Bayesian model for area-level count data that uses Gaussian
random effects with a novel type of G-Wishart prior on the inverse
variance-covariance matrix. Specifically, we introduce a new distribution
called the truncated G-Wishart distribution that has support over precision
matrices that lead to positive associations between the random effects of
neighboring regions while preserving conditional independence of
non-neighboring regions. We describe Markov chain Monte Carlo sampling
algorithms for the truncated G-Wishart prior in a disease mapping context and
compare our results to Bayesian hierarchical models based on intrinsic
autoregression priors. A simulation study illustrates that using the truncated
G-Wishart prior improves over the intrinsic autoregressive priors when there
are discontinuities in the disease risk surface. The new model is applied to an
analysis of cancer incidence data in Washington State. Comment: Published at
http://dx.doi.org/10.1214/14-BA927 in Bayesian Analysis
(http://projecteuclid.org/euclid.ba) by the International Society of
Bayesian Analysis (http://bayesian.org/)
Model-based clustering via linear cluster-weighted models
A novel family of twelve mixture models with random covariates, nested in the
linear t cluster-weighted model (CWM), is introduced for model-based
clustering. The linear t CWM was recently presented as a robust alternative
to the better known linear Gaussian CWM. The proposed family of models provides
a unified framework that also includes the linear Gaussian CWM as a special
case. Maximum likelihood parameter estimation is carried out within the EM
framework, and both the BIC and the ICL are used for model selection. A simple
and effective hierarchical random initialization is also proposed for the EM
algorithm. The novel model-based clustering technique is illustrated in some
applications to real data. Finally, a simulation study for evaluating the
performance of the BIC and the ICL is presented.