Sparse modeling of categorial explanatory variables
Shrinking methods in regression analysis are usually designed for metric
predictors. In this article, however, shrinkage methods for categorial
predictors are proposed. As an application we consider data from the Munich
rent standard, where, for example, urban districts are treated as a categorial
predictor. If independent variables are categorial, some modifications to usual
shrinking procedures are necessary. Two $L_1$-penalty based methods for factor
selection and clustering of categories are presented and investigated. The
first approach is designed for nominal scale levels, the second one for ordinal
predictors. Besides applying them to the Munich rent standard, methods are
illustrated and compared in simulation studies. Comment: Published at http://dx.doi.org/10.1214/10-AOAS355 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
Multi-Resolution Functional ANOVA for Large-Scale, Many-Input Computer Experiments
The Gaussian process is a standard tool for building emulators for both
deterministic and stochastic computer experiments. However, application of
Gaussian process models is greatly limited in practice, particularly for
large-scale and many-input computer experiments that have become typical. We
propose a multi-resolution functional ANOVA model as a computationally feasible
emulation alternative. More generally, this model can be used for large-scale
and many-input non-linear regression problems. An overlapping group lasso
approach is used for estimation, ensuring computational feasibility in a
large-scale and many-input setting. New results on consistency and inference
for the (potentially overlapping) group lasso in a high-dimensional setting are
developed and applied to the proposed multi-resolution functional ANOVA model.
Importantly, these results allow us to quantify the uncertainty in our
predictions. Numerical examples demonstrate that the proposed model enjoys
marked computational advantages. Its data capabilities, in terms of both sample
size and dimension, meet or exceed those of the best available emulation tools
while meeting or exceeding their emulation accuracy.
Structure Learning of Partitioned Markov Networks
We learn the structure of a Markov Network (MN) between two groups of random
variables from joint observations. Since modelling and learning the full MN
structure may be hard, learning the links between two groups directly may be a
preferable option. We introduce a novel concept called the partitioned
ratio whose factorization directly associates with the Markovian properties of
random variables across two groups. A simple one-shot convex optimization
procedure is proposed for learning the sparse factorizations of the
partitioned ratio and it is theoretically guaranteed to recover the correct
inter-group structure under mild conditions. The performance of the proposed
method is experimentally compared with state-of-the-art MN structure
learning methods using ROC curves. Real applications to analyzing
bipartisanship in the US Congress and pairwise DNA/time-series alignments are also
reported. Comment: Camera Ready for ICML 2016. Fixed some minor typos