Selection of Ordinally Scaled Independent Variables
Ordinal categorical variables are a common case in regression modeling. Although the case of ordinal response variables has been well investigated, less work has been done concerning ordinal predictors. This article deals with the selection of ordinally scaled independent variables in the classical linear model, where the ordinal structure is taken into account by use of a difference penalty on adjacent dummy coefficients. It is shown how the Group Lasso can be used for the selection of ordinal predictors, and an alternative blockwise Boosting procedure is proposed. Emphasis is placed on the application of the presented methods to the (Comprehensive) ICF Core Set for chronic widespread pain.
The paper is a preprint of an article accepted for publication in the Journal of the Royal Statistical Society Series C (Applied Statistics). Please use the journal version for citation.
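The difference penalty on adjacent dummy coefficients described above can be sketched as follows. The function names and the particular dummy coding are illustrative assumptions, not necessarily the paper's exact formulation; the reference level is treated as having coefficient zero.

```python
import numpy as np

def ordinal_dummies(x, n_levels):
    """Dummy-code an ordinal predictor with levels 1..n_levels,
    using level 1 as the reference category (column j indicates
    x == j + 1)."""
    x = np.asarray(x)
    return (x[:, None] == np.arange(2, n_levels + 1)[None, :]).astype(float)

def difference_penalty(beta):
    """Quadratic penalty on differences of adjacent dummy coefficients,
    with the reference level's coefficient fixed at 0.  Penalizing
    adjacent differences encourages smooth effects across the ordered
    categories."""
    d = np.diff(np.concatenate([[0.0], beta]))
    return float(d @ d)
```

Under a penalty like this, neighboring categories with similar effects are shrunk toward each other, which is what lets a group penalty (as in the Group Lasso) select or drop a whole ordinal predictor at once.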
Similarity-based and Iterative Label Noise Filters for Monotonic Classification
Monotonic ordinal classification has received increasing interest in recent years. Building monotone models for these problems usually requires datasets that satisfy monotonic relationships among the samples. When the monotonic relationships are not met, changing the labels may be a viable option, but the risk is high: wrong label changes could completely alter the information contained in the data. In this work, we tackle the construction of monotone datasets by removing the wrong or noisy examples that violate monotonicity restrictions. We propose two monotonic noise filtering algorithms to preprocess the ordinal datasets and improve the monotonic relations between instances. The experiments are carried out over eleven ordinal datasets, showing that the application of the proposed filters improves the prediction capabilities over different levels of noise.
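The monotonicity restriction above can be made concrete with a small sketch. The greedy removal scheme below is an assumption for illustration only; the paper proposes similarity-based and iterative filters, of which this shows just the shared core idea of counting and removing violating instances.

```python
import numpy as np

def monotonicity_violations(X, y):
    """For each instance, count the pairs it forms that violate
    monotonicity: x_i dominated componentwise by x_j but y_i > y_j."""
    X, y = np.asarray(X, float), np.asarray(y)
    n = len(y)
    counts = np.zeros(n, dtype=int)
    for i in range(n):
        for j in range(n):
            if i != j and np.all(X[i] <= X[j]) and y[i] > y[j]:
                counts[i] += 1
                counts[j] += 1
    return counts

def greedy_noise_filter(X, y, max_rounds=100):
    """Iteratively drop the instance involved in the most violations
    until the remaining dataset is monotone; returns the kept indices.
    (A simple greedy sketch, not the paper's exact algorithms.)"""
    X = np.asarray(X, float)
    keep = np.arange(len(y))
    for _ in range(max_rounds):
        c = monotonicity_violations(X[keep], np.asarray(y)[keep])
        if c.max() == 0:
            break
        keep = np.delete(keep, int(np.argmax(c)))
    return keep
```

For example, with inputs [1], [2], [3] and labels 1, 3, 2, the middle and last instances form a violating pair, and dropping one of them restores monotonicity.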
Ordinal Regression by Extended Binary Classification
We present a reduction framework from ordinal regression to binary classification based on extended examples. The framework consists of three steps: extracting extended examples from the original examples, learning a binary classifier on the extended examples with any binary classification algorithm, and constructing a ranking rule from the binary classifier. A weighted 0/1 loss of the binary classifier then bounds the mislabeling cost of the ranking rule. Our framework allows us not only to design good ordinal regression algorithms based on well-tuned binary classification approaches, but also to derive new generalization bounds for ordinal regression from known bounds for binary classification. In addition, our framework unifies many existing ordinal regression algorithms, such as perceptron ranking and support vector ordinal regression. When compared empirically on benchmark data sets, some of our newly designed algorithms enjoy advantages in terms of both training speed and generalization performance over existing algorithms, which demonstrates the usefulness of our framework.
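The three steps of the reduction can be sketched as follows. The helper names are hypothetical, and the cost-based example weights are omitted for brevity: each example with label y in 1..K yields K-1 extended examples asking "is y > k?", and the ranking rule counts the thresholds the learned binary classifier says are exceeded.

```python
import numpy as np

def extend_examples(X, y, n_classes):
    """Step 1: turn each (x, y) into K-1 extended examples ((x, k), z),
    where z = +1 if y > k else -1, for thresholds k = 1..K-1.  The
    threshold index is appended to x as an extra feature."""
    X = np.asarray(X, float)
    rows, thresholds, z = [], [], []
    for xi, yi in zip(X, y):
        for k in range(1, n_classes):
            rows.append(xi)
            thresholds.append(float(k))
            z.append(1 if yi > k else -1)
    X_ext = np.column_stack([np.array(rows), np.array(thresholds)])
    return X_ext, np.array(z)

def rank_from_binary(predict, x, n_classes):
    """Step 3: ranking rule r(x) = 1 + number of thresholds the binary
    classifier (any algorithm trained on the extended examples in
    step 2) predicts are exceeded."""
    votes = sum(predict(np.append(x, float(k))) > 0
                for k in range(1, n_classes))
    return 1 + votes
```

Any off-the-shelf binary learner can be trained on `X_ext`, `z` in between, which is precisely the point of the reduction.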
Sparsity with sign-coherent groups of variables via the cooperative-Lasso
We consider the problems of estimation and selection of parameters endowed with a known group structure, when the groups are assumed to be sign-coherent, that is, gathering either nonnegative, nonpositive or null parameters. To tackle this problem, we propose the cooperative-Lasso penalty. We derive the optimality conditions defining the cooperative-Lasso estimate for generalized linear models, and propose an efficient active set algorithm suited to high-dimensional problems. We study the asymptotic consistency of the estimator in the linear regression setup and derive its irrepresentable conditions, which are milder than those of the group-Lasso regarding the matching of groups with the sparsity pattern of the true parameters. We also address the problem of model selection in linear regression by deriving an approximation of the degrees of freedom of the cooperative-Lasso estimator. Simulations comparing the proposed estimator to the group and sparse group-Lasso comply with our theoretical results, showing consistent improvements in support recovery for sign-coherent groups. We finally propose two examples illustrating the wide applicability of the cooperative-Lasso: first to the processing of ordinal variables, where the penalty acts as a monotonicity prior; second to the processing of genomic data, where the set of differentially expressed probes is enriched by incorporating all the probes of the microarray that are related to the corresponding genes.
Comment: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/11-AOAS520
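The sign-coherence notion and the penalty built on it can be sketched as follows; the penalty shown, a per-group norm of the positive part plus a per-group norm of the negative part, is a minimal reading of the cooperative-Lasso idea, and the function names are illustrative.

```python
import numpy as np

def is_sign_coherent(beta_group, tol=1e-10):
    """A group is sign-coherent when its nonzero coefficients are all
    nonnegative or all nonpositive (null parameters are allowed)."""
    b = np.asarray(beta_group, float)
    return not (np.any(b > tol) and np.any(b < -tol))

def coop_lasso_penalty(beta, groups):
    """Sketch of a cooperative-Lasso-style penalty: for each group, the
    Euclidean norm of the positive part plus the Euclidean norm of the
    negative part.  A wholly sign-coherent group is penalized like a
    single group-Lasso term, while mixed signs within a group incur a
    strictly larger penalty."""
    beta = np.asarray(beta, float)
    pen = 0.0
    for g in groups:
        b = beta[list(g)]
        pen += np.linalg.norm(np.maximum(b, 0.0))
        pen += np.linalg.norm(np.maximum(-b, 0.0))
    return pen
```

This extra cost on mixed-sign groups is what makes the penalty act as a monotonicity prior when the group collects the dummy coefficients of one ordinal variable.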