A Feature Selection Method for Multivariate Performance Measures
Feature selection with specific multivariate performance measures is the key
to the success of many applications, such as image retrieval and text
classification. The existing feature selection methods are usually designed for
classification error. In this paper, we propose a generalized sparse
regularizer. Based on the proposed regularizer, we present a unified feature
selection framework for general loss functions. In particular, we study the
novel feature selection paradigm by optimizing multivariate performance
measures. The resultant formulation is a challenging problem for
high-dimensional data. Hence, a two-layer cutting plane algorithm is proposed
to solve this problem, and a convergence analysis is presented. In addition, we adapt
the proposed method to optimize multivariate measures for multiple instance
learning problems. Analytical comparisons with state-of-the-art feature
selection methods show that the proposed method is superior to the others.
Extensive experiments on large-scale and high-dimensional real world datasets
show that the proposed method outperforms l1-SVM and SVM-RFE when choosing a
small subset of features, and achieves significantly improved performance over
SVM-perf in terms of F1-score.
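To illustrate why a multivariate performance measure differs from classification error, here is a minimal sketch in Python, assuming the measure of interest is the F1-score. The function name `f1_score` and the toy labels are illustrative assumptions and are not taken from the paper. The F1-score depends on counts over the whole prediction vector, so it cannot be written as a sum of per-example losses the way accuracy can.

```python
def f1_score(y_true, y_pred):
    """F1-score for binary labels in {0, 1}; returns 0.0 when undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy, imbalanced example (hypothetical data): 2 positives, 6 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
f1 = f1_score(y_true, y_pred)
print(accuracy, f1)
```

On imbalanced data such as this, accuracy and F1 can diverge noticeably, which is one motivation for selecting features against the target measure directly.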
k-means clustering of extremes
The k-means clustering algorithm and its variant, the spherical k-means
clustering, are among the most important and popular methods in unsupervised
learning and pattern detection. In this paper, we explore how the spherical
k-means algorithm can be applied in the analysis of only the extremal
observations from a data set. By making use of multivariate extreme value
analysis we show how it can be adopted to find "prototypes" of extremal
dependence and we derive a consistency result for our suggested estimator. In
the special case of max-linear models we show furthermore that our procedure
provides an alternative way of statistical inference for this class of models.
Finally, we provide data examples which show that our method is able to find
relevant patterns in extremal observations and allows us to classify extremal
events.
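The general idea of clustering only extremal observations on the unit sphere can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' estimator: extremes are taken to be the observations with the largest Euclidean norm, the number kept (`n_keep`) is chosen by hand, and the helper names (`normalize`, `extremal_angles`, `spherical_kmeans`) and the toy data are hypothetical.

```python
import math
import random


def normalize(v):
    """Project a nonzero vector onto the unit sphere."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def extremal_angles(data, n_keep):
    """Keep the n_keep observations with the largest norm, as unit vectors."""
    ranked = sorted(data, key=lambda v: sum(x * x for x in v), reverse=True)
    return [normalize(v) for v in ranked[:n_keep]]


def spherical_kmeans(points, k, n_iter=50, seed=0):
    """Spherical k-means: assign by cosine similarity, renormalize the means."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest center in terms of inner product.
        labels = [
            max(range(k), key=lambda j: sum(a * b for a, b in zip(p, centers[j])))
            for p in points
        ]
        # Update step: mean of the members, projected back onto the sphere.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = [sum(c) / len(members) for c in zip(*members)]
                if any(mean):
                    centers[j] = normalize(mean)
    return centers, labels


# Toy data: large observations lie near one of the two coordinate axes.
data = [(10.0, 0.1), (12.0, 0.2), (0.1, 9.0), (0.3, 11.0), (0.5, 0.5), (0.2, 0.1)]
points = extremal_angles(data, n_keep=4)
centers, labels = spherical_kmeans(points, k=2)
print(centers, labels)
```

In this toy example the two resulting centers lie close to the coordinate axes, which is the kind of "prototype" direction the abstract refers to. How many observations to treat as extreme is a tuning choice that this sketch leaves to the user.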