Multiclass latent locally linear support vector machines
Kernelized Support Vector Machines (SVMs) have gained the status of off-the-shelf classifiers, able to deliver state-of-the-art performance on almost any problem. Still, their practical use is constrained by their computational and memory complexity, which grows super-linearly with the number of training samples. In order to retain the low training and testing complexity of linear classifiers and the flexibility of non-linear ones, a growing, promising alternative is represented by methods that learn non-linear classifiers through local combinations of linear ones. In this paper we propose a new multiclass local classifier, based on a latent SVM formulation. The proposed classifier makes use of a set of linear models that are linearly combined using sample- and class-specific weights. Thanks to the latent formulation, the combination coefficients are modeled as latent variables. We allow soft combinations and we provide a closed-form solution for their estimation, resulting in an efficient prediction rule. This novel formulation allows us to learn, in a principled way, the sample-specific weights and the linear classifiers in a single optimization problem, using a CCCP optimization procedure. Extensive experiments on ten standard UCI machine learning datasets, one large binary dataset, three character and digit recognition databases, and a visual place categorization dataset show the power of the proposed approach.
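The idea of scoring a sample with a soft, sample-specific combination of linear models can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the combination coefficients are non-negative and bounded in p-norm (the abstract does not state the constraint set), in which case the best coefficients and the resulting score have a closed form by Hölder duality. The names `class_scores`, `best_coefficients`, `W`, and `p` are hypothetical.

```python
import numpy as np

def class_scores(W, x, p=2.0):
    """Score each class with the best soft combination of its linear models.

    W has shape (classes, models, features). Assumed constraint (not from the
    abstract): coefficients beta >= 0 with ||beta||_p <= 1. Under it,
    max_beta beta . (W_c x) equals the q-norm of the positive part of W_c x,
    where 1/p + 1/q = 1.
    """
    q = p / (p - 1.0)
    v = np.maximum(W @ x, 0.0)              # positive part, shape (classes, models)
    return (v ** q).sum(axis=1) ** (1.0 / q)

def best_coefficients(W_c, x, p=2.0):
    """Closed-form maximizer beta for one class under the same assumption."""
    q = p / (p - 1.0)
    v = np.maximum(W_c @ x, 0.0)
    norm = (v ** q).sum() ** (1.0 / q)
    if norm == 0.0:
        return np.zeros_like(v)             # no model responds positively
    return (v / norm) ** (q - 1.0)

# Two classes, two linear models each, two features (toy numbers).
W = np.array([[[3.0, 0.0], [4.0, 0.0]],
              [[-1.0, 0.0], [2.0, 0.0]]])
x = np.array([1.0, 0.0])
scores = class_scores(W, x)
predicted = int(np.argmax(scores))
beta = best_coefficients(W[0], x)
```

Prediction needs only one matrix-vector product per class, which is what keeps the test-time cost close to that of a linear classifier.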
Semi-supervised Eigenvectors for Large-scale Locally-biased Learning
In many applications, one has side information, e.g., labels that are
provided in a semi-supervised manner, about a specific target region of a large
data set, and one wants to perform machine learning and data analysis tasks
"nearby" that prespecified target region. For example, one might be interested
in the clustering structure of a data graph near a prespecified "seed set" of
nodes, or one might be interested in finding partitions in an image that are
near a prespecified "ground truth" set of pixels. Locally-biased problems of
this sort are particularly challenging for popular eigenvector-based machine
learning and data analysis tools. At root, the reason is that eigenvectors are
inherently global quantities, thus limiting the applicability of
eigenvector-based methods in situations where one is interested in very local
properties of the data.
In this paper, we address this issue by providing a methodology to construct
semi-supervised eigenvectors of a graph Laplacian, and we illustrate how these
locally-biased eigenvectors can be used to perform locally-biased machine
learning. These semi-supervised eigenvectors capture
successively-orthogonalized directions of maximum variance, conditioned on
being well-correlated with an input seed set of nodes that is assumed to be
provided in a semi-supervised manner. We show that these semi-supervised
eigenvectors can be computed quickly as the solution to a system of linear
equations; and we also describe several variants of our basic method that have
improved scaling properties. We provide several empirical examples
demonstrating how these semi-supervised eigenvectors can be used to perform
locally-biased learning; and we discuss the relationship between our results
and recent machine learning algorithms that use global eigenvectors of the
graph Laplacian.
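A locally-biased vector obtained from a linear solve can be sketched on a toy graph. The system used here, (L − γD)x = Ds with γ < 0, is an assumption taken from the general locally-biased spectral setting; neither that form nor the parameter name `gamma` appears in the abstract, and only a single vector is computed, without the successive orthogonalization the abstract describes. The function name `locally_biased_vector` is hypothetical.

```python
import numpy as np

def locally_biased_vector(A, seed, gamma=-0.5):
    """Solve (L - gamma * D) x = D s for a seed indicator s (assumed form).

    A is a symmetric adjacency matrix and seed is a list of node indices.
    With gamma < 0 the system matrix is positive definite, so a plain
    linear solve suffices.
    """
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)                       # node degrees
    D = np.diag(d)
    L = D - A                               # combinatorial graph Laplacian
    s = np.zeros(len(d))
    s[list(seed)] = 1.0
    s -= (d @ s) / d.sum()                  # make s D-orthogonal to the all-ones vector
    x = np.linalg.solve(L - gamma * D, D @ s)
    return x / np.sqrt(x @ D @ x)           # normalize so that x^T D x = 1

# Path graph on 6 nodes with the seed set {0}.
A = np.diag(np.ones(5), 1) + np.diag(np.ones(5), -1)
x = locally_biased_vector(A, seed=[0])
degrees = A.sum(axis=1)
```

In this toy setting the largest entry of `x` sits on the seed node, and `x` stays D-orthogonal to the constant vector, as an eigenvector-like direction should.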
CAFÉ-Map : context aware feature mapping for mining high dimensional biomedical data
Feature selection and ranking is of great importance in the analysis of biomedical data. In addition to reducing the number of features used in classification or other machine learning tasks, it allows us to extract meaningful biological and medical information from a machine learning model. Most existing approaches in this domain do not directly model the fact that the relative importance of features can be different in different regions of the feature space. In this work, we present a context-aware feature ranking algorithm called CAFÉ-Map. CAFÉ-Map is a locally linear feature ranking framework that allows recognition of important features in any given region of the feature space or for any individual example. This allows for simultaneous classification and feature ranking in an interpretable manner. We have benchmarked CAFÉ-Map on a number of toy and real-world biomedical data sets. Our comparative study with a number of published methods shows that CAFÉ-Map achieves better accuracies on these data sets. The top-ranking features obtained through CAFÉ-Map in a gene profiling study correlate very well with the importance of different genes reported in the literature. Furthermore, CAFÉ-Map provides a more in-depth analysis of feature ranking at the level of individual examples.
Adaptive Locality Preserving Regression
This paper proposes a novel discriminative regression method, called adaptive
locality preserving regression (ALPR), for classification. In particular, ALPR
aims to learn a more flexible and discriminative projection that not only
preserves the intrinsic structure of data, but also possesses the properties of
feature selection and interpretability. To this end, we introduce a target
learning technique to adaptively learn a more discriminative and flexible
target matrix rather than the pre-defined strict zero-one label matrix for
regression. Then a locality preserving constraint regularized by the adaptive
learned weights is further introduced to guide the projection learning, which
is beneficial to learn a more discriminative projection and avoid overfitting.
Moreover, we replace the conventional `Frobenius norm' with the special l21
norm to constrain the projection, which enables the method to adaptively select
the most important features from the original high-dimensional data for feature
extraction. In this way, the negative influence of the redundant features and
noise residing in the original data can be greatly reduced. Besides, the
proposed method has good interpretability for features owing to the
row-sparsity property of the l21 norm. Extensive experiments conducted on the
synthetic database with manifold structure and many real-world databases prove
the effectiveness of the proposed method.
Comment: The paper has been accepted by IEEE Transactions on Circuits and
Systems for Video Technology (TCSVT), and the code is available at
https://drive.google.com/file/d/1iNzONkRByIaUhXwdEhOkkh_0d2AAXNE8/vie
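The row-sparsity property of the l21 norm mentioned in the ALPR abstract can be illustrated with a small comparison against the Frobenius norm. This is a sketch of the standard definitions only, not of the ALPR objective; the function names and the toy matrices are hypothetical. Each row of the projection is taken to correspond to one input feature.

```python
import numpy as np

def l21_norm(W):
    """Sum of the Euclidean norms of the rows of W."""
    return float(np.sqrt((W ** 2).sum(axis=1)).sum())

def frobenius_norm(W):
    """Square root of the sum of all squared entries of W."""
    return float(np.sqrt((W ** 2).sum()))

# Same entries, arranged differently: one active row versus two active rows.
row_sparse = np.array([[3.0, 4.0],
                       [0.0, 0.0]])
spread = np.array([[3.0, 0.0],
                   [0.0, 4.0]])

fro_sparse, fro_spread = frobenius_norm(row_sparse), frobenius_norm(spread)
l21_sparse, l21_spread = l21_norm(row_sparse), l21_norm(spread)
```

The Frobenius norm cannot tell the two matrices apart, while the l21 norm is smaller for the matrix whose weight is concentrated in a single row. Penalizing it therefore favors projections that zero out whole rows, that is, whole features.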