
    Efficient Optimization of Performance Measures by Classifier Adaptation

    In practical applications, machine learning algorithms are often needed to learn classifiers that optimize domain-specific performance measures. Previous research has focused on learning the needed classifier in isolation, yet learning a nonlinear classifier for nonlinear and nonsmooth performance measures remains hard. In this paper, rather than learning the needed classifier by directly optimizing the specific performance measure, we circumvent this problem with a novel two-step approach called CAPO: first train nonlinear auxiliary classifiers with existing learning methods, and then adapt the auxiliary classifiers for the specific performance measure. In the first step, the auxiliary classifiers can be obtained efficiently with off-the-shelf learning algorithms. For the second step, we show that the classifier adaptation problem can be reduced to a quadratic programming problem similar to linear SVMperf, which can be solved efficiently. By exploiting nonlinear auxiliary classifiers, CAPO can generate a nonlinear classifier that optimizes a large variety of performance measures, including all performance measures based on the contingency table as well as AUC, while keeping high computational efficiency. Empirical studies show that CAPO is effective and computationally efficient, and it is even more efficient than linear SVMperf.
    Comment: 30 pages, 5 figures, to appear in IEEE Transactions on Pattern Analysis and Machine Intelligence, 201
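    The two-step idea is easy to sketch. Below is a minimal, hypothetical illustration: step 1 trains two off-the-shelf nonlinear auxiliary classifiers, and step 2 "adapts" their scores to a target measure (F1 here). The paper's adaptation step is a quadratic program akin to linear SVMperf; the coarse grid search below is only a stand-in, and all dataset and parameter choices are illustrative.

```python
# Hedged sketch of the two-step CAPO idea (not the authors' exact QP).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=600, weights=[0.8], random_state=0)
X_tr, X_ad, y_tr, y_ad = train_test_split(X, y, test_size=0.5, random_state=0)

# Step 1: nonlinear auxiliary classifiers trained with existing methods.
aux = [RandomForestClassifier(random_state=0).fit(X_tr, y_tr),
       SVC(probability=True, random_state=0).fit(X_tr, y_tr)]
scores = np.column_stack([c.predict_proba(X_ad)[:, 1] for c in aux])

# Step 2: adapt the auxiliary scores to the target measure. CAPO solves
# a QP similar to linear SVMperf; this stand-in does a coarse grid search
# over a convex combination of the scores plus a decision threshold.
best = (-1.0, None)
for w in np.linspace(0, 1, 21):          # weight on the first classifier
    s = w * scores[:, 0] + (1 - w) * scores[:, 1]
    for t in np.linspace(0.1, 0.9, 17):  # decision threshold
        f1 = f1_score(y_ad, (s >= t).astype(int))
        best = max(best, (f1, (w, t)))
print("adapted F1 = %.3f with (weight, threshold) = %s" % best)
```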

    A Feature Selection Method for Multivariate Performance Measures

    Feature selection with specific multivariate performance measures is key to the success of many applications, such as image retrieval and text classification. Existing feature selection methods are usually designed for classification error. In this paper, we propose a generalized sparse regularizer. Based on the proposed regularizer, we present a unified feature selection framework for general loss functions. In particular, we study a novel feature selection paradigm that optimizes multivariate performance measures. The resulting formulation is a challenging problem for high-dimensional data, so a two-layer cutting-plane algorithm is proposed to solve it, and its convergence is presented. In addition, we adapt the proposed method to optimize multivariate measures for multiple-instance learning problems. Analyses against state-of-the-art feature selection methods show that the proposed method is superior to the others. Extensive experiments on large-scale, high-dimensional real-world datasets show that the proposed method outperforms l1-SVM and SVM-RFE when choosing a small subset of features, and achieves significantly improved performance over SVMperf in terms of F1-score.
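    As a rough illustration of the setting (not the paper's two-layer cutting-plane method), the sketch below selects a small feature subset with a plain l1-penalized linear SVM, a baseline the paper compares against, and then judges the subset by F1-score; the paper's contribution is to optimize such multivariate measures directly during selection. All data and parameters here are made up.

```python
# Hedged baseline for the feature-selection setting the paper targets:
# pick a small feature subset, then evaluate it by a multivariate measure.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=800, n_features=200, n_informative=10,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# l1 penalty drives most coefficients to zero; keep the 10 largest.
selector = LinearSVC(penalty="l1", dual=False, C=0.1).fit(X_tr, y_tr)
top = np.argsort(-np.abs(selector.coef_.ravel()))[:10]

clf = LinearSVC(dual=False).fit(X_tr[:, top], y_tr)
print("F1 with 10 selected features: %.3f"
      % f1_score(y_te, clf.predict(X_te[:, top])))
```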

    Risk Assessment for Venous Thromboembolism in Chemotherapy-Treated Ambulatory Cancer Patients: A Machine Learning Approach

    OBJECTIVE: To design a precision medicine approach aimed at exploiting significant patterns in data in order to produce venous thromboembolism (VTE) risk predictors for cancer outpatients that may offer an advantage over the currently recommended model (the Khorana score). DESIGN: Multiple kernel learning (MKL) based on support vector machines and random optimization (RO) models were used to produce VTE risk predictors (referred to as machine learning [ML]-RO) yielding the best classification performance over a training (3-fold cross-validation) and testing set. RESULTS: Attributes of the patient data set (n = 1179) were clustered into 9 groups according to clinical significance. Our analysis produced 6 ML-RO models in the training set, which yielded better likelihood ratios (LRs) than baseline models. Of interest, the most significant LRs were observed in 2 ML-RO approaches not including the Khorana score (ML-RO-2: positive likelihood ratio [+LR] = 1.68, negative likelihood ratio [-LR] = 0.24; ML-RO-3: +LR = 1.64, -LR = 0.37). The enhanced performance of the ML-RO approaches over the Khorana score was further confirmed by analysis of the areas under the Precision-Recall curve (AUCPR), which were higher for the ML-RO approaches (best performances: ML-RO-2: AUCPR = 0.212; ML-RO-3-K: AUCPR = 0.146) than for the Khorana score (AUCPR = 0.096). Notably, the best-fitting model was ML-RO-2, in which blood lipids and body mass index/performance status retained the strongest weights, with a weaker association with tumor site/stage and drugs. CONCLUSIONS: Although the monocentric validation of the presented predictors may represent a limitation, these results demonstrate that a model based on MKL and RO can represent a novel methodological approach to deriving VTE risk classifiers. Moreover, this study highlights the advantages of optimizing the relative importance of groups of clinical attributes in the selection of VTE risk predictors.
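    A hedged sketch of the ML-RO recipe as described: per-group kernels are combined with weights found by random optimization, and each candidate weighting is scored by the positive likelihood ratio of an SVM on held-out data. The groups, kernels, and objective below are illustrative stand-ins, not the study's exact pipeline or data.

```python
# Hedged sketch of MKL + random optimization scored by likelihood ratio.
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 9))           # 9 attribute "groups" (one column each)
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=300) > 0).astype(int)
tr, va = np.arange(200), np.arange(200, 300)

def combined_kernel(w, A, B):
    # weighted sum of per-group RBF kernels
    return sum(wi * rbf_kernel(A[:, [i]], B[:, [i]]) for i, wi in enumerate(w))

def pos_lr(w):
    K_tr = combined_kernel(w, X[tr], X[tr])
    K_va = combined_kernel(w, X[va], X[tr])
    svm = SVC(kernel="precomputed").fit(K_tr, y[tr])
    tn, fp, fn, tp = confusion_matrix(y[va], svm.predict(K_va),
                                      labels=[0, 1]).ravel()
    sens, spec = tp / (tp + fn), tn / (tn + fp)
    return sens / (1 - spec + 1e-9)     # +LR = sensitivity / (1 - specificity)

best_w, best_lr = None, -np.inf
for _ in range(50):                     # random optimization over kernel weights
    w = rng.dirichlet(np.ones(9))
    lr = pos_lr(w)
    if lr > best_lr:
        best_w, best_lr = w, lr
print("best validation +LR: %.2f" % best_lr)
```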

    Learning Discriminative Stein Kernel for SPD Matrices and Its Applications

    The Stein kernel has recently shown promising performance on classifying images represented by symmetric positive definite (SPD) matrices. It evaluates the similarity between two SPD matrices through their eigenvalues. In this paper, we argue that directly using the original eigenvalues may be problematic because: i) eigenvalue estimation becomes biased when the number of samples is inadequate, which may lead to unreliable kernel evaluation; ii) more importantly, eigenvalues only reflect the property of an individual SPD matrix and are not necessarily optimal for computing the Stein kernel when the goal is to discriminate different sets of SPD matrices. To address both issues at once, we propose a discriminative Stein kernel, in which an extra parameter vector is defined to adjust the eigenvalues of the input SPD matrices. The optimal parameter values are sought by optimizing a proxy of classification performance. To show the generality of the proposed method, three different kernel learning criteria commonly used in the literature are each employed as the proxy. A comprehensive experimental study is conducted on a variety of image classification tasks to compare the proposed discriminative Stein kernel with the original Stein kernel and other commonly used methods for evaluating the similarity between SPD matrices. The experimental results demonstrate that the discriminative Stein kernel can attain greater discrimination and better align with classification tasks by altering the eigenvalues, producing higher classification performance than the original Stein kernel and the other methods.
    Comment: 13 pages
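    The Stein kernel itself is compact: for SPD matrices X and Y, the Stein divergence is S(X, Y) = log det((X + Y)/2) - (1/2) log det(X) - (1/2) log det(Y), and the kernel is exp(-theta * S). The sketch below computes it and applies one plausible form of the proposed eigenvalue adjustment (a per-eigenvalue power, assumed here as the parameterization); the paper learns such parameters by optimizing a classification-performance proxy, which is not reproduced here.

```python
# Hedged sketch of a Stein kernel with adjustable eigenvalues.
import numpy as np

def adjust(X, a):
    # re-assemble X with each eigenvalue raised to its own power a_i
    lam, U = np.linalg.eigh(X)
    return (U * lam ** a) @ U.T

def stein_kernel(X, Y, theta=1.0):
    # k(X, Y) = exp(-theta * S(X, Y)), with S the Stein divergence
    s = (np.linalg.slogdet((X + Y) / 2)[1]
         - 0.5 * np.linalg.slogdet(X)[1]
         - 0.5 * np.linalg.slogdet(Y)[1])
    return np.exp(-theta * s)

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5)); X = A @ A.T + np.eye(5)   # random SPD matrices
B = rng.normal(size=(5, 5)); Y = B @ B.T + np.eye(5)

a = np.full(5, 0.8)            # example parameter vector (learned in the paper)
print("original kernel:", stein_kernel(X, Y))
print("adjusted kernel:", stein_kernel(adjust(X, a), adjust(Y, a)))
```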

    Efficient Regularized Least-Squares Algorithms for Conditional Ranking on Relational Data

    In domains like bioinformatics, information retrieval, and social network analysis, one finds learning tasks where the goal is to infer a ranking of objects conditioned on a particular target object. We present a general kernel framework for learning conditional rankings from various types of relational data, where rankings can be conditioned on unseen data objects. We propose efficient algorithms for conditional ranking that optimize squared regression and ranking loss functions. We show theoretically that learning with the ranking loss is likely to generalize better than with the regression loss. Further, we prove that symmetry or reciprocity properties of relations can be efficiently enforced in the learned models. Experiments on synthetic and real-world data illustrate that the proposed methods deliver state-of-the-art performance in terms of predictive power and computational efficiency. Moreover, we show empirically that incorporating symmetry or reciprocity properties can improve generalization performance.
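    A minimal sketch of the conditional-ranking setup: a Kronecker product of a conditioning-object kernel and a target-object kernel scores object pairs, regularized least squares fits the observed relation values in closed form, and objects can then be ranked for an unseen conditioning object. This sketch uses the plain regression loss; the paper also optimizes a pairwise ranking loss and argues it generalizes better. Data and the regularization constant are illustrative.

```python
# Hedged sketch of Kronecker-kernel regularized least squares for
# conditional ranking on relational data.
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(0)
C = rng.normal(size=(8, 3))     # conditioning objects (e.g. query nodes)
O = rng.normal(size=(10, 3))    # objects to be ranked
Y = C @ O.T + 0.1 * rng.normal(size=(8, 10))   # observed relation values

Kc, Ko = rbf_kernel(C), rbf_kernel(O)
K = np.kron(Kc, Ko)             # pairwise kernel over (condition, object) pairs
# closed-form RLS: (K + lambda * I) alpha = y
alpha = np.linalg.solve(K + 1.0 * np.eye(K.shape[0]), Y.ravel())

# rank all objects for an unseen conditioning object
c_new = rng.normal(size=(1, 3))
scores = np.kron(rbf_kernel(c_new, C), Ko) @ alpha
print("ranking of objects for the unseen condition:", np.argsort(-scores))
```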