6,234 research outputs found

    Training linear ranking SVMs in linearithmic time using red-black trees

    We introduce an efficient method for training the linear ranking support vector machine. The method combines cutting plane optimization with a red-black tree based approach to subgradient calculations, and has O(m*s+m*log(m)) time complexity, where m is the number of training examples and s is the average number of non-zero features per example. The best previously known training algorithms achieve the same efficiency only for restricted special cases, whereas the proposed approach allows any real-valued utility scores in the training data. Experiments demonstrate the superior scalability of the proposed approach compared to the fastest existing RankSVM implementations. Comment: 20 pages, 4 figures
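The linearithmic term in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it swaps the red-black tree for a Fenwick (binary indexed) tree over the ranks of the utility scores, assumes a margin of 1, and computes only the pairwise hinge loss value, not the subgradient. The function name `pairwise_hinge_loss` and its arguments are hypothetical.

```python
from bisect import bisect_left


def pairwise_hinge_loss(preds, utils):
    """Sum of max(0, 1 - (preds[i] - preds[j])) over pairs with utils[i] > utils[j].

    Runs in O(m log m) using a Fenwick tree indexed by utility rank
    (an assumption standing in for the red-black tree of the abstract).
    """
    m = len(preds)
    levels = sorted(set(utils))
    # 1-based rank of each utility score among the distinct values.
    rank = [bisect_left(levels, u) + 1 for u in utils]
    k = len(levels)
    cnt = [0] * (k + 1)    # Fenwick tree of counts
    tot = [0.0] * (k + 1)  # Fenwick tree of prediction sums

    def add(r, p):
        while r <= k:
            cnt[r] += 1
            tot[r] += p
            r += r & -r

    def prefix(r):
        c, s = 0, 0.0
        while r > 0:
            c += cnt[r]
            s += tot[r]
            r -= r & -r
        return c, s

    # Sweep examples from the highest prediction to the lowest.
    order = sorted(range(m), key=lambda i: -preds[i])
    loss = 0.0
    nxt = 0
    for i in order:
        # Insert every j whose prediction violates the margin against i.
        while nxt < m and preds[order[nxt]] > preds[i] - 1.0:
            j = order[nxt]
            add(rank[j], preds[j])
            nxt += 1
        # Among inserted j, keep only those with strictly lower utility.
        c, s = prefix(rank[i] - 1)
        loss += c * (1.0 - preds[i]) + s
    return loss
```

Because the utility scores are only used through their ranks, any real-valued scores work, which mirrors the generality the abstract claims. The counts returned by `prefix` are also the kind of quantity a subgradient computation would need, though that step is not shown here.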

    A Feature Selection Method for Multivariate Performance Measures

    Feature selection with specific multivariate performance measures is the key to the success of many applications, such as image retrieval and text classification. The existing feature selection methods are usually designed for classification error. In this paper, we propose a generalized sparse regularizer. Based on the proposed regularizer, we present a unified feature selection framework for general loss functions. In particular, we study the novel feature selection paradigm by optimizing multivariate performance measures. The resultant formulation is a challenging problem for high-dimensional data. Hence, a two-layer cutting plane algorithm is proposed to solve this problem, and the convergence is presented. In addition, we adapt the proposed method to optimize multivariate measures for multiple instance learning problems. The analyses by comparing with the state-of-the-art feature selection methods show that the proposed method is superior to others. Extensive experiments on large-scale and high-dimensional real world datasets show that the proposed method outperforms $l_1$-SVM and SVM-RFE when choosing a small subset of features, and achieves significantly improved performances over SVM$^{perf}$ in terms of $F_1$-score.

    The Lovász Hinge: A Novel Convex Surrogate for Submodular Losses

    Learning with non-modular losses is an important problem when sets of predictions are made simultaneously. The main tools for constructing convex surrogate loss functions for set prediction are margin rescaling and slack rescaling. In this work, we show that these strategies lead to tight convex surrogates iff the underlying loss function is increasing in the number of incorrect predictions. However, gradient or cutting-plane computation for these functions is NP-hard for non-supermodular loss functions. We propose instead a novel surrogate loss function for submodular losses, the Lovász hinge, which leads to O(p log p) complexity with O(p) oracle accesses to the loss function to compute a gradient or cutting-plane. We prove that the Lovász hinge is convex and yields an extension. As a result, we have developed the first tractable convex surrogates in the literature for submodular losses. We demonstrate the utility of this novel convex surrogate through several set prediction tasks, including on the PASCAL VOC and Microsoft COCO datasets.
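The O(p log p) cost with O(p) oracle accesses quoted in the abstract above matches the standard greedy evaluation of the Lovász extension of a set function: one sort followed by one oracle call per element. The sketch below shows that evaluation only. It is an illustration under assumptions, not the paper's exact hinge definition: the name `lovasz_extension`, the margin vector `s`, and the oracle `loss` are hypothetical, and any thresholding of the margins that the paper applies is omitted.

```python
def lovasz_extension(s, loss):
    """Greedy evaluation of the Lovász extension of `loss` at the point `s`.

    s    : sequence of p real values (e.g. per-prediction margins; an assumption).
    loss : oracle mapping a set of indices to a real number. It is assumed to be
           submodular and must not mutate the set it receives.
    Returns (value, grad), where grad[j] is the marginal gain assigned to
    coordinate j, i.e. a (sub)gradient of the extension with respect to s.
    """
    p = len(s)
    # One O(p log p) sort: visit coordinates from largest to smallest value.
    order = sorted(range(p), key=lambda j: -s[j])
    value = 0.0
    grad = [0.0] * p
    chosen = set()
    prev = loss(chosen)        # oracle call on the empty set
    for j in order:            # p further oracle calls
        chosen.add(j)
        cur = loss(chosen)
        grad[j] = cur - prev   # marginal gain of adding j
        value += s[j] * grad[j]
        prev = cur
    return value, grad
```

For a modular loss such as the set cardinality, every marginal gain is 1 and the value reduces to the plain sum of `s`, which is a quick sanity check on the routine.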

    Efficient Multi-Template Learning for Structured Prediction

    Conditional random fields (CRFs) and Structural Support Vector Machines (Structural SVMs) are two state-of-the-art methods for structured prediction that capture the interdependencies among output variables. The success of these methods is attributed to the fact that their discriminative models are able to account for overlapping features on the whole input observations. These features are usually generated by applying a given set of templates on labeled data, but improper templates may lead to degraded performance. To alleviate this issue, in this paper, we propose a novel multiple template learning paradigm to learn structured prediction and the importance of each template simultaneously, so that hundreds of arbitrary templates could be added into the learning model without caution. This paradigm can be formulated as a special multiple kernel learning problem with an exponential number of constraints. Then we introduce an efficient cutting plane algorithm to solve this problem in the primal, and its convergence is presented. We also evaluate the proposed learning paradigm on two widely-studied structured prediction tasks, i.e., sequence labeling and dependency parsing. Extensive experimental results show that the proposed method outperforms CRFs and Structural SVMs due to exploiting the importance of each template. Our complexity analysis and empirical results also show that our proposed method is more efficient than OnlineMKL on very sparse and high-dimensional data. We further extend this paradigm for structured prediction using generalized $p$-block norm regularization with $p>1$, and experiments show competitive performances when $p \in [1,2)$.