9,976 research outputs found

    Boosting through Optimization of Margin Distributions

    Full text link
    Boosting has attracted much research attention in the past decade. The success of boosting algorithms may be interpreted in terms of the margin theory. Recently it has been shown that generalization error of classifiers can be obtained by explicitly taking the margin distribution of the training data into account. Most of the current boosting algorithms in practice usually optimizes a convex loss function and do not make use of the margin distribution. In this work we design a new boosting algorithm, termed margin-distribution boosting (MDBoost), which directly maximizes the average margin and minimizes the margin variance simultaneously. This way the margin distribution is optimized. A totally-corrective optimization algorithm based on column generation is proposed to implement MDBoost. Experiments on UCI datasets show that MDBoost outperforms AdaBoost and LPBoost in most cases.Comment: 9 pages. To publish/Published in IEEE Transactions on Neural Networks, 21(7), July 201

    RandomBoost: Simplified Multi-class Boosting through Randomization

    Full text link
    We propose a novel boosting approach to multi-class classification problems, in which multiple classes are distinguished by a set of random projection matrices in essence. The approach uses random projections to alleviate the proliferation of binary classifiers typically required to perform multi-class classification. The result is a multi-class classifier with a single vector-valued parameter, irrespective of the number of classes involved. Two variants of this approach are proposed. The first method randomly projects the original data into new spaces, while the second method randomly projects the outputs of learned weak classifiers. These methods are not only conceptually simple but also effective and easy to implement. A series of experiments on synthetic, machine learning and visual recognition data sets demonstrate that our proposed methods compare favorably to existing multi-class boosting algorithms in terms of both the convergence rate and classification accuracy.Comment: 15 page

    A Primal-Dual Convergence Analysis of Boosting

    Full text link
    Boosting combines weak learners into a predictor with low empirical risk. Its dual constructs a high entropy distribution upon which weak learners and training labels are uncorrelated. This manuscript studies this primal-dual relationship under a broad family of losses, including the exponential loss of AdaBoost and the logistic loss, revealing: - Weak learnability aids the whole loss family: for any {\epsilon}>0, O(ln(1/{\epsilon})) iterations suffice to produce a predictor with empirical risk {\epsilon}-close to the infimum; - The circumstances granting the existence of an empirical risk minimizer may be characterized in terms of the primal and dual problems, yielding a new proof of the known rate O(ln(1/{\epsilon})); - Arbitrary instances may be decomposed into the above two, granting rate O(1/{\epsilon}), with a matching lower bound provided for the logistic loss.Comment: 40 pages, 8 figures; the NIPS 2011 submission "The Fast Convergence of Boosting" is a brief presentation of the primary results; compared with the JMLR version, this arXiv version has hyperref and some formatting tweak

    Totally Corrective Multiclass Boosting with Binary Weak Learners

    Full text link
    In this work, we propose a new optimization framework for multiclass boosting learning. In the literature, AdaBoost.MO and AdaBoost.ECC are the two successful multiclass boosting algorithms, which can use binary weak learners. We explicitly derive these two algorithms' Lagrange dual problems based on their regularized loss functions. We show that the Lagrange dual formulations enable us to design totally-corrective multiclass algorithms by using the primal-dual optimization technique. Experiments on benchmark data sets suggest that our multiclass boosting can achieve a comparable generalization capability with state-of-the-art, but the convergence speed is much faster than stage-wise gradient descent boosting. In other words, the new totally corrective algorithms can maximize the margin more aggressively.Comment: 11 page
    • …
    corecore