
    Perceptron learning with random coordinate descent

    A perceptron is a linear threshold classifier that separates examples with a hyperplane. It is perhaps the simplest learning model that is used standalone. In this paper, we propose a family of random coordinate descent algorithms for perceptron learning on binary classification problems. Unlike most perceptron learning algorithms, which require smooth cost functions, our algorithms directly minimize the training error, and usually achieve the lowest training error compared with other algorithms. The algorithms are also computationally efficient. Such advantages make them favorable for both standalone use and ensemble learning, on problems that are not linearly separable. Experiments show that our algorithms work very well with AdaBoost, and achieve the lowest test errors on half of the datasets.
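
    The abstract does not spell out the update rule, but the idea of directly minimizing the training error by random coordinate descent can be illustrated with a short NumPy sketch. Along any single coordinate the 0/1 training error is a piecewise-constant function of the step size, so it can be minimized exactly by checking the midpoints between the breakpoints where individual predictions flip. The function names, the acceptance rule, and the random initialization below are illustrative assumptions, not the authors' implementation.

    import numpy as np

    def zero_one_loss(w, X, y):
        # Fraction of training examples misclassified by the hyperplane w.
        return float(np.mean(np.sign(X @ w) != y))

    def rcd_perceptron(X, y, n_iters=200, seed=None):
        # Sketch of random coordinate descent on the training error.
        # X is (n, d) with a constant bias column appended; y is in {-1, +1}.
        rng = np.random.default_rng(seed)
        n, d = X.shape
        w = rng.normal(size=d)                 # random initial hyperplane
        best = zero_one_loss(w, X, y)
        for _ in range(n_iters):
            j = rng.integers(d)                # pick one coordinate at random
            xj, scores = X[:, j], X @ w
            # Along e_j the score of example i is scores[i] + t * xj[i],
            # so its prediction flips at t = -scores[i] / xj[i].
            nz = xj != 0
            if not np.any(nz):
                continue
            breaks = np.sort(-scores[nz] / xj[nz])
            # The 0/1 loss is constant between breakpoints: test the midpoints
            # of consecutive breakpoints and one point beyond each end.
            cands = np.concatenate(([breaks[0] - 1.0],
                                    (breaks[:-1] + breaks[1:]) / 2.0,
                                    [breaks[-1] + 1.0]))
            losses = []
            for t in cands:
                w_try = w.copy()
                w_try[j] += t
                losses.append(zero_one_loss(w_try, X, y))
            k = int(np.argmin(losses))
            if losses[k] < best:               # accept only strict improvements
                w[j] += cands[k]
                best = losses[k]
        return w, best

    A bias term can be handled by appending a column of ones to X before calling the function.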

    From Cutting Planes Algorithms to Compression Schemes and Active Learning

    Cutting-plane methods are well-studied localization (and optimization) algorithms. We show that they provide a natural framework to perform machine learning (and not just to solve the optimization problems posed by machine learning) in addition to their intended optimization use. In particular, they allow one to learn sparse classifiers and provide good compression schemes. Moreover, we show that very little effort is required to turn them into effective active learning methods. This last property provides a generic way to design a whole family of active learning algorithms from existing passive methods. We present numerical simulations testifying to the relevance of cutting-plane methods for passive and active learning tasks. Comment: IJCNN 2015, Jul 2015, Killarney, Ireland. 2015, <http://www.ijcnn.org/>
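
    The abstract stays at a high level, so the following is only a minimal sketch of how a cutting-plane (localization) method can learn a separating hyperplane: keep a polytope that contains the version space, compute its Chebyshev center with one linear program, and add the constraint violated by the center as a new cut. It assumes linearly separable data, uses SciPy's linprog, and all names are illustrative; the paper's actual algorithms, compression schemes, and active-learning variants (which would choose which label to query from the same localization polytope) are not reproduced here.

    import numpy as np
    from scipy.optimize import linprog

    def chebyshev_center(A, b):
        # Center of the largest ball inscribed in {w : A w <= b}, via one LP:
        # maximize r subject to a_i . w + r * ||a_i|| <= b_i, r >= 0.
        m, d = A.shape
        norms = np.linalg.norm(A, axis=1, keepdims=True)
        c = np.zeros(d + 1)
        c[-1] = -1.0                                  # linprog minimizes, so -r
        res = linprog(c, A_ub=np.hstack([A, norms]), b_ub=b,
                      bounds=[(None, None)] * d + [(0, None)], method="highs")
        return res.x[:d]

    def cutting_plane_learn(X, y, max_cuts=200):
        # Localize a separating hyperplane inside the box |w_j| <= 1,
        # assuming the labels y in {-1, +1} are linearly separable.
        n, d = X.shape
        A = np.vstack([np.eye(d), -np.eye(d)])        # box constraints
        b = np.ones(2 * d)
        for _ in range(max_cuts):
            w = chebyshev_center(A, b)
            margins = y * (X @ w)
            i = int(np.argmin(margins))
            if margins[i] > 0:                        # center separates the data
                return w
            A = np.vstack([A, -y[i] * X[i]])          # new cut: y_i x_i . w >= 0
            b = np.append(b, 0.0)
        return w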

    Optimizing 0/1 Loss for Perceptrons by Random Coordinate Descent

    The 0/1 loss is an important cost function for perceptrons. Nevertheless, it cannot be easily minimized by most existing perceptron learning algorithms. In this paper, we propose a family of random coordinate descent algorithms to directly minimize the 0/1 loss for perceptrons, and prove their convergence. Our algorithms are computationally efficient, and usually achieve the lowest 0/1 loss compared with other algorithms. Such advantages make them favorable for nonseparable real-world problems. Experiments show that our algorithms are especially useful for ensemble learning, and can achieve the lowest test error on many complex data sets when coupled with AdaBoost.
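
    Since the abstracts do not detail the coupling with AdaBoost, here is a generic AdaBoost loop around hyperplane weak learners to show where a weighted 0/1 loss enters: each round, the weak learner should minimize the 0/1 loss weighted by the current example distribution D. The stand-in weak learner below simply searches random hyperplanes; the paper's algorithms would replace it with random coordinate descent on the same weighted loss. Everything else follows the textbook AdaBoost update.

    import numpy as np

    def weighted_01_loss(pred, y, D):
        # 0/1 loss of predictions pred, weighted by the distribution D.
        return float(np.sum(D * (pred != y)))

    def fit_weak_hyperplane(X, y, D, n_tries=200, rng=None):
        # Stand-in weak learner: best of n_tries random hyperplanes under the
        # D-weighted 0/1 loss (the paper would use random coordinate descent).
        rng = np.random.default_rng(rng)
        best_w, best_err = None, np.inf
        for _ in range(n_tries):
            w = rng.normal(size=X.shape[1])
            err = weighted_01_loss(np.sign(X @ w), y, D)
            if err < best_err:
                best_w, best_err = w, err
        return best_w, best_err

    def adaboost(X, y, n_rounds=50, rng=None):
        # Standard AdaBoost over hyperplane weak learners; y is in {-1, +1}.
        n = len(y)
        D = np.full(n, 1.0 / n)                  # example weights (a distribution)
        ensemble = []                            # list of (alpha, w) pairs
        for _ in range(n_rounds):
            w, err = fit_weak_hyperplane(X, y, D, rng=rng)
            err = float(np.clip(err, 1e-12, 1.0 - 1e-12))
            alpha = 0.5 * np.log((1.0 - err) / err)
            pred = np.sign(X @ w)
            D = D * np.exp(-alpha * y * pred)    # up-weight misclassified examples
            D = D / D.sum()
            ensemble.append((alpha, w))
        return ensemble

    def predict(ensemble, X):
        score = sum(alpha * np.sign(X @ w) for alpha, w in ensemble)
        return np.sign(score)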

    Statistical Mechanics of Soft Margin Classifiers

    We study the typical learning properties of the recently introduced Soft Margin Classifiers (SMCs), learning realizable and unrealizable tasks, with the tools of Statistical Mechanics. We derive analytically the behaviour of the learning curves in the regime of very large training sets. We obtain exponential and power laws for the decay of the generalization error towards the asymptotic value, depending on the task and on general characteristics of the distribution of stabilities of the patterns to be learned. The optimal learning curves of the SMCs, which give the minimal generalization error, are obtained by tuning the coefficient controlling the trade-off between the error and the regularization terms in the cost function. If the task is realizable by the SMC, the optimal performance is better than that of a hard margin Support Vector Machine and is very close to that of a Bayesian classifier. Comment: 26 pages, 12 figures, submitted to Physical Review.
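
    The abstract mentions tuning the coefficient that balances the error and regularization terms, without giving the SMC cost function itself. As a point of reference (the paper's precise form may differ), the standard soft-margin objective with slack variables reads

    \min_{\mathbf{w},\,b,\,\boldsymbol{\xi}} \;
      \frac{1}{2}\|\mathbf{w}\|^{2} + C \sum_{\mu=1}^{p} \xi_{\mu}
    \quad \text{s.t.} \quad
      y_{\mu}\,(\mathbf{w}\cdot\mathbf{x}_{\mu} + b) \ge 1 - \xi_{\mu},
      \qquad \xi_{\mu} \ge 0,

    where C is the trade-off coefficient: tuning it traces out the optimal learning curves mentioned above, and letting C grow without bound forbids margin violations and recovers the hard-margin Support Vector Machine used as the comparison point.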