    Gaussian processes and SVM: Mean field results and leave-one-out

    In this chapter, we elaborate on the well-known relationship between Gaussian processes (GP) and Support Vector Machines (SVM). We then present approximate solutions for two computational problems arising in GP and SVM. The first is the calculation of the posterior mean for GP classifiers using a `naive' mean field approach. The second is a leave-one-out estimator for the generalization error of SVM based on a linear response method. Simulation results on a benchmark dataset show similar performance for the GP mean field algorithm and the SVM algorithm. The approximate leave-one-out estimator is found to be in very good agreement with the exact leave-one-out error.
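
    The exact leave-one-out error that the approximate estimator is compared against can be computed brute-force by refitting the SVM once per sample. Below is a minimal sketch of that baseline only (not the paper's mean-field or linear-response estimator); the dataset and hyper-parameters are placeholders.

```python
# Brute-force exact leave-one-out (LOO) error for an SVM classifier.
# This is the expensive reference quantity; the paper's contribution is an
# approximate LOO estimator that avoids the n refits performed here.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import LeaveOneOut
from sklearn.svm import SVC

X, y = make_classification(n_samples=100, n_features=10, random_state=0)

errors = 0
for train_idx, test_idx in LeaveOneOut().split(X):
    clf = SVC(kernel="rbf", C=1.0)  # placeholder kernel and regularization
    clf.fit(X[train_idx], y[train_idx])
    errors += int(clf.predict(X[test_idx])[0] != y[test_idx][0])

print("exact leave-one-out error:", errors / len(X))
```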

    Separable Convex Optimization with Nested Lower and Upper Constraints

    We study a convex resource allocation problem in which lower and upper bounds are imposed on partial sums of allocations. This model is linked to a large range of applications, including production planning, speed optimization, stratified sampling, support vector machines, portfolio management, and telecommunications. We propose an efficient gradient-free divide-and-conquer algorithm, which uses monotonicity arguments to generate valid bounds from the recursive calls and to eliminate linking constraints based on the information from sub-problems. This algorithm does not need strict convexity or differentiability. It produces an $\epsilon$-approximate solution for the continuous problem in $\mathcal{O}(n \log m \log \frac{n B}{\epsilon})$ time and an integer solution in $\mathcal{O}(n \log m \log B)$ time, where $n$ is the number of decision variables, $m$ is the number of constraints, and $B$ is the resource bound. A complexity of $\mathcal{O}(n \log m)$ is also achieved for the linear and quadratic cases. These are the best complexities known to date for this important problem class. Our experimental analyses confirm the good performance of the method, which produces optimal solutions for problems with up to 1,000,000 variables in a few seconds. Promising applications to the support vector ordinal regression problem are also investigated.
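
    For concreteness, one common reading of this problem class is written out below; the notation ($f_i$, $s_j$, $a_j$, $b_j$, $B$) is assumed here rather than taken from the paper: a separable convex objective with nested lower and upper bounds on prefix sums of the allocations and a total resource bound.

```latex
% Hedged formulation of the nested resource allocation problem (notation assumed).
\begin{aligned}
\min_{x \in \mathbb{R}^n_{\ge 0}} \quad & \sum_{i=1}^{n} f_i(x_i)
    && \text{each } f_i \text{ convex} \\
\text{s.t.} \quad & a_j \;\le\; \sum_{i=1}^{s_j} x_i \;\le\; b_j,
    && j = 1, \dots, m, \quad 1 \le s_1 < \dots < s_m = n, \\
& \sum_{i=1}^{n} x_i = B.
\end{aligned}
```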

    Training Support Vector Machines Using Frank-Wolfe Optimization Methods

    Training a Support Vector Machine (SVM) requires the solution of a quadratic programming problem (QP) whose computational complexity becomes prohibitively expensive for large-scale datasets. Traditional optimization methods cannot be directly applied in these cases, mainly due to memory restrictions. By adopting a slightly different objective function and under mild conditions on the kernel used within the model, efficient algorithms to train SVMs have been devised under the name of Core Vector Machines (CVMs). This framework exploits the equivalence of the resulting learning problem with the task of building a Minimal Enclosing Ball (MEB) in a feature space, where data is implicitly embedded by a kernel function. In this paper, we improve on the CVM approach by proposing two novel methods to build SVMs based on the Frank-Wolfe algorithm, recently revisited as a fast method to approximate the solution of a MEB problem. In contrast to CVMs, our algorithms do not require computing the solutions of a sequence of increasingly complex QPs and are defined using only analytic optimization steps. Experiments on a large collection of datasets show that our methods scale better than CVMs in most cases, sometimes at the price of a slightly lower accuracy. Like CVMs, the proposed methods can be easily extended to machine learning problems other than binary classification. However, effective classifiers are also obtained using kernels which do not satisfy the condition required by CVMs, so the proposed methods can be used for a wider set of problems.
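
    The Frank-Wolfe view of the MEB can be illustrated in plain Euclidean space: each iteration queries the point farthest from the current center (the linear-minimization oracle) and moves toward it with the standard $1/(t+1)$ step. This is only a sketch of the generic iteration, not the paper's kernelized algorithms.

```python
# Frank-Wolfe / Badoiu-Clarkson iteration for the minimum enclosing ball (MEB)
# of a point cloud in Euclidean space (the kernel feature space used by the
# paper is not reproduced here).
import numpy as np

def meb_frank_wolfe(points, n_iters=200):
    center = points.mean(axis=0)  # any feasible starting center
    for t in range(1, n_iters + 1):
        dists = np.linalg.norm(points - center, axis=1)
        farthest = points[np.argmax(dists)]               # linear-minimization oracle
        center = center + (farthest - center) / (t + 1)   # 1/(t+1) step size
    radius = np.max(np.linalg.norm(points - center, axis=1))
    return center, radius

rng = np.random.default_rng(0)
center, radius = meb_frank_wolfe(rng.normal(size=(500, 3)))
print("approximate MEB center:", center, "radius:", radius)
```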

    Fast SVM training using approximate extreme points

    The application of non-linear kernel Support Vector Machines (SVMs) to large datasets is seriously hampered by their excessive training time. We propose a modification, called the approximate extreme points support vector machine (AESVM), that is aimed at overcoming this burden. Our approach relies on conducting the SVM optimization over a carefully selected subset, called the representative set, of the training dataset. We present analytical results that indicate the similarity of AESVM and SVM solutions. A linear time algorithm based on convex hulls and extreme points is used to compute the representative set in kernel space. Extensive computational experiments on nine datasets compared AESVM to LIBSVM \citep{LIBSVM}, CVM \citep{Tsang05}, BVM \citep{Tsang07}, LASVM \citep{Bordes05}, $\text{SVM}^{\text{perf}}$ \citep{Joachims09}, and the random features method \citep{rahimi07}. Our AESVM implementation was found to train much faster than the other methods, while its classification accuracy was similar to that of LIBSVM in all cases. In particular, for a seizure detection dataset, AESVM training was almost $10^3$ times faster than LIBSVM and LASVM and more than forty times faster than CVM and BVM. Additionally, AESVM also gave competitively fast classification times. Comment: The manuscript in revised form has been submitted to J. Machine Learning Research.
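
    As a rough stand-in for the idea of optimizing over a representative set, the sketch below trains an SVM only on the convex-hull vertices of each class in a 2-D toy dataset. The paper instead selects approximate extreme points in kernel space with a linear-time algorithm; the scipy-based hull computation here is purely illustrative.

```python
# Train an SVM on a representative subset: here, the convex-hull vertices of
# each class in 2-D input space (illustrative only; AESVM works in kernel space).
import numpy as np
from scipy.spatial import ConvexHull
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=2000, centers=2, random_state=0)

rep_idx = []
for label in np.unique(y):
    cls = np.where(y == label)[0]
    hull = ConvexHull(X[cls])           # extreme points of this class
    rep_idx.extend(cls[hull.vertices])
rep_idx = np.array(rep_idx)

clf = SVC(kernel="rbf").fit(X[rep_idx], y[rep_idx])
print("representative set size:", len(rep_idx), "of", len(X))
print("accuracy on the full dataset:", clf.score(X, y))
```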

    The Case for Approximate Intermittent Computing

    We present the concept of approximate intermittent computing and concretely demonstrate its application. Intermittent computations stem from the erratic energy patterns caused by energy harvesting: computations unpredictably terminate whenever energy is insufficient, and the application state is lost. Existing solutions maintain equivalence to continuous executions by creating persistent state on non-volatile memory, enabling stateful computations to cross power failures. The performance penalty is massive: system throughput drops while energy consumption increases. In contrast, approximate intermittent computations trade the accuracy of the results for sparing the entire overhead of maintaining equivalence to a continuous execution. This is possible because we use approximation to limit the extent of stateful computations to a single power cycle, enabling the system to shift the energy budget from managing persistent state to useful computations that produce an immediate approximate result. To this end, we effectively reverse the regular formulation of approximate computing problems. First, we apply approximate intermittent computing to human activity recognition. We design an anytime variation of support vector machines able to improve the accuracy of the classification as energy is available. We build a hw/sw prototype using kinetic energy and show a 7x improvement in system throughput compared to state-of-the-art system support for intermittent computing, while retaining 83% accuracy in a setting where the best attainable accuracy is 88%. Next, we apply approximate intermittent computing in a sharply different scenario, namely embedded image processing, using loop perforation. Using a different hw/sw prototype that we build and diverse energy traces, we show a 5x improvement in system throughput compared to state-of-the-art system support for intermittent computing, while providing an equivalent output in 84% of the cases.
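
    One plausible way to read the anytime SVM idea is to accumulate the kernel expansion of the decision function support vector by support vector, so that a progressively more accurate label can be returned whenever energy runs out. The sketch below reflects that interpretation only; it is not the paper's actual design.

```python
# Anytime evaluation of an RBF-kernel SVM decision function: the partial sum
# over support vectors is cut off when the energy budget is exhausted,
# trading accuracy for energy (hypothetical reading of the paper's idea).
import numpy as np

def anytime_svm_predict(x, support_vectors, dual_coefs, bias, gamma, budget):
    score = bias
    for i, (sv, coef) in enumerate(zip(support_vectors, dual_coefs)):
        if i >= budget:                  # energy exhausted: return early
            break
        score += coef * np.exp(-gamma * np.sum((x - sv) ** 2))  # RBF kernel term
    return 1 if score >= 0 else -1
```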

    A Divide-and-Conquer Solver for Kernel Support Vector Machines

    The kernel support vector machine (SVM) is one of the most widely used classification methods; however, the amount of computation required becomes the bottleneck when facing millions of samples. In this paper, we propose and analyze a novel divide-and-conquer solver for kernel SVMs (DC-SVM). In the division step, we partition the kernel SVM problem into smaller subproblems by clustering the data, so that each subproblem can be solved independently and efficiently. We show theoretically that the support vectors identified by the subproblem solutions are likely to be support vectors of the entire kernel SVM problem, provided that the problem is partitioned appropriately by kernel clustering. In the conquer step, the local solutions from the subproblems are used to initialize a global coordinate descent solver, which converges quickly as suggested by our analysis. By extending this idea, we develop a multilevel Divide-and-Conquer SVM algorithm with adaptive clustering and an early prediction strategy, which outperforms state-of-the-art methods in terms of training speed, testing accuracy, and memory usage. As an example, on the covtype dataset with half a million samples, DC-SVM is 7 times faster than LIBSVM in obtaining the exact SVM solution (to within $10^{-6}$ relative error), which achieves 96.15% prediction accuracy. Moreover, with our proposed early prediction strategy, DC-SVM achieves about 96% accuracy in only 12 minutes, which is more than 100 times faster than LIBSVM.
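
    A hedged sketch of the divide step follows: cluster the data, solve an SVM per cluster, and collect the local support vectors. DC-SVM then initializes a global coordinate descent solver from these local solutions; as a simplification, the sketch merely retrains on the union of the local support vectors, which is not the paper's actual conquer step.

```python
# Divide step of a divide-and-conquer kernel SVM: cluster, solve locally,
# and gather local support vectors as candidates for the global solution.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
labels = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(X)

sv_idx = []
for k in range(8):
    idx = np.where(labels == k)[0]
    if len(np.unique(y[idx])) < 2:       # skip clusters containing one class only
        continue
    local = SVC(kernel="rbf").fit(X[idx], y[idx])
    sv_idx.extend(idx[local.support_])   # indices of local support vectors
sv_idx = np.array(sv_idx)

# Simplified "conquer": retrain on the candidate support vectors only.
global_clf = SVC(kernel="rbf").fit(X[sv_idx], y[sv_idx])
print("candidate support vectors:", len(sv_idx), "accuracy:", global_clf.score(X, y))
```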

    Quadratic convergence of smoothing Newton's method for 0/1 loss optimization

    It has been widely recognized that the 0/1-loss function is one of the most natural choices for modelling classification errors, and it has a wide range of applications, including support vector machines and 1-bit compressed sensing. Due to the combinatorial nature of the 0/1-loss function, methods based on convex relaxations or smoothing approximations have dominated the existing research and are often able to provide approximate solutions of good quality. However, those methods do not optimize the 0/1-loss function directly, and hence no optimality has been established for the original problem. This paper aims to study the optimality conditions of 0/1-loss minimization and, for the first time, to develop a Newton's method that directly optimizes the 0/1-loss function with local quadratic convergence under reasonable conditions. Extensive numerical experiments demonstrate its superior performance, as one would expect from Newton-type methods.
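
    For reference, one standard way to write the regularized 0/1-loss minimization the abstract refers to is given below; the notation is assumed here rather than taken from the paper. The indicator counts misclassified samples directly instead of going through a convex surrogate such as the hinge loss.

```latex
% Regularized 0/1-loss minimization for binary classification (notation assumed):
% the indicator term counts the training points classified incorrectly.
\min_{w \in \mathbb{R}^d} \; \frac{\lambda}{2}\,\|w\|_2^2
  \;+\; \sum_{i=1}^{N} \mathbb{1}\!\left[\, y_i\, w^{\top} x_i \le 0 \,\right],
\qquad y_i \in \{-1, +1\}.
```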