1,143 research outputs found

    Parsimonious Mahalanobis Kernel for the Classification of High Dimensional Data

    The classification of high dimensional data with kernel methods is considered in this article. Exploiting the emptiness property of high dimensional spaces, a kernel based on the Mahalanobis distance is proposed. The computation of the Mahalanobis distance requires the inversion of a covariance matrix. In high dimensional spaces, the estimated covariance matrix is ill-conditioned and its inversion is unstable or impossible. Using a parsimonious statistical model, namely the High Dimensional Discriminant Analysis model, the specific signal and noise subspaces are estimated for each considered class, making the inverse of the class-specific covariance matrix explicit and stable and leading to the definition of a parsimonious Mahalanobis kernel. An SVM-based framework is used for selecting the hyperparameters of the parsimonious Mahalanobis kernel by optimizing the so-called radius-margin bound. Experimental results on three high dimensional data sets show that the proposed kernel is suitable for classifying high dimensional data, providing better classification accuracies than the conventional Gaussian kernel.
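The idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the choice of `p` signal directions, the single shared noise variance `b`, and the single scale `gamma` are all assumptions made for illustration (the abstract mentions several hyperparameters tuned through the radius-margin bound, which is not reproduced here). The sketch only shows how splitting the covariance into a signal subspace and an isotropic noise subspace gives an explicit inverse, so the Mahalanobis distance can be evaluated without inverting an ill-conditioned matrix.

```python
import numpy as np

def fit_signal_noise(X, p):
    """Estimate p signal directions and one noise variance from samples X (n x d).

    Hypothetical helper: p is chosen by the caller and is assumed to be < d.
    """
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / X.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]            # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    Q = eigvecs[:, :p]                           # signal subspace basis (d x p)
    lam = eigvals[:p]                            # signal variances
    d = X.shape[1]
    # Noise variance: mean of the remaining d - p eigenvalues, via the trace.
    b = (np.trace(cov) - lam.sum()) / (d - p)
    return Q, lam, b

def parsimonious_mahalanobis_kernel(X, Z, Q, lam, b, gamma=0.5):
    """Gaussian-type kernel on a Mahalanobis distance with an explicit inverse.

    Uses Sigma^{-1} = Q diag(1/lam) Q^T + (1/b) (I - Q Q^T), so no matrix
    inversion is performed. gamma is an assumed single scale parameter.
    """
    D = X[:, None, :] - Z[None, :, :]            # pairwise differences (n, m, d)
    proj = D @ Q                                 # coordinates in signal subspace
    signal = (proj ** 2 / lam).sum(axis=-1)
    # Squared norm of the component orthogonal to the signal subspace.
    residual = (D ** 2).sum(axis=-1) - (proj ** 2).sum(axis=-1)
    noise = np.maximum(residual, 0.0) / b
    return np.exp(-gamma * (signal + noise))
```

A Gram matrix built this way could then be passed to any SVM solver that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.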

    Duality, Derivative-Based Training Methods and Hyperparameter Optimization for Support Vector Machines

    In this thesis we consider the application of Fenchel's duality theory and gradient-based methods for the training and hyperparameter optimization of Support Vector Machines. We show that the dualization of convex training problems is possible theoretically in a rather general formulation. For training problems following a special structure (for instance, standard training problems) we find that the resulting optimality conditions can be interpreted concretely. This approach immediately leads to the well-known notion of support vectors and a formulation of the Representer Theorem. The proposed theory is applied to several examples such that dual formulations of training problems and associated optimality conditions can be derived straightforwardly. Furthermore, we consider different formulations of the primal training problem which are equivalent under certain conditions. We also argue that the relation of the corresponding solutions to the solution of the dual training problem is not always intuitive. Based on the previous findings, we consider the application of customized optimization methods to the primal and dual training problems. A particular realization of Newton's method is derived which could be used to solve the primal training problem accurately. Moreover, we introduce a general convergence framework covering different types of decomposition methods for the solution of the dual training problem. In doing so, we are able to generalize well-known convergence results for the SMO method. Additionally, a discussion of the complexity of the SMO method and a motivation for a shrinking strategy reducing the computational effort is provided. In a last theoretical part, we consider the problem of hyperparameter optimization. We argue that this problem can be handled efficiently by means of gradient-based methods if the training problems are formulated appropriately. 
    Finally, we evaluate the theoretical results concerning the training and hyperparameter optimization approaches practically by means of several example training problems.

    Large-scale Machine Learning in High-dimensional Datasets
