
40,432 research outputs found

An Exponential Lower Bound on the Complexity of Regularization Paths

Author: Gärtner Bernd
Jaggi Martin
Maria Clément
Publication date: 01/01/2012

For a variety of regularized optimization problems in machine learning, algorithms computing the entire solution path have been developed recently. Most of these methods are quadratic programs that are parameterized by a single parameter, as for example the Support Vector Machine (SVM). Solution path algorithms compute not only the solution for one particular value of the regularization parameter but the entire path of solutions, making the selection of an optimal parameter much easier. It has been assumed that these piecewise linear solution paths have only linear complexity, i.e. linearly many bends. We prove that for the support vector machine this complexity can be exponential in the number of training points in the worst case. More strongly, we construct a single instance of n input points in d dimensions for an SVM such that at least \Theta(2^{n/2}) = \Theta(2^d) many distinct subsets of support vectors occur as the regularization parameter changes.
Comment: Journal version, 28 Pages, 5 Figures
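
To make "a quadratic program parameterized by a single parameter" concrete, the textbook soft-margin SVM dual is sketched below. This is the standard formulation, assumed here for illustration only; the variant analyzed in the paper may differ (for example in the loss or the bias term). The symbols \alpha_i, y_i, x_i and C are the usual ones and do not appear in the abstract.

```latex
\begin{align*}
\max_{\alpha \in \mathbb{R}^n} \quad
  & \sum_{i=1}^{n} \alpha_i
    - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n}
      \alpha_i \alpha_j \, y_i y_j \, x_i^{\top} x_j \\
\text{s.t.} \quad
  & \sum_{i=1}^{n} \alpha_i y_i = 0, \\
  & 0 \le \alpha_i \le C, \qquad i = 1, \dots, n.
\end{align*}
```

In this formulation the regularization parameter C enters only through the box constraints, and the support vectors are the points with \alpha_i > 0. A bend in the solution path corresponds to a change in which constraints are active, which is why counting distinct support vector subsets bounds the path complexity from below.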

arXiv.org e-Print Archive

Repository for Publications and Research Data

Directory of Open Access Journals

Mixed Integer Linear Programming for Feature Selection in Support Vector Machine

Author: Labbé Martine
Martínez-Merino Luisa I.
Rodríguez-Chía Antonio M.
Publication date: 07/08/2018

This work focuses on support vector machine (SVM) with feature selection. A MILP formulation is proposed for the problem. The choice of suitable features to construct the separating hyperplanes has been modelled in this formulation by including a budget constraint that sets in advance a limit on the number of features to be used in the classification process. We propose both an exact and a heuristic procedure to solve this formulation in an efficient way. Finally, the validation of the model is done by checking it with some well-known data sets and comparing it with classical classification methods.
Comment: 37 pages, 20 figures
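
The budget constraint described in the abstract can be illustrated with a generic big-M model. The sketch below assumes an \ell_1-norm SVM with binary selection variables v_j, a budget B, a big-M constant M and a penalty C; all of these symbols are assumptions introduced for illustration, and the exact formulation in the paper may differ.

```latex
\begin{align*}
\min_{w^{+},\, w^{-},\, b,\, \xi,\, v} \quad
  & \sum_{j=1}^{d} \left( w_j^{+} + w_j^{-} \right)
    + C \sum_{i=1}^{n} \xi_i \\
\text{s.t.} \quad
  & y_i \Bigl( \sum_{j=1}^{d} \left( w_j^{+} - w_j^{-} \right) x_{ij} + b \Bigr)
    \ge 1 - \xi_i, && i = 1, \dots, n, \\
  & 0 \le w_j^{+} \le M v_j, \quad 0 \le w_j^{-} \le M v_j, && j = 1, \dots, d, \\
  & \sum_{j=1}^{d} v_j \le B, \\
  & \xi_i \ge 0, \quad v_j \in \{0, 1\}.
\end{align*}
```

Here v_j = 0 forces the weight of feature j to zero, so at most B features can contribute to the separating hyperplane.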

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL Descartes

DI-fusion

Hal-Diderot

idUS. Depósito de Investigación Universidad de Sevilla

Machine learning-guided directed evolution for protein engineering

Author: Arnold Frances H.
Wu Zachary
Yang Kevin K.
Publication date: 19/04/2019

Machine learning (ML)-guided directed evolution is a new paradigm for biological design that enables optimization of complex functions. ML methods use data to predict how sequence maps to function without requiring a detailed model of the underlying physics or biological pathways. To demonstrate ML-guided directed evolution, we introduce the steps required to build ML sequence-function models and use them to guide engineering, making recommendations at each stage. This review covers basic concepts relevant to using ML for protein engineering as well as the current literature and applications of this new engineering paradigm. ML methods accelerate directed evolution by learning from information contained in all measured variants and using that information to select sequences that are likely to be improved. We then provide two case studies that demonstrate the ML-guided directed evolution process. We also look to future opportunities where ML will enable discovery of new protein functions and uncover the relationship between protein sequence and function.
Comment: Made significant revisions to focus on aspects most relevant to applying machine learning to speed up directed evolution

arXiv.org e-Print Archive

Caltech Authors

The Default Risk of Firms Examined with Smooth Support Vector Machines

Author: Dorothea Schäfer
Wolfgang Härdle
Yi-Ren Yeh
Yuh-Jye Lee

In the era of Basel II a powerful tool for bankruptcy prognosis is vital for banks. The tool must be precise but also easily adaptable to the bank's objections regarding the relation of false acceptances (Type I error) and false rejections (Type II error). We explore the suitability of Smooth Support Vector Machines (SSVM), and investigate how important factors such as selection of appropriate accounting ratios (predictors), length of training period and structure of the training sample influence the precision of prediction. Furthermore we show that oversampling can be employed to gear the tradeoff between error types. Finally, we illustrate graphically how different variants of SSVM can be used jointly to support the decision task of loan officers.
Keywords: Insolvency Prognosis, SVMs, Statistical Learning Theory, Non-parametric Classification
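
The oversampling idea mentioned in the abstract can be illustrated with a minimal random-duplication sketch. This is not the authors' procedure: the function name `oversample_minority`, the `target_ratio` parameter and the toy data are hypothetical, and the sketch assumes a two-class problem in which the rarer class (for example insolvent firms) is duplicated before training.

```python
import random
from collections import Counter


def oversample_minority(samples, labels, target_ratio=1.0, seed=0):
    """Randomly duplicate minority-class samples (hypothetical helper).

    Duplicates are drawn with replacement until the minority class has
    at least int(target_ratio * majority_count) members.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    if len(counts) != 2:
        raise ValueError("expected exactly two classes")
    (majority, n_major), (minority, n_minor) = counts.most_common(2)
    # Number of extra minority samples required to reach the target.
    needed = max(0, int(target_ratio * n_major) - n_minor)
    minority_pool = [s for s, y in zip(samples, labels) if y == minority]
    extra = [rng.choice(minority_pool) for _ in range(needed)]
    return list(samples) + extra, list(labels) + [minority] * needed


# Toy data (assumption): label 1 marks the rare class.
X = [[0.1], [0.2], [0.3], [0.4], [0.9]]
y = [0, 0, 0, 0, 1]
X_bal, y_bal = oversample_minority(X, y)
balanced_counts = Counter(y_bal)
```

Raising `target_ratio` gives the rare class more weight in training, which would typically shift a classifier toward fewer errors on that class at the cost of more errors on the other one.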

Research Papers in Economics

The Default Risk of Firms Examined with Smooth Support Vector Machines

Author: Dorothea Schäfer
Wolfgang Härdle
Yi-Ren Yeh
Yuh-Jye Lee

In the era of Basel II a powerful tool for bankruptcy prognosis is vital for banks. The tool must be precise but also easily adaptable to the bank's objections regarding the relation of false acceptances (Type I error) and false rejections (Type II error). We explore the suitability of Smooth Support Vector Machines (SSVM), and investigate how important factors such as selection of appropriate accounting ratios (predictors), length of training period and structure of the training sample influence the precision of prediction. Furthermore we show that oversampling can be employed to gear the tradeoff between error types. Finally, we illustrate graphically how different variants of SSVM can be used jointly to support the decision task of loan officers.
Keywords: Insolvency Prognosis, SVMs, Statistical Learning Theory, Non-parametric Classification models, local time-homogeneity

Research Papers in Economics