
    kernlab - An S4 Package for Kernel Methods in R

    kernlab is an extensible package for kernel-based machine learning methods in R. It takes advantage of R's new S4 object model and provides a framework for creating and using kernel-based algorithms. The package contains dot product primitives (kernels); implementations of support vector machines and the relevance vector machine; Gaussian processes; a ranking algorithm; kernel PCA; kernel CCA; and a spectral clustering algorithm. Moreover, it provides a general-purpose quadratic programming solver and an incomplete Cholesky decomposition method.
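
    kernlab itself is an R package; since this listing contains no code, the following is a minimal Python/numpy sketch of two of the ideas the abstract names, a kernel (dot product) primitive and kernel PCA via the centred Gram matrix. The RBF kernel choice, the toy data, and all names are illustrative assumptions, not kernlab's API.

```python
# Minimal sketch (not kernlab itself): an RBF kernel primitive and
# kernel PCA via the centred Gram matrix, using only numpy.
import numpy as np

def rbf_kernel(X, Y, sigma=1.0):
    """K(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * sigma ** 2))

def kernel_pca(X, n_components=2, sigma=1.0):
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    # Centre the Gram matrix in feature space: K' = H K H with H = I - (1/n) 11^T.
    H = np.eye(n) - np.ones((n, n)) / n
    Kc = H @ K @ H
    vals, vecs = np.linalg.eigh(Kc)            # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:n_components]
    # Scale eigenvectors by 1/sqrt(eigenvalue) so projections have unit norm in feature space.
    alphas = vecs[:, idx] / np.sqrt(np.maximum(vals[idx], 1e-12))
    return Kc @ alphas                         # projections of the training points

X = np.random.default_rng(0).normal(size=(50, 5))
Z = kernel_pca(X, n_components=2, sigma=2.0)
print(Z.shape)  # (50, 2)
```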

    Protein sequences classification based on weighting scheme

    We present a new technique for recognizing remote protein homologies that relies on combining probabilistic modelling and supervised learning in high-dimensional feature spaces. The main novelty of our technique is the method of constructing feature vectors using a Hidden Markov Model, and the combination of this representation with a classifier capable of learning in very sparse, high-dimensional spaces. Each feature vector records the sensitivity of each protein domain to a previously learned set of sub-sequences (strings). Unlike previous methods, our method takes into consideration both the conserved and non-conserved regions. The system then uses Support Vector Machine (SVM) classifiers to learn the boundaries between structural protein classes. Experiments show that this method, which we call the String Weighting Scheme-SVM (SWS-SVM) method, significantly improves on previous methods for the classification of protein domains based on remote homologies. Our method is then compared to five existing homology detection methods.
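
    As a hedged sketch of the general recipe the abstract describes (string-based feature vectors fed to an SVM), not the paper's exact weighting scheme: each sequence is mapped to a vector of scores against a fixed set of sub-sequences, and a linear SVM then learns the class boundaries. The string set, the scoring, and the toy sequences below are placeholders; the paper derives its weights from a Hidden Markov Model.

```python
# Hypothetical illustration only: occurrence-count features over a fixed
# string set stand in for the paper's HMM-derived "sensitivity" scores.
import numpy as np
from sklearn.svm import LinearSVC

STRINGS = ["AL", "GK", "VLS", "PE"]    # hypothetical learned sub-sequences

def features(seq, strings=STRINGS):
    # One coordinate per string: length-normalised occurrence count.
    return np.array([seq.count(s) / max(len(seq), 1) for s in strings])

seqs = ["MALKGKVLSPE", "GGGPEPEALAL", "VLSVLSGKMAL", "PEPEPEGKGKG"]
labels = [0, 1, 0, 1]                  # toy structural classes
X = np.vstack([features(s) for s in seqs])
clf = LinearSVC(C=1.0).fit(X, labels)
print(clf.predict(features("ALGKVLSPEPE").reshape(1, -1)))
```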

    Research of Financial Early-Warning Model on Evolutionary Support Vector Machines Based on Genetic Algorithms

    A support vector machine is a relatively new learning machine based on statistical learning theory, and it has attracted wide attention from researchers. Recently, support vector machines (SVMs) have been applied to the problem of financial early-warning prediction (Rose, 1999). The SVM-based method has been compared with other statistical methods and has shown good results. However, there has been no principled way to decide the kernel parameters, which strongly influence the results and performance of an SVM. Based on genetic algorithms, this paper proposes a new, systematic method to automatically select the SVM parameters for a financial early-warning model. The results demonstrate that the method is a powerful and flexible way to solve the financial early-warning problem.
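
    A minimal sketch of the idea, not the paper's implementation: a genetic algorithm searches over the SVM hyper-parameters (here C and the RBF gamma) using cross-validated accuracy as the fitness. The population size, mutation scale, selection scheme, and toy data set are all illustrative assumptions.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score
from sklearn.datasets import make_classification

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

def fitness(ind):
    C, gamma = np.exp(ind)            # genes live in log-space
    clf = SVC(C=C, gamma=gamma, kernel="rbf")
    return cross_val_score(clf, X, y, cv=3).mean()

pop = rng.normal(size=(20, 2))        # initial population of (log C, log gamma)
for generation in range(15):
    scores = np.array([fitness(ind) for ind in pop])
    parents = pop[np.argsort(scores)[-10:]]      # selection: keep the best half
    # Mutation-only reproduction: perturb random parents with Gaussian noise.
    children = parents[rng.integers(0, 10, 10)] + rng.normal(scale=0.3, size=(10, 2))
    pop = np.vstack([parents, children])

best = pop[np.argmax([fitness(ind) for ind in pop])]
print("best C, gamma:", np.exp(best))
```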

    A Simple Method For Estimating Conditional Probabilities For SVMs

    Support Vector Machines (SVMs) have become a popular learning algorithm, in particular for large, high-dimensional classification problems. SVMs have been shown to give highly accurate classification results in a variety of applications. Several methods have been proposed to obtain not only a classification, but also an estimate of the SVM's confidence in the correctness of the predicted label. In this paper, several algorithms are compared which scale the SVM decision function to obtain an estimate of the conditional class probability. A new, simple and fast method is derived from theoretical arguments and empirically compared to the existing approaches.
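
    The best-known instance of such a scaling is Platt's sigmoid fit. As a hedged illustration of the general recipe (not the new method proposed in this paper), the sketch below maps held-out SVM decision values f(x) to estimates of P(y = 1 | x) by fitting a sigmoid, implemented here as a one-dimensional logistic regression. The data set and parameters are placeholders.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_tr, X_cal, y_tr, y_cal = train_test_split(X, y, test_size=0.5, random_state=0)

svm = SVC(kernel="rbf").fit(X_tr, y_tr)

# Decision values on held-out calibration data, then a sigmoid fit:
# P(y = 1 | f) = sigma(a * f + b), with a and b learned by maximum likelihood.
f_cal = svm.decision_function(X_cal).reshape(-1, 1)
scaler = LogisticRegression().fit(f_cal, y_cal)

f_new = svm.decision_function(X_cal[:3]).reshape(-1, 1)
print(scaler.predict_proba(f_new)[:, 1])   # estimated conditional class probabilities
```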

    DME Handout: Support Vector Machines, School of Informatics, University of Edinburgh

    Support Vector Machines (SVMs) are a relatively new concept in supervised learning, but since the publication of [3] in 1995 they have been applied to a wide variety of problems. In many ways the application of SVMs to almost any learning problem mirrors the enthusiasm (and fashionability) that was observed for neural networks in the second half of the 1980s. The ingredients of the SVM had, in fact, been around for a decade or so, but they were not put together until the early 1990s. The two key ideas of support vector machines are (i) the maximum margin solution for a linear classifier, and (ii) the "kernel trick", a method of expanding from a linear classifier to a non-linear one in an efficient manner. Below we discuss these key ideas in turn, and then go on to consider support vector regression and some example applications of SVMs. Further reading on the topic can be found in [2], [7] and [4]. For those keen to keep up with the latest results, the web site …
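
    A quick numeric illustration of idea (ii): for the polynomial kernel k(x, y) = (x · y)² on R², the kernel value equals an ordinary dot product in an explicit three-dimensional feature space, which is exactly why a linear classifier in that space acts as a non-linear classifier in the original one. The example vectors are arbitrary.

```python
import numpy as np

def phi(v):
    # Explicit feature map for k(x, y) = (x . y)^2 with x in R^2:
    # phi(x) = (x1^2, x2^2, sqrt(2) * x1 * x2).
    x1, x2 = v
    return np.array([x1 * x1, x2 * x2, np.sqrt(2) * x1 * x2])

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])
print((x @ y) ** 2)          # kernel evaluated directly: 1.0
print(phi(x) @ phi(y))       # the same value via the explicit feature map
```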

    Prediction of protein binding sites in protein structures using hidden Markov support vector machine

    Background: Predicting the binding sites between two interacting proteins provides important clues to the function of a protein. Recent research on protein binding site prediction has mainly been based on widely known machine learning techniques, such as artificial neural networks, support vector machines, and conditional random fields. However, the prediction performance is still too low to be used in practice, and it is necessary to explore new algorithms, theories, and features to improve it further.

    Results: In this study, we introduce a novel machine learning model, the hidden Markov support vector machine, for protein binding site prediction. The model treats protein binding site prediction as a sequential labelling task based on the maximum margin criterion. Common features derived from protein sequences and structures, including the protein sequence profile and residue accessible surface area, are used to train the hidden Markov support vector machine. When tested on six data sets, the method based on the hidden Markov support vector machine shows better performance than some state-of-the-art methods, including artificial neural networks, support vector machines, and conditional random fields. Furthermore, its running time is several orders of magnitude shorter than that of the compared methods.

    Conclusion: The improved prediction performance and computational efficiency of the method can be attributed to three factors. Firstly, the relation between the labels of neighbouring residues is useful for protein binding site prediction. Secondly, the kernel trick is very advantageous to this field. Thirdly, by using the cutting-plane algorithm, the complexity of the training step for the hidden Markov support vector machine is linear in the number of training samples.
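
    A hedged sketch of the task setup only: per-residue binding-site labelling from windowed features, with toy stand-ins for the sequence profile and accessible surface area the abstract mentions. The paper's hidden Markov SVM additionally models the dependence between neighbouring labels with a max-margin structured learner; this simplified version classifies each residue independently, so it illustrates the feature construction, not the structured model.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
L, P, W = 120, 21, 5                    # sequence length, profile width, half-window
profile = rng.random((L, P))            # toy PSSM-like sequence profile
asa = rng.random((L, 1))                # toy relative accessible surface area
feats = np.hstack([profile, asa])
labels = (asa[:, 0] > 0.7).astype(int)  # toy binding-site labels

def window(feats, i, w=W):
    # Features of residues i-w .. i+w, zero-padded at the sequence ends.
    padded = np.pad(feats, ((w, w), (0, 0)))
    return padded[i : i + 2 * w + 1].ravel()

X = np.vstack([window(feats, i) for i in range(L)])
clf = SVC(kernel="rbf").fit(X, labels)  # independent per-residue classification
print(clf.predict(X[:10]))
```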