    Combining Kernel Functions in Supervised Learning Models.

    This research deals with supervised Machine Learning algorithms, specifically kernel methods. A kernel function is a positive definite function that corresponds to an inner product after the data is mapped from the original input space into a higher-dimensional Hilbert space. Unlike classical linear methods, which seek a linear function separating points in the input space, kernel methods share the same basic idea: the input data is mapped into a higher-dimensional feature space in which the new coordinates are never computed explicitly; only the inner products between input points are evaluated. In this way, kernel methods can handle data sets that are not linearly separable by applying linear models in the feature space, so any Machine Learning method that fits a linear function to the data can be used. Instead of employing a single kernel function, Multiple Kernel Learning algorithms tackle the problem of kernel selection by using a combination of preset base kernels. Infinite Kernel Learning extends this idea by exploiting a combination of possibly infinitely many base kernels. The core idea of this research is to use a novel complex combination of kernel functions in existing or modified supervised Machine Learning frameworks. Specifically, we considered two frameworks: the Extreme Learning Machine, which has the structure of a classical feedforward Neural Network but whose hidden-node parameters are randomly assigned at the beginning of the algorithm; and the Support Vector Machine, a class of linear algorithms based on the idea of separating data with a hyperplane having as wide a margin as possible. The first proposed model extends the classical Extreme Learning Machine formulation using a combination of possibly infinitely many base kernels and presents a two-step algorithm. The second result uses a preexisting multi-task kernel function in a novel Support Vector Machine framework. Multi-task learning is the Machine Learning problem of solving more than one task at the same time, with the main goal of taking into account the relationships among the tasks. To use the existing multi-task kernel function, we had to construct a new framework based on the classical Support Vector Machine, accounting for every multi-task correlation factor.
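As a rough illustration of the "combination of preset base kernels" idea described in the abstract above, the sketch below builds a non-negative weighted sum of three base kernels and its Gram matrix. The base kernels, their parameters, the weights, and the toy points are all assumptions chosen for illustration; the thesis learns the combination rather than fixing it by hand.

```python
import math

def linear_kernel(x, y):
    # Plain inner product in the input space.
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma=0.5):
    # Gaussian kernel; gamma is an illustrative value.
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq)

def poly_kernel(x, y, degree=2, coef0=1.0):
    # Inhomogeneous polynomial kernel.
    return (linear_kernel(x, y) + coef0) ** degree

def combined_kernel(x, y, kernels, weights):
    # A non-negative weighted sum of positive definite kernels is
    # itself positive definite, so negative weights are rejected.
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    return sum(w * k(x, y) for k, w in zip(kernels, weights))

def gram_matrix(points, kernel):
    # Matrix of pairwise kernel values; this is all a kernel method needs.
    return [[kernel(p, q) for q in points] for p in points]

# Hypothetical toy data: the four corners of the unit square.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
base = [linear_kernel, rbf_kernel, poly_kernel]
weights = [0.2, 0.5, 0.3]  # fixed by hand here, not learned
K = gram_matrix(points, lambda x, y: combined_kernel(x, y, base, weights))
```

A Multiple Kernel Learning method would treat `weights` as variables to optimize together with the classifier; the fixed values here only show how the combined Gram matrix is assembled.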

    Bayesian Kernel Methods for Non-Gaussian Distributions: Binary and Multi-class Classification Problems

    Project Objective: The objective of this project is to develop a Bayesian kernel model built around non-Gaussian prior distributions to address binary and multi-class classification problems. Recent advances in data mining have integrated kernel functions with Bayesian probabilistic analysis of Gaussian distributions. These machine learning approaches can incorporate prior information with new data to calculate probabilistic rather than deterministic values for unknown parameters. This paper extensively analyzes a specific Bayesian kernel model that uses a kernel function to calculate a posterior beta distribution that is conjugate to the prior beta distribution. Numerical testing of the beta kernel model on several benchmark data sets reveals that its accuracy is comparable with that of the support vector machine and the relevance vector machine, and that it runs more quickly than those algorithms. When one class occurs much more frequently than the other, the beta kernel model often outperforms other strategies for handling imbalanced data sets. If data arrive sequentially over time, the beta kernel model easily and quickly updates the probability distribution, and it is more accurate than an incremental support vector machine algorithm for online learning when fewer than 50 data points are available. U.S. Army Research Office. Sponsor/Monitor's Report Number(s): 61414-MA-II.3 W911NF-12-1-040
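The abstract above does not give the update formula, so the following is only one plausible reading of a kernel-weighted conjugate beta update. It assumes that each labelled training point adds its kernel similarity to the query point to the matching beta parameter; the RBF kernel, the uniform Beta(1, 1) prior, and the toy data are all hypothetical.

```python
import math

def rbf(x, y, gamma=1.0):
    # Similarity between a query point and a training point.
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def beta_posterior(x_new, data, alpha0=1.0, beta0=1.0, kernel=rbf):
    # Assumed update rule: start from the Beta(alpha0, beta0) prior and
    # add kernel weights as pseudo-counts for each class.
    alpha, beta = alpha0, beta0
    for x_i, y_i in data:
        weight = kernel(x_new, x_i)
        if y_i == 1:
            alpha += weight
        else:
            beta += weight
    return alpha, beta

def posterior_mean(alpha, beta):
    # Mean of Beta(alpha, beta): estimated probability of class 1.
    return alpha / (alpha + beta)

# Hypothetical one-dimensional data: (features, label).
data = [((0.0,), 0), ((0.2,), 0), ((1.0,), 1), ((1.2,), 1)]

p_high = posterior_mean(*beta_posterior((1.1,), data))
p_low = posterior_mean(*beta_posterior((0.1,), data))
p_prior = posterior_mean(*beta_posterior((0.5,), []))
```

Under this reading, a newly arrived observation only adds one more kernel weight to `alpha` or `beta`, which is consistent with the abstract's remark that sequential updates are quick.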

    in-depth analysis of SVM kernel learning and its components

    The performance of support vector machines in non-linearly-separable classification problems relies strongly on the kernel function. Towards an automatic machine learning approach for this technique, many research outputs have dealt with the challenge of automatically learning well-performing kernels for support vector machines. However, these works have been carried out without a thorough analysis of the set of components that influence the behavior of support vector machines and their interaction with the kernel. These components are related in an intricate way, and it is difficult to provide a comprehensible analysis of their joint effect. In this paper we try to fill this gap by introducing the steps needed to understand these interactions and by giving the research community clues about where to place the emphasis. First, we identify all the factors that affect the final performance of support vector machines in relation to the elicitation of kernels. Next, we analyze the factors independently or in pairs and study the influence each component has on the final classification performance, providing recommendations and insights into the kernel setting for support vector machines. IT1244-19 PID2019-104966GB-I0

    Prediction of student’s academic performance during online learning based on regression in support vector machine

    Since the Movement Control Order (MCO) was imposed in response to Covid-19, all universities have adopted and adapted online teaching and learning. This situation has affected students’ academic performance. Therefore, this paper employs Support Vector Machine (SVM) regression to predict students’ academic performance in online learning during the Covid-19 pandemic. The data was collected from undergraduate students of the Department of Mathematics, Faculty of Science and Mathematics, Sultan Idris Education University (UPSI). Students’ Cumulative Grade Point Average (CGPA) during online learning indicates their academic performance. The SVM algorithm was used to construct a prediction model of students’ academic performance. Two parameters of the SVM algorithm, namely C (cost) and epsilon, must be identified before further analysis. The best values of C (cost) and epsilon for SVM regression were 4 and 0.8, respectively. These parameters were then used with four kernels: the radial basis function kernel, the linear kernel, the polynomial kernel, and the sigmoid kernel. According to the findings, the best kernel is the radial basis function kernel, with the lowest number of support vectors (27) and the lowest Root Mean Square Error (RMSE), 0.2557. The results show that the pattern of predicted academic performance is similar to the current CGPA. Therefore, Support Vector Machine regression can predict students’ academic performance.
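To make the roles of epsilon and RMSE in the abstract above concrete, here is a small sketch of the epsilon-insensitive loss used in SVM regression and of the RMSE metric. The value epsilon = 0.8 comes from the abstract; the CGPA values are invented for illustration and are not the study's data.

```python
import math

def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.8):
    # Errors smaller than epsilon cost nothing; larger errors are
    # penalized linearly by the amount they exceed epsilon.
    return max(0.0, abs(y_true - y_pred) - epsilon)

def rmse(y_true, y_pred):
    # Root Mean Square Error over paired observations.
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

# Hypothetical CGPA values (actual) and model outputs (predicted).
actual = [3.50, 3.10, 2.80, 3.90]
predicted = [3.40, 3.30, 2.60, 3.70]

losses = [epsilon_insensitive_loss(t, p) for t, p in zip(actual, predicted)]
error = rmse(actual, predicted)
```

In SVM regression, only the training points whose error reaches or exceeds epsilon become support vectors, which is why the choice of epsilon influences the support vector count reported in the abstract.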

    Accuracy Analysis of Different Kernel Functions and Cost Values in Support Vector Machines: A Case Study of Rainfall Classification in Jakarta

    Abstract. This research compares several kernel functions, cost values, and training-data proportions for the Support Vector Machine with respect to the accuracy of rainfall classification in Jakarta. Linear, Gaussian, and polynomial kernel functions were applied to adapt the Support Vector Machine method to the non-linear cases that often occur in real conditions. The variables used in this study were temperature, humidity, sunshine, and wind speed. The analysis showed that the smallest number of support vectors did not give the highest accuracy for any kernel function. In addition, a training:testing proportion of 90%:10% gave slightly higher accuracy than a proportion of 80%:20% for each kernel function. Overall, the highest accuracy, 78.38%, was attained at the 90%:10% proportion by both the linear and polynomial kernel functions, for cost values of 1 and 1000. Keywords: cost, Gaussian, kernel, linear, polynomial

    Classification of Rural and Urban Villages in Semarang Regency Using Support Vector Machine (SVM)

    This research classifies regions by rural or urban status, which reflects the differences in characteristics and conditions between regions in Indonesia, using the Support Vector Machine (SVM) method. The classification works by building separating functions that involve a kernel function to map the input data into a higher-dimensional space. The Sequential Minimal Optimization (SMO) algorithm is used in the training process to obtain the optimal separating function (hyperplane). To determine the kernel function and parameters appropriate to the data, a grid search combined with leave-one-out cross-validation is used. The best accuracy obtained with SVM is 90%, using the Radial Basis Function (RBF) kernel with parameters C = 100 and γ = 2^-5. Keywords: classification, support vector machine, sequential minimal optimization, grid search, leave-one-out, cross validation, rural, urban
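The parameter selection procedure described above (grid search scored by leave-one-out cross-validation) can be sketched as follows. To keep the example short and dependency-free, the classifier is a simple kernel-weighted vote, which is a stand-in and not an SVM trained with SMO; with a real SVM the grid would carry a second axis for C. The toy data and the grid of powers of two are assumptions.

```python
import itertools
import math

def rbf(x, y, gamma):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def fit_kernel_vote(X, y, gamma):
    # Stand-in classifier (NOT an SVM): each training point votes for
    # its label in {-1, +1}, weighted by its RBF similarity to the query.
    def predict(x):
        score = sum(label * rbf(x, xi, gamma) for xi, label in zip(X, y))
        return 1 if score >= 0 else -1
    return predict

def loo_accuracy(fit, X, y, **params):
    # Leave-one-out: hold out each point once, train on the rest.
    hits = 0
    for i in range(len(X)):
        predict = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], **params)
        hits += predict(X[i]) == y[i]
    return hits / len(X)

def grid_search(fit, X, y, grid):
    # Try every parameter combination and keep the best LOO accuracy.
    names = sorted(grid)
    best_params, best_acc = None, -1.0
    for values in itertools.product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        acc = loo_accuracy(fit, X, y, **params)
        if acc > best_acc:
            best_params, best_acc = params, acc
    return best_params, best_acc

# Hypothetical two-class data.
X = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (1.0, 0.9), (0.8, 1.1), (1.2, 1.0)]
y = [-1, -1, -1, 1, 1, 1]
grid = {"gamma": [2.0 ** k for k in range(-5, 4)]}
best_params, best_acc = grid_search(fit_kernel_vote, X, y, grid)
```

Leave-one-out is expensive because it trains one model per sample for every grid point, but it uses almost all of the data for training each time, which matters for small data sets.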

    Prediction of protein binding sites in protein structures using hidden Markov support vector machine

    Background: Predicting the binding sites between two interacting proteins provides important clues to the function of a protein. Recent research on protein binding site prediction has been mainly based on widely known machine learning techniques, such as artificial neural networks, support vector machines, and conditional random fields. However, the prediction performance is still too low to be used in practice. It is necessary to explore new algorithms, theories and features to further improve the performance. Results: In this study, we introduce a novel machine learning model, the hidden Markov support vector machine, for protein binding site prediction. The model treats protein binding site prediction as a sequential labelling task based on the maximum margin criterion. Common features derived from protein sequences and structures, including protein sequence profile and residue accessible surface area, are used to train the hidden Markov support vector machine. When tested on six data sets, the method based on the hidden Markov support vector machine shows better performance than some state-of-the-art methods, including artificial neural networks, support vector machines and conditional random fields. Furthermore, its running time is several orders of magnitude shorter than that of the compared methods. Conclusion: The improved prediction performance and computational efficiency of the method based on the hidden Markov support vector machine can be attributed to the following three factors. Firstly, the relation between labels of neighbouring residues is useful for protein binding site prediction. Secondly, the kernel trick is very advantageous to this field. Thirdly, the complexity of the training step for the hidden Markov support vector machine is linear in the number of training samples when the cutting-plane algorithm is used.

    Hypoglycaemia detection for type 1 diabetic patients based on ECG parameters using Fuzzy Support Vector Machine

    Nocturnal hypoglycaemia in type 1 diabetic patients can be dangerous because symptoms may not be apparent while the blood glucose level falls very low; for this reason, an effective detection system for hypoglycaemia is crucial. This research work proposes a hypoglycaemia detection system based on the classification of electrocardiographic (ECG) parameters. The classification uses a Fuzzy Support Vector Machine (FSVM) with heart rate, corrected QT (QTc) interval and corrected TpTe (TpTec) interval as inputs. Three types of kernel functions (radial basis function (RBF), exponential radial basis function (ERBF) and polynomial function) are investigated in the classification. Moreover, the parameters of the kernel functions are tuned to optimize the classification. The results show that FSVM classification with the RBF kernel function performs better than SVM. However, both classifiers give approximately the same performance when the ERBF and polynomial kernel functions are used. © 2010 IEEE
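The three kernel families named in the abstract above can be written out as in the sketch below. The abstract does not state the exact parameterizations, so the forms here follow commonly used definitions and are assumptions: the ERBF kernel uses the Euclidean distance where the RBF kernel uses its square. The feature values are hypothetical, already-normalized stand-ins for heart rate, QTc and TpTec.

```python
import math

def sq_dist(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))

def rbf_kernel(x, y, sigma=1.0):
    # exp(-||x - y||^2 / (2 sigma^2))
    return math.exp(-sq_dist(x, y) / (2.0 * sigma ** 2))

def erbf_kernel(x, y, sigma=1.0):
    # exp(-||x - y|| / (2 sigma^2)); decays more slowly at large distances.
    return math.exp(-math.sqrt(sq_dist(x, y)) / (2.0 * sigma ** 2))

def polynomial_kernel(x, y, degree=3, coef0=1.0):
    # (x . y + coef0)^degree
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree

# Hypothetical normalized feature vectors: (heart rate, QTc, TpTec).
sample_a = (0.2, 0.5, 0.1)
sample_b = (0.9, 0.4, 0.7)

values = {
    "rbf": rbf_kernel(sample_a, sample_b),
    "erbf": erbf_kernel(sample_a, sample_b),
    "poly": polynomial_kernel(sample_a, sample_b),
}
```

The width `sigma` and the polynomial `degree` are the kind of kernel parameters the abstract says were tuned.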

    MapReduce-iterative support vector machine classifier: novel fraud detection systems in healthcare insurance industry

    Fraud in healthcare insurance claims is one of the significant research challenges affecting the growth of healthcare services. Healthcare fraud is committed by subscribers, companies and providers. A decision support system is developed to automate the processing of claim data from service providers and to ease the difficulties patients face. This paper proposes a novel hybrid of big data and statistical machine learning techniques, named the MapReduce-based iterative support vector machine (MR-ISVM), that provides a set of sophisticated steps for the automatic detection of fraudulent claims in health insurance databases. The experimental results show that the MR-ISVM classifier outperforms other support vector machine (SVM) kernel classifiers in classification and detection. The results also show a reduction in the computational time for processing healthcare insurance claims without compromising classification accuracy. The proposed MR-ISVM classifier achieves 87.73% accuracy, compared with 75.3% for the linear kernel and 79.98% for the radial basis function kernel.