299,449 research outputs found

    A Hybrid Approach Support Vector Machine (SVM) – Neuro Fuzzy for Fast Data Classification

    Full text link
    Over the past decade, the support vector machine (SVM) has been a machine learning method widely used in several application domains, owing to its good performance on data classification problems, particularly in the non-linear case. Nevertheless, several studies have indicated that SVM still has some shortcomings, especially its high time complexity in the testing phase, caused by the growing number of support vectors for high-dimensional data. To address this problem, we propose a hybrid SVM – Neuro Fuzzy approach (SVMNF), in which the neuro-fuzzy component is used to avoid the influence of the support vectors in the testing phase of SVM. Moreover, our approach is also equipped with feature selection, which reduces the number of data attributes in the testing phase and thus improves computation time. Based on our evaluation on real benchmark datasets, our approach outperformed SVM in the testing phase on data classification problems without significantly affecting SVM's accuracy.
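
    As a rough illustration of the pipeline this abstract describes (and not the authors' exact method), the sketch below trains an SVM once, fits a cheaper surrogate model to the SVM's decisions so that the support vectors are not needed at test time, and applies feature selection beforehand. The MLPClassifier standing in for the neuro-fuzzy network and the SelectKBest setting k=10 are assumptions made for illustration.

    # Minimal sketch in Python/scikit-learn of the hybrid idea: the SVM is used only
    # during training; a fast surrogate (an MLP standing in for the neuro-fuzzy model,
    # an assumption on our part) reproduces its decisions in the testing phase, on a
    # reduced attribute set chosen by feature selection.
    from sklearn.datasets import load_breast_cancer
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier
    from sklearn.svm import SVC

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Feature selection: keep only the k most informative attributes (k is arbitrary here).
    selector = SelectKBest(f_classif, k=10).fit(X_train, y_train)
    X_train_sel, X_test_sel = selector.transform(X_train), selector.transform(X_test)

    # Train the SVM once on the reduced feature set.
    svm = SVC(kernel="rbf").fit(X_train_sel, y_train)

    # Fit the surrogate to the SVM's outputs; only this model is used at test time,
    # so the test-phase cost no longer depends on the number of support vectors.
    surrogate = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
    surrogate.fit(X_train_sel, svm.predict(X_train_sel))

    print("surrogate test accuracy:", surrogate.score(X_test_sel, y_test))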

    New Covariance-Based Feature Extraction Methods for Classification and Prediction of High-Dimensional Data

    Get PDF
    When analyzing high-dimensional data sets, it is often necessary to implement feature extraction methods in order to capture relevant discriminating information useful for the purposes of classification and prediction. The relevant information can typically be represented in lower-dimensional feature spaces, and a widely used approach for this is principal component analysis (PCA). PCA efficiently compresses information into lower dimensions; however, studies indicate that it is not optimal for feature extraction, especially when dealing with classification problems. Furthermore, for high-dimensional data with limited observations, as is typically the case with remote sensing data and nonstationary data such as financial data, covariance matrix estimation becomes unreliable, and this adversely affects the representation of the data in the PCA domain. In this thesis, we first introduce a new feature extraction method called summed component analysis (SCA), which makes use of the structure of the eigenvectors of the common covariance matrix to generate new features as sums of certain original features. Secondly, we present a variation of SCA, known as class summed component analysis (CSCA). CSCA takes advantage of the relative ease of computing the class covariance matrices and uses them to determine data transformations. Since the new features consist of simple sums of the original features, we are able to gain a conceptual meaning of the new representation of the data, which is appealing for the man-machine interface. We evaluate these methods on data sets with varying sample sizes and on financial time series, and are able to show improved classification and prediction accuracies.
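
    As a hedged sketch of how "sums of certain original features" guided by covariance eigenvectors might look, the code below groups together features whose loadings on a leading eigenvector of the common covariance matrix share a sign and exceed a threshold, and sums each group into a new feature. The grouping rule, the threshold, and the name summed_components are assumptions for illustration only; the thesis defines its own construction.

    # Illustrative sketch: build new features as plain sums of original features,
    # with the grouping read off the eigenvector structure of the covariance matrix.
    import numpy as np

    def summed_components(X, n_components=3, threshold=0.2):
        """Return summed features and the index groups that define them."""
        cov = np.cov(X, rowvar=False)                 # common covariance matrix
        eigvals, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
        leading = eigvecs[:, ::-1][:, :n_components]  # leading eigenvectors
        groups, features = [], []
        for j in range(n_components):
            v = leading[:, j]
            for sign in (+1, -1):
                idx = np.where(sign * v > threshold)[0]     # features that load together
                if idx.size:
                    groups.append(idx)
                    features.append(X[:, idx].sum(axis=1))  # simple sum of original features
        return np.column_stack(features), groups

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 12))
    Z, groups = summed_components(X)
    print(Z.shape, [g.tolist() for g in groups])

    Because each new feature is just a sum of identifiable original features, the groups list keeps the transformed representation easy to interpret, which is the point the abstract makes about conceptual meaning.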

    Exploiting synthetically generated data with semi-supervised learning for small and imbalanced datasets

    Get PDF
    Data augmentation is rapidly gaining attention in machine learning. Synthetic data can be generated by simple transformations or through the data distribution. In the latter case, the main challenge is to estimate the label associated with new synthetic patterns. This paper studies the effect of generating synthetic data by convex combination of patterns and the use of these as unsupervised information in a semi-supervised learning framework with support vector machines, thus avoiding the need to label synthetic examples. We perform experiments on a total of 53 binary classification datasets. Our results show that this type of data over-sampling supports the well-known cluster assumption in semi-supervised learning, showing outstanding results for small high-dimensional datasets and imbalanced learning problems.
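
    A hedged sketch of the augmentation scheme described here: synthetic patterns are built as convex combinations of random pairs of training points and passed to a semi-supervised learner as unlabeled data (label -1), so they never need to be labeled. Self-training around an SVC is used as a stand-in for the paper's semi-supervised SVM formulation, and the toy dataset and sample counts are arbitrary assumptions.

    # Convex-combination over-sampling used as unlabeled data for semi-supervised learning.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.semi_supervised import SelfTrainingClassifier
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X, y = make_classification(n_samples=80, n_features=20, weights=[0.8, 0.2],
                               random_state=0)  # small, imbalanced toy set

    # Synthetic patterns: convex combinations of random pairs of existing patterns.
    i, j = rng.integers(len(X), size=(2, 200))
    lam = rng.uniform(size=(200, 1))
    X_syn = lam * X[i] + (1 - lam) * X[j]

    # Stack real (labeled) and synthetic (unlabeled, marked with -1) data.
    X_all = np.vstack([X, X_syn])
    y_all = np.concatenate([y, -np.ones(len(X_syn), dtype=int)])

    # Self-training with an SVC base estimator; the synthetic points shape the decision
    # boundary without ever receiving hand-assigned labels.
    model = SelfTrainingClassifier(SVC(kernel="rbf", probability=True, random_state=0))
    model.fit(X_all, y_all)
    print("accuracy on the labeled part:", model.score(X, y))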