
    Two-Stage Fuzzy Multiple Kernel Learning Based on Hilbert-Schmidt Independence Criterion

    Multiple kernel learning (MKL) is a principled approach to kernel combination and selection for a variety of learning tasks, such as classification, clustering, and dimensionality reduction. In this paper, we develop a novel fuzzy multiple kernel learning model based on the Hilbert-Schmidt independence criterion (HSIC) for classification, which we call HSIC-FMKL. In this model, we first propose an HSIC Lasso-based MKL formulation, which not only has a clear statistical interpretation, namely that minimally redundant kernels with maximum dependence on the output labels are found and combined, but also enables the global optimal solution to be computed efficiently by solving a Lasso optimization problem. Since the traditional support vector machine (SVM) is sensitive to outliers and noise in the dataset, a fuzzy SVM (FSVM) is used to select the prediction hypothesis once the optimal kernel has been obtained. The main advantage of FSVM is that we can associate a fuzzy membership with each data point, so that different data points can have different effects on the training of the learning machine. We propose a new fuzzy membership function using a heuristic strategy based on the HSIC. The proposed HSIC-FMKL is a two-stage kernel learning approach, and the HSIC is applied in both stages. We perform extensive experiments on real-world datasets from the UCI benchmark repository and from the application domain of computational biology, which validate the superiority of the proposed model in terms of prediction accuracy.
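
    The core quantities here are computable directly: the empirical HSIC between a candidate Gram matrix and the label kernel, and an HSIC-Lasso step that can be phrased as a non-negative Lasso over vectorised, centred Gram matrices. The Python sketch below illustrates that reading; the function names and the regularisation parameter lam are illustrative assumptions, it is not the authors' HSIC-FMKL code, and the FSVM stage is omitted.

        import numpy as np
        from sklearn.linear_model import Lasso

        def center(K):
            # Doubly centre a Gram matrix: H K H with H = I - (1/n) 11^T.
            n = K.shape[0]
            H = np.eye(n) - np.ones((n, n)) / n
            return H @ K @ H

        def hsic(K, L):
            # Empirical HSIC between two Gram matrices.
            n = K.shape[0]
            return np.trace(center(K) @ center(L)) / (n - 1) ** 2

        def hsic_lasso_weights(kernels, y, lam=1e-3):
            # Regress the centred label kernel on the centred candidate kernels
            # with a non-negative L1 penalty; non-zero coefficients pick the
            # minimally redundant, maximally label-dependent kernels.
            y = np.asarray(y)
            L = center(np.equal.outer(y, y).astype(float))
            X = np.column_stack([center(K).ravel() for K in kernels])
            model = Lasso(alpha=lam, positive=True, fit_intercept=False)
            model.fit(X, L.ravel())
            return model.coef_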

    Three-way Imbalanced Learning based on Fuzzy Twin SVM

    Three-way decision (3WD) is a powerful tool in granular computing for dealing with uncertain data, and it is commonly used in information systems, decision-making, and medical care. Three-way decision has been studied extensively in traditional rough set models, but it is rarely combined with the currently popular field of machine learning. In this paper, three-way decision is connected with SVM, a standard binary classification model in machine learning, to address the imbalanced classification problems that standard SVM handles poorly. A new three-way fuzzy membership function and a new fuzzy twin support vector machine with three-way membership (TWFTSVM) are proposed. The new three-way fuzzy membership function is defined to increase the certainty of uncertain data in both the input space and the feature space, and it assigns higher fuzzy membership to minority samples than to majority samples. To evaluate the effectiveness of the proposed model, comparative experiments are designed for forty-seven different datasets with varying imbalance ratios. In addition, datasets with different imbalance ratios are derived from the same dataset to further assess the proposed model's performance. The results show that the proposed model significantly outperforms other traditional SVM-based methods.
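
    As a rough illustration of the membership assignment argued for here, the sketch below (an assumed scheme, not the paper's exact three-way membership function) gives each point a membership that decays with its distance from its own class centre and then scales minority-class points up by the class-imbalance ratio.

        import numpy as np

        def imbalance_aware_memberships(X, y, delta=1e-6):
            # Membership decays with distance to the own-class centre and is
            # scaled up for minority classes (illustrative scheme only).
            y = np.asarray(y)
            m = np.empty(len(y), dtype=float)
            counts = {c: int(np.sum(y == c)) for c in np.unique(y)}
            n_max = max(counts.values())
            for c, n_c in counts.items():
                idx = np.where(y == c)[0]
                centre = X[idx].mean(axis=0)
                d = np.linalg.norm(X[idx] - centre, axis=1)
                m[idx] = (1.0 - d / (d.max() + delta)) * (n_max / n_c)
            return m / m.max()  # normalise into (0, 1]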

    Rails Quality Data Modelling via Machine Learning-Based Paradigms


    Fuzzy Support Vector Machine Using Function Linear Membership and Exponential with Mahalanobis Distance

    Support vector machine (SVM) is an effective binary classification technique based on the structural risk minimization (SRM) principle and is known as one of the most successful classification methods. Real-life data, however, contain noise and outliers, which confuse the SVM during training. In this research, the SVM is extended with a fuzzy membership function to lessen the effect of noise and outliers when determining the separating hyperplane. Distance calculation is central to the fuzzy membership, since it quantifies the proximity between data elements; in general, the membership is built on the distance between each point and the centre of its class. The fuzzy support vector machine (FSVM) studied here uses the Mahalanobis distance with the goal of finding the best hyperplane separating the defined classes. The data are split into training and testing sets over several partition percentages. Although FSVM is theoretically able to overcome noise and outliers, the results show that the accuracy of FSVM, namely 0.017170689 and 0.018668421, is lower than the accuracy of the classical SVM method, which is 0.018838348. The fuzzy membership function strongly influences the resulting hyperplane, so determining the correct fuzzy membership is critical in FSVM.
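
    A minimal sketch of the distance side of this idea is given below: memberships computed from each point's Mahalanobis distance to its own class centre, with a linear decay (an exponential variant is indicated in a comment). The function name and decay forms are assumptions, not the paper's exact membership definitions.

        import numpy as np

        def mahalanobis_memberships(X, y, delta=1e-6):
            # Membership decays with the Mahalanobis distance of each point
            # from its own class centre (illustrative linear decay).
            y = np.asarray(y)
            m = np.empty(len(y), dtype=float)
            for c in np.unique(y):
                idx = np.where(y == c)[0]
                Xc = X[idx]
                centre = Xc.mean(axis=0)
                cov_inv = np.linalg.pinv(np.cov(Xc, rowvar=False))
                diff = Xc - centre
                d = np.sqrt(np.einsum('ij,jk,ik->i', diff, cov_inv, diff))
                m[idx] = 1.0 - d / (d.max() + delta)
                # exponential variant: m[idx] = 2.0 / (1.0 + np.exp(beta * d))
            return m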

    A Novel Scheme for Accelerating Support Vector Clustering

    Limited by two time-consuming steps, solving the optimization problem and labeling the data points with cluster labels, support vector clustering (SVC) based algorithms perform poorly on large datasets. This paper presents a novel scheme aimed at solving these two problems and accelerating SVC. Firstly, an innovative definition of noise data points is proposed, which can be applied in the design of noise elimination to reduce the size of a dataset and improve its separability without destroying its profile. Secondly, for cluster labeling, a double centroids (DBC) labeling method is presented, which represents each cell of a cluster by the centroids of shape and density. This method accelerates the labeling procedure and addresses the problem of labeling original datasets with irregular or imbalanced distributions. Compared with state-of-the-art algorithms, the experimental results show that the proposed method significantly reduces the computational resources and improves the accuracy. Further analysis and experiments on semi-supervised cluster labeling confirm that the proposed DBC model is suitable for representing cells in clustering.
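
    One plausible reading of the double-centroids idea is to summarise each cell by a plain mean (shape) and a density-weighted mean (density). The sketch below follows that reading with a k-NN density estimate; the paper's exact definition may differ.

        import numpy as np
        from sklearn.neighbors import NearestNeighbors

        def double_centroids(cell, k=5):
            # Shape centroid: plain mean of the points in the cell.
            shape_c = cell.mean(axis=0)
            if len(cell) < 2:
                return shape_c, shape_c
            # Density centroid: mean weighted by an inverse mean k-NN distance.
            nn = NearestNeighbors(n_neighbors=min(k + 1, len(cell))).fit(cell)
            dist, _ = nn.kneighbors(cell)
            density = 1.0 / (dist[:, 1:].mean(axis=1) + 1e-12)
            density_c = (cell * density[:, None]).sum(axis=0) / density.sum()
            return shape_c, density_c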

    Optimization of Fuzzy Support Vector Machine (FSVM) Performance by Distance-Based Similarity Measure Classification

    This research aims to optimize the performance of the Fuzzy Support Vector Machine (FSVM) algorithm by selecting an appropriate membership function and distance measure. SVM is considered an effective method of data classification, but it is less effective on large and complex data because of its sensitivity to outliers and noise. One of the techniques used to overcome this weakness is fuzzy logic, where choosing the right membership function significantly affects the performance of the FSVM algorithm. This research was carried out using the Gaussian membership function and distance-based similarity measures consisting of the Euclidean, Manhattan, Chebyshev, and Minkowski distances. Subsequently, the FSVM classification process was evaluated using four proposed FSVM models, with the standard SVM as a comparison reference. The results show that the method tends to eliminate the impact of noise and effectively enhance classification accuracy. FSVM provides the best and highest accuracy value of 94% at a penalty parameter value of 1000 using the Chebyshev distance. Furthermore, the proposed model is compared with the performance evaluation models of preliminary studies. The results further show that using FSVM with the Chebyshev distance and a Gaussian membership function provides a better performance evaluation value. Doi: 10.28991/HIJ-2021-02-04-02
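
    For concreteness, the sketch below shows the distance functions compared here (Manhattan, Euclidean, and Chebyshev are the p = 1, p = 2, and p -> infinity cases of the Minkowski distance) and a Gaussian membership computed from each point's distance to its own-class centre. The parameterisation (sigma, and the choice of class centre) is an assumption, not the paper's exact setup.

        import numpy as np

        def minkowski_distance(a, b, p=2):
            # p = 1 gives Manhattan, p = 2 gives Euclidean.
            return np.sum(np.abs(a - b) ** p) ** (1.0 / p)

        def chebyshev_distance(a, b):
            # Limit p -> infinity of the Minkowski distance.
            return np.max(np.abs(a - b))

        def gaussian_membership(X, y, sigma=1.0, distance=chebyshev_distance):
            # Gaussian membership from the distance to the own-class centre.
            y = np.asarray(y)
            m = np.empty(len(y), dtype=float)
            for c in np.unique(y):
                idx = np.where(y == c)[0]
                centre = X[idx].mean(axis=0)
                d = np.array([distance(x, centre) for x in X[idx]])
                m[idx] = np.exp(-(d ** 2) / (2.0 * sigma ** 2))
            return m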

    FCS-MBFLEACH: Designing an Energy-Aware Fault Detection System for Mobile Wireless Sensor Networks

    Wireless sensor networks (WSNs) consist of large numbers of sensor nodes that are densely and randomly distributed over a geographical region for monitoring, identifying, and analyzing physical events. The crucial challenge in wireless sensor networks is that the sensor nodes depend heavily on limited, non-rechargeable battery power to exchange information wirelessly, which makes managing and monitoring these nodes for abnormal changes very difficult. These anomalies arise from faults, including hardware and software faults, as well as attacks by intruders, all of which affect the comprehensiveness of the data collected by wireless sensor networks. Hence, crucial measures should be taken to detect faults in the network early, despite the limitations of the sensor nodes. Machine learning offers solutions that can be used to detect sensor node faults in the network. The purpose of this study is to use several classification methods, such as MB-FLEACH, one-class support vector machine (SVM), fuzzy one-class SVM, and the combined FCS-MBFLEACH method, to compute the fault detection accuracy at different node densities under two scenarios in the regions of interest. It should be noted that, in the studies so far, no super cluster head (SCH) selection has been performed to detect node faults in the network. The simulation outcomes demonstrate that the FCS-MBFLEACH method has the best performance in terms of fault detection accuracy, false-positive rate (FPR), average remaining energy, and network lifetime compared to the other classification methods.
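
    As a rough illustration of the one-class SVM component of such fault detection (the WSN clustering and SCH selection are outside this sketch), the snippet below trains scikit-learn's OneClassSVM on readings collected during normal operation and flags out-of-distribution readings as faults. The feature layout and parameter values are assumptions, not the paper's simulation setup.

        import numpy as np
        from sklearn.svm import OneClassSVM

        rng = np.random.default_rng(0)
        # Synthetic sensor readings: columns are, say, temperature and humidity.
        normal = rng.normal(loc=[25.0, 40.0], scale=[0.5, 1.0], size=(500, 2))
        faulty = rng.normal(loc=[60.0, 5.0], scale=[5.0, 2.0], size=(20, 2))

        # Train on normal behaviour only; predict() returns +1 (normal) or -1 (fault).
        detector = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(normal)
        print(detector.predict(np.vstack([normal[:5], faulty[:5]])))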

    A Support Vector Classifier Based on Vague Similarity Measure

    Support vector machine (SVM) is a popular machine learning method because of its high generalization ability. How to find an adaptive kernel function is a key problem for SVM, from theory to practical applications. This paper proposes a support vector classifier based on a vague sigmoid kernel and its similarity measure. The proposed method uses the characteristics of vague sets and replaces the traditional inner product with a vague similarity measure between training samples. The experimental results show that the proposed method can reduce the CPU time while maintaining the classification accuracy.
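
    The core move, replacing the inner product inside the sigmoid kernel with a similarity score, can be sketched as a callable kernel for scikit-learn's SVC, as below. The similarity function shown is a simple placeholder, not the paper's vague-set similarity measure, and alpha and c are assumed parameters.

        import numpy as np
        from sklearn.svm import SVC

        def similarity(u, v):
            # Placeholder similarity in [0, 1]; assumes features scaled to [0, 1].
            return 1.0 - np.mean(np.abs(u - v))

        def sigmoid_similarity_kernel(X, Y, alpha=1.0, c=0.0):
            # Sigmoid-style kernel with the inner product replaced by a similarity score.
            G = np.array([[similarity(x, y) for y in Y] for x in X])
            return np.tanh(alpha * G + c)

        clf = SVC(kernel=sigmoid_similarity_kernel)  # scikit-learn accepts a callable kernel
        # clf.fit(X_train, y_train); clf.predict(X_test)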

    SCDT: FC-NNC-structured Complex Decision Technique for Gene Analysis Using Fuzzy Cluster based Nearest Neighbor Classifier

    In the classification of many diseases, accurate gene analysis is needed, for which the selection of the most informative genes is very important; this requires a decision technique that works in a complex, ambiguous context. Traditional methods for selecting the most significant genes include statistical analyses such as the 2-sample t-test (2STT), entropy, and the signal-to-noise ratio (SNR). This paper evaluates gene selection and classification based on accurate gene selection using a structured complex decision technique (SCDT) and classifies the selected genes using a fuzzy cluster based nearest neighbor classifier (FC-NNC). The effectiveness of the proposed SCDT and FC-NNC is evaluated using the leave-one-out cross-validation (LOOCV) metric along with sensitivity, specificity, precision, and F1-score, against four different classifiers, namely 1) radial basis function (RBF), 2) multi-layer perceptron (MLP), 3) feed-forward (FF) network, and 4) support vector machine (SVM), on three different datasets: DLBCL, Leukemia, and Prostate tumor. The proposed SCDT & FC-NNC exhibits superior results and can be considered the more accurate decision mechanism.
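
    For reference, the two classical filter scores named above (the signal-to-noise ratio and the 2-sample t-test) can be computed per gene as in the sketch below; this is the traditional baseline the paper compares against, not the proposed SCDT or FC-NNC.

        import numpy as np
        from scipy.stats import ttest_ind

        def rank_genes(X, y, top_k=50):
            # X: samples x genes expression matrix, y: binary class labels (0/1).
            X0, X1 = X[y == 0], X[y == 1]
            mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
            s0, s1 = X0.std(axis=0, ddof=1), X1.std(axis=0, ddof=1)
            snr = np.abs(mu0 - mu1) / (s0 + s1 + 1e-12)              # signal-to-noise ratio
            t_stat, _ = ttest_ind(X0, X1, axis=0, equal_var=False)   # per-gene 2-sample t-test
            return np.argsort(-snr)[:top_k], np.argsort(-np.abs(t_stat))[:top_k]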