167 research outputs found

    A Novel Fuzzy c -Means Clustering Algorithm Using Adaptive Norm

    Get PDF
    Abstract(#br)The fuzzy c -means (FCM) clustering algorithm is an unsupervised learning method that has been widely applied to cluster unlabeled data automatically instead of artificially, but is sensitive to noisy observations due to its inappropriate treatment of noise in the data. In this paper, a novel method considering noise intelligently based on the existing FCM approach, called adaptive-FCM and its extended version (adaptive-REFCM) in combination with relative entropy, are proposed. Adaptive-FCM, relying on an inventive integration of the adaptive norm, benefits from a robust overall structure. Adaptive-REFCM further integrates the properties of the relative entropy and normalized distance to preserve the global details of the dataset. Several experiments are carried out,..

    Three-way Imbalanced Learning based on Fuzzy Twin SVM

    Full text link
    Three-way decision (3WD) is a powerful tool for granular computing to deal with uncertain data, commonly used in information systems, decision-making, and medical care. Three-way decision gets much research in traditional rough set models. However, three-way decision is rarely combined with the currently popular field of machine learning to expand its research. In this paper, three-way decision is connected with SVM, a standard binary classification model in machine learning, for solving imbalanced classification problems that SVM needs to improve. A new three-way fuzzy membership function and a new fuzzy twin support vector machine with three-way membership (TWFTSVM) are proposed. The new three-way fuzzy membership function is defined to increase the certainty of uncertain data in both input space and feature space, which assigns higher fuzzy membership to minority samples compared with majority samples. To evaluate the effectiveness of the proposed model, comparative experiments are designed for forty-seven different datasets with varying imbalance ratios. In addition, datasets with different imbalance ratios are derived from the same dataset to further assess the proposed model's performance. The results show that the proposed model significantly outperforms other traditional SVM-based methods

    Semi-supervised cross-entropy clustering with information bottleneck constraint

    Full text link
    In this paper, we propose a semi-supervised clustering method, CEC-IB, that models data with a set of Gaussian distributions and that retrieves clusters based on a partial labeling provided by the user (partition-level side information). By combining the ideas from cross-entropy clustering (CEC) with those from the information bottleneck method (IB), our method trades between three conflicting goals: the accuracy with which the data set is modeled, the simplicity of the model, and the consistency of the clustering with side information. Experiments demonstrate that CEC-IB has a performance comparable to Gaussian mixture models (GMM) in a classical semi-supervised scenario, but is faster, more robust to noisy labels, automatically determines the optimal number of clusters, and performs well when not all classes are present in the side information. Moreover, in contrast to other semi-supervised models, it can be successfully applied in discovering natural subgroups if the partition-level side information is derived from the top levels of a hierarchical clustering

    Support vector machines to detect physiological patterns for EEG and EMG-based human-computer interaction:a review

    Get PDF
    Support vector machines (SVMs) are widely used classifiers for detecting physiological patterns in human-computer interaction (HCI). Their success is due to their versatility, robustness and large availability of free dedicated toolboxes. Frequently in the literature, insufficient details about the SVM implementation and/or parameters selection are reported, making it impossible to reproduce study analysis and results. In order to perform an optimized classification and report a proper description of the results, it is necessary to have a comprehensive critical overview of the applications of SVM. The aim of this paper is to provide a review of the usage of SVM in the determination of brain and muscle patterns for HCI, by focusing on electroencephalography (EEG) and electromyography (EMG) techniques. In particular, an overview of the basic principles of SVM theory is outlined, together with a description of several relevant literature implementations. Furthermore, details concerning reviewed papers are listed in tables and statistics of SVM use in the literature are presented. Suitability of SVM for HCI is discussed and critical comparisons with other classifiers are reported

    UNFIS: A Novel Neuro-Fuzzy Inference System with Unstructured Fuzzy Rules for Classification

    Full text link
    An important constraint of Fuzzy Inference Systems (FIS) is their structured rules defined based on evaluating all input variables. Indeed, the length of all fuzzy rules and the number of input variables are equal. However, in many decision-making problems evaluating some conditions on a limited set of input variables is sufficient to decide properly (unstructured rules). Therefore, this constraint limits the performance, generalization, and interpretability of the FIS. To address this issue, this paper presents a neuro-fuzzy inference system for classification applications that can select different sets of input variables for constructing each fuzzy rule. To realize this capability, a new fuzzy selector neuron with an adaptive parameter is proposed that can select input variables in the antecedent part of each fuzzy rule. Moreover, in this paper, the consequent part of the Takagi-Sugeno-Kang FIS is also changed properly to consider only the selected set of input variables. To learn the parameters of the proposed architecture, a trust-region-based learning method (General quasi-Levenberg-Marquardt (GqLM)) is proposed to minimize cross-entropy in multiclass problems. The performance of the proposed method is compared with some related previous approaches in some real-world classification problems. Based on these comparisons the proposed method has better or very close performance with a parsimonious structure consisting of unstructured fuzzy

    Disease diagnosis in smart healthcare: Innovation, technologies and applications

    Get PDF
    To promote sustainable development, the smart city implies a global vision that merges artificial intelligence, big data, decision making, information and communication technology (ICT), and the internet-of-things (IoT). The ageing issue is an aspect that researchers, companies and government should devote efforts in developing smart healthcare innovative technology and applications. In this paper, the topic of disease diagnosis in smart healthcare is reviewed. Typical emerging optimization algorithms and machine learning algorithms are summarized. Evolutionary optimization, stochastic optimization and combinatorial optimization are covered. Owning to the fact that there are plenty of applications in healthcare, four applications in the field of diseases diagnosis (which also list in the top 10 causes of global death in 2015), namely cardiovascular diseases, diabetes mellitus, Alzheimer’s disease and other forms of dementia, and tuberculosis, are considered. In addition, challenges in the deployment of disease diagnosis in healthcare have been discussed

    Spectral-spatial approaches for hyperspectral data classification

    Get PDF
    Classification of hyperspectral data is very challenging and mapping of land cover is one of its applications. Improving the classification accuracy and computation time of hyperspectral data were achieved incorporating contextual information in combination with spectral information for correcting classification errors along class boundaries and within class. In the proposed method, the original hyperspectral image was first classified using the Support Vector Machine (SVM) classifier, followed by the Markov Random Field (MRF) approach applied to the boundary areas and Unsupervised Extraction and Classification of Homogeneous Objects (UnECHO) classifier used for the interior parts of regions to produce the final classification map. In this study two agricultural (Hyperion and AVIRIS) and one urban (ROSIS) datasets were used. Investigations of the spectral and various contextual approaches including feature reduction show that the SVM-MRF method with grid search works best for all of the datasets. The highest overall accuracy of 97.35% was achieved for the urban dataset.Natural Sciences and Engineering Research Council of Canada (NSERC) and the University of Lethbridge

    A survey on online active learning

    Full text link
    Online active learning is a paradigm in machine learning that aims to select the most informative data points to label from a data stream. The problem of minimizing the cost associated with collecting labeled observations has gained a lot of attention in recent years, particularly in real-world applications where data is only available in an unlabeled form. Annotating each observation can be time-consuming and costly, making it difficult to obtain large amounts of labeled data. To overcome this issue, many active learning strategies have been proposed in the last decades, aiming to select the most informative observations for labeling in order to improve the performance of machine learning models. These approaches can be broadly divided into two categories: static pool-based and stream-based active learning. Pool-based active learning involves selecting a subset of observations from a closed pool of unlabeled data, and it has been the focus of many surveys and literature reviews. However, the growing availability of data streams has led to an increase in the number of approaches that focus on online active learning, which involves continuously selecting and labeling observations as they arrive in a stream. This work aims to provide an overview of the most recently proposed approaches for selecting the most informative observations from data streams in the context of online active learning. We review the various techniques that have been proposed and discuss their strengths and limitations, as well as the challenges and opportunities that exist in this area of research. Our review aims to provide a comprehensive and up-to-date overview of the field and to highlight directions for future work
    • …
    corecore