167 research outputs found

A Novel Fuzzy c-Means Clustering Algorithm Using Adaptive Norm

Author: Baihua Chen
Dexin Wang
Jinyan Pan
Yunlong Gao
Zhihao Wang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 05/11/2019

The fuzzy c-means (FCM) clustering algorithm is an unsupervised learning method that has been widely applied to cluster unlabeled data automatically rather than manually, but it is sensitive to noisy observations because it treats noise in the data inappropriately. This paper proposes a noise-aware method built on the existing FCM approach, called adaptive-FCM, together with an extended version, adaptive-REFCM, that incorporates relative entropy. Adaptive-FCM, relying on an inventive integration of the adaptive norm, benefits from a robust overall structure. Adaptive-REFCM further integrates the properties of the relative entropy and normalized distance to preserve the global details of the dataset. Several experiments are carried out,..

Xiamen University Institutional Repository
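
For readers unfamiliar with the baseline that the entry above extends, the following is a minimal sketch of standard fuzzy c-means with Euclidean distance. It is not the adaptive-norm or relative-entropy variant described in the abstract, whose details are not given here. The function names, the farthest-point initialisation, and the default fuzzifier m = 2 are assumptions made for illustration.

```python
import math


def farthest_point_init(points, c):
    """Deterministic initialisation (an assumption, not part of FCM itself):
    start at the first point, then repeatedly add the point farthest from
    the centers chosen so far."""
    centers = [list(points[0])]
    while len(centers) < c:
        nxt = max(points, key=lambda p: min(math.dist(p, ctr) for ctr in centers))
        centers.append(list(nxt))
    return centers


def fuzzy_c_means(points, c=2, m=2.0, max_iter=100, tol=1e-6):
    """Standard FCM: alternate membership and center updates until the
    centers move by less than tol."""
    centers = farthest_point_init(points, c)
    n, dim = len(points), len(points[0])
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(max_iter):
        # Membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2 / (m - 1)).
        memberships = []
        for p in points:
            # Clip distances to avoid division by zero when a point sits on a center.
            d = [max(math.dist(p, ctr), 1e-12) for ctr in centers]
            memberships.append(
                [1.0 / sum((d[j] / d[k]) ** exponent for k in range(c)) for j in range(c)]
            )
        # Center update: mean of the points weighted by u_ij^m.
        new_centers = []
        for j in range(c):
            w = [memberships[i][j] ** m for i in range(n)]
            total = sum(w)
            new_centers.append(
                [sum(w[i] * points[i][a] for i in range(n)) / total for a in range(dim)]
            )
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships
```

Because every point contributes to every center with a nonzero weight, a distant noisy point pulls on all centers, which is the sensitivity the abstract refers to.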

Three-way Imbalanced Learning based on Fuzzy Twin SVM

Author: Cai Mingjie
Cai Wanting
Li Qingguo
Liu Qiong
Publication date: 19/05/2023

Three-way decision (3WD) is a powerful granular computing tool for dealing with uncertain data, commonly used in information systems, decision-making, and medical care. Three-way decision has been studied extensively in traditional rough set models, but it has rarely been combined with machine learning. In this paper, three-way decision is connected with SVM, a standard binary classification model in machine learning, to address imbalanced classification problems, where SVM leaves room for improvement. A new three-way fuzzy membership function and a new fuzzy twin support vector machine with three-way membership (TWFTSVM) are proposed. The new three-way fuzzy membership function is defined to increase the certainty of uncertain data in both input space and feature space, and it assigns higher fuzzy membership to minority samples than to majority samples. To evaluate the effectiveness of the proposed model, comparative experiments are designed for forty-seven different datasets with varying imbalance ratios. In addition, datasets with different imbalance ratios are derived from the same dataset to further assess the proposed model's performance. The results show that the proposed model significantly outperforms other traditional SVM-based methods.

arXiv.org e-Print Archive

Distance Metric Learning from Uncertain Side Information for Automated Photo Tagging

Author: HOI Steven C. H.
JIN Rong
WU Lei
YU Nenghai
ZHU Jianke
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/02/2011

Institutional Knowledge at Singapore Management University

Semi-supervised cross-entropy clustering with information bottleneck constraint

Author: Geiger Bernhard C.
Śmieja Marek
Publication venue: 'Elsevier BV'
Publication date: 01/01/2017

In this paper, we propose a semi-supervised clustering method, CEC-IB, that models data with a set of Gaussian distributions and that retrieves clusters based on a partial labeling provided by the user (partition-level side information). By combining the ideas from cross-entropy clustering (CEC) with those from the information bottleneck method (IB), our method trades off three conflicting goals: the accuracy with which the data set is modeled, the simplicity of the model, and the consistency of the clustering with side information. Experiments demonstrate that CEC-IB has a performance comparable to Gaussian mixture models (GMM) in a classical semi-supervised scenario, but is faster, more robust to noisy labels, automatically determines the optimal number of clusters, and performs well when not all classes are present in the side information. Moreover, in contrast to other semi-supervised models, it can be successfully applied in discovering natural subgroups if the partition-level side information is derived from the top levels of a hierarchical clustering.

arXiv.org e-Print Archive

Jagiellonian University Repository

Support vector machines to detect physiological patterns for EEG and EMG-based human-computer interaction: a review

Author: Bianchi L.
Cavrini F.
Quitadamo L.R.
Riillo F.
Saggio G.
Sbernini L.
Seri S.
Publication venue: 'IOP Publishing'
Publication date: 01/01/2017

Support vector machines (SVMs) are widely used classifiers for detecting physiological patterns in human-computer interaction (HCI). Their success is due to their versatility, robustness and the wide availability of free dedicated toolboxes. The literature frequently reports insufficient detail about the SVM implementation and/or parameter selection, making it impossible to reproduce study analyses and results. In order to perform an optimized classification and report a proper description of the results, it is necessary to have a comprehensive critical overview of the applications of SVM. The aim of this paper is to provide a review of the usage of SVM in the determination of brain and muscle patterns for HCI, by focusing on electroencephalography (EEG) and electromyography (EMG) techniques. In particular, an overview of the basic principles of SVM theory is outlined, together with a description of several relevant literature implementations. Furthermore, details concerning reviewed papers are listed in tables and statistics of SVM use in the literature are presented. Suitability of SVM for HCI is discussed and critical comparisons with other classifiers are reported.

Aston Publications Explorer

ART

UNFIS: A Novel Neuro-Fuzzy Inference System with Unstructured Fuzzy Rules for Classification

Author: Salimi-Badr Armin
Publication date: 28/10/2022

An important constraint of Fuzzy Inference Systems (FIS) is their structured rules defined based on evaluating all input variables. Indeed, the length of all fuzzy rules and the number of input variables are equal. However, in many decision-making problems evaluating some conditions on a limited set of input variables is sufficient to decide properly (unstructured rules). Therefore, this constraint limits the performance, generalization, and interpretability of the FIS. To address this issue, this paper presents a neuro-fuzzy inference system for classification applications that can select different sets of input variables for constructing each fuzzy rule. To realize this capability, a new fuzzy selector neuron with an adaptive parameter is proposed that can select input variables in the antecedent part of each fuzzy rule. Moreover, in this paper, the consequent part of the Takagi-Sugeno-Kang FIS is also changed properly to consider only the selected set of input variables. To learn the parameters of the proposed architecture, a trust-region-based learning method (General quasi-Levenberg-Marquardt (GqLM)) is proposed to minimize cross-entropy in multiclass problems. The performance of the proposed method is compared with some related previous approaches in some real-world classification problems. Based on these comparisons the proposed method has better or very close performance with a parsimonious structure consisting of unstructured fuzzy

arXiv.org e-Print Archive

Disease diagnosis in smart healthcare: Innovation, technologies and applications

Author: Alhalabi W.
Chui K. T.
Liu R. W.
Ordóñez de Pablos Patricia
Pang S. S. H.
Zhao M.
Publication date: 01/12/2017

To promote sustainable development, the smart city implies a global vision that merges artificial intelligence, big data, decision making, information and communication technology (ICT), and the internet-of-things (IoT). Population ageing is one issue to which researchers, companies and governments should devote effort by developing innovative smart healthcare technologies and applications. In this paper, the topic of disease diagnosis in smart healthcare is reviewed. Typical emerging optimization algorithms and machine learning algorithms are summarized. Evolutionary optimization, stochastic optimization and combinatorial optimization are covered. Owing to the fact that there are plenty of applications in healthcare, four applications in the field of disease diagnosis (which are also listed among the top 10 causes of global death in 2015), namely cardiovascular diseases, diabetes mellitus, Alzheimer’s disease and other forms of dementia, and tuberculosis, are considered. In addition, challenges in the deployment of disease diagnosis in healthcare are discussed.

Multidisciplinary Digital Publishing Institute

Repositorio Institucional de la Universidad de Oviedo

Directory of Open Access Journals

Spectral-spatial approaches for hyperspectral data classification

Author: Roy Sathi
University of Lethbridge. Faculty of Arts and Science
Publication venue: 'University of Central Missouri, Department of Mathematics and Computer Science'
Publication date: 01/01/2014

Classification of hyperspectral data is very challenging, and mapping of land cover is one of its applications. Classification accuracy and computation time for hyperspectral data were improved by incorporating contextual information in combination with spectral information to correct classification errors along class boundaries and within classes. In the proposed method, the original hyperspectral image was first classified using the Support Vector Machine (SVM) classifier, followed by the Markov Random Field (MRF) approach applied to the boundary areas and the Unsupervised Extraction and Classification of Homogeneous Objects (UnECHO) classifier used for the interior parts of regions to produce the final classification map. In this study two agricultural (Hyperion and AVIRIS) datasets and one urban (ROSIS) dataset were used. Investigations of the spectral and various contextual approaches, including feature reduction, show that the SVM-MRF method with grid search works best for all of the datasets. The highest overall accuracy of 97.35% was achieved for the urban dataset.
Natural Sciences and Engineering Research Council of Canada (NSERC) and the University of Lethbridge

OPUS: Open Uleth Scholarship - University of Lethbridge Research Repository

A survey on online active learning

Author: Cacciarelli Davide
Kulahci Murat
Publication date: 14/03/2023

Online active learning is a paradigm in machine learning that aims to select the most informative data points to label from a data stream. The problem of minimizing the cost associated with collecting labeled observations has gained a lot of attention in recent years, particularly in real-world applications where data is only available in an unlabeled form. Annotating each observation can be time-consuming and costly, making it difficult to obtain large amounts of labeled data. To overcome this issue, many active learning strategies have been proposed in the last decades, aiming to select the most informative observations for labeling in order to improve the performance of machine learning models. These approaches can be broadly divided into two categories: static pool-based and stream-based active learning. Pool-based active learning involves selecting a subset of observations from a closed pool of unlabeled data, and it has been the focus of many surveys and literature reviews. However, the growing availability of data streams has led to an increase in the number of approaches that focus on online active learning, which involves continuously selecting and labeling observations as they arrive in a stream. This work aims to provide an overview of the most recently proposed approaches for selecting the most informative observations from data streams in the context of online active learning. We review the various techniques that have been proposed and discuss their strengths and limitations, as well as the challenges and opportunities that exist in this area of research. Our review aims to provide a comprehensive and up-to-date overview of the field and to highlight directions for future work

arXiv.org e-Print Archive
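
The stream-based setting described in the survey abstract above can be illustrated with a minimal sketch of uncertainty sampling: each arriving observation is labeled only if the current model is unsure about it. This is one simple strategy chosen for illustration, not a method taken from the survey. The online logistic model, the `margin` threshold, the `budget` parameter, and the `oracle` callback are all hypothetical names.

```python
import math


class OnlineLogistic:
    """Tiny logistic regression trained by one gradient step per labeled sample."""

    def __init__(self, n_features, lr=0.5):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() in range
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, y):
        g = self.predict_proba(x) - y  # gradient of the log loss w.r.t. z
        self.w = [wi - self.lr * g * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * g


def stream_active_learning(stream, oracle, n_features, margin=0.2, budget=None):
    """Query the oracle only when the predicted probability is within
    `margin` of 0.5, optionally stopping queries once `budget` is spent."""
    model = OnlineLogistic(n_features)
    queried = 0
    for x in stream:
        p = model.predict_proba(x)
        uncertain = abs(p - 0.5) < margin
        if uncertain and (budget is None or queried < budget):
            model.update(x, oracle(x))  # pay the labeling cost
            queried += 1
    return model, queried
```

A fixed margin is the simplest possible query rule; it makes the trade-off between labeling cost and model quality explicit through a single threshold.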