167 research outputs found
A Novel Fuzzy c-Means Clustering Algorithm Using Adaptive Norm
The fuzzy c-means (FCM) clustering algorithm is an unsupervised learning method that has been widely applied to cluster unlabeled data automatically rather than manually, but it is sensitive to noisy observations because it does not treat noise in the data appropriately. In this paper, a novel noise-aware method based on the existing FCM approach, called adaptive-FCM, and its extended version combined with relative entropy (adaptive-REFCM) are proposed. Adaptive-FCM, relying on an inventive integration of the adaptive norm, benefits from a robust overall structure. Adaptive-REFCM further integrates the properties of the relative entropy and normalized distance to preserve the global details of the dataset. Several experiments are carried out,..
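For orientation, the baseline that adaptive-FCM builds on can be sketched as follows. This is standard fuzzy c-means with squared Euclidean distance, not the adaptive-norm or relative-entropy variants proposed in the paper, whose update rules are not given in the abstract. The function name `fuzzy_c_means`, the fuzzifier default `m=2.0`, the random initialization, and the stopping tolerance are illustrative assumptions.

```python
import random


def fuzzy_c_means(points, c=2, m=2.0, iters=100, tol=1e-6, seed=0):
    """Baseline FCM (squared Euclidean distance); NOT the adaptive-norm variant.

    Returns (centers, u) where u[i][k] is the membership of point i in cluster k.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships; each row is normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by u_ik ** m.
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )

        # Membership update: u_ik is proportional to d_ik^(-2 / (m - 1)).
        shift = 0.0
        for i in range(n):
            d2 = [sum((points[i][d] - centers[k][d]) ** 2 for d in range(dim))
                  for k in range(c)]
            if min(d2) == 0.0:
                # The point coincides with a center: give it full membership there.
                new = [1.0 if v == 0.0 else 0.0 for v in d2]
            else:
                new = [v ** (-1.0 / (m - 1.0)) for v in d2]
            total = sum(new)
            new = [v / total for v in new]
            shift = max(shift, max(abs(a - b) for a, b in zip(new, u[i])))
            u[i] = new

        if shift < tol:  # memberships have stopped changing
            break

    return centers, u
```

Because every point receives a nonzero membership in every cluster, an outlier still pulls on all centers in this baseline; that is the noise sensitivity the abstract refers to.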
Three-way Imbalanced Learning based on Fuzzy Twin SVM
Three-way decision (3WD) is a powerful tool for granular computing to deal
with uncertain data, commonly used in information systems, decision-making, and
medical care. Three-way decision has been studied extensively in traditional rough set
models. However, it has rarely been combined with the currently
popular field of machine learning. In this paper,
three-way decision is connected with SVM, a standard binary classification
model in machine learning, to address imbalanced classification problems, where
SVM needs improvement. A new three-way fuzzy membership function and a new fuzzy
twin support vector machine with three-way membership (TWFTSVM) are proposed.
The new three-way fuzzy membership function is defined to increase the
certainty of uncertain data in both input space and feature space, which
assigns higher fuzzy membership to minority samples compared with majority
samples. To evaluate the effectiveness of the proposed model, comparative
experiments are designed for forty-seven different datasets with varying
imbalance ratios. In addition, datasets with different imbalance ratios are
derived from the same dataset to further assess the proposed model's
performance. The results show that the proposed model significantly outperforms
other traditional SVM-based methods.
Semi-supervised cross-entropy clustering with information bottleneck constraint
In this paper, we propose a semi-supervised clustering method, CEC-IB, that
models data with a set of Gaussian distributions and that retrieves clusters
based on a partial labeling provided by the user (partition-level side
information). By combining the ideas from cross-entropy clustering (CEC) with
those from the information bottleneck method (IB), our method trades off
three conflicting goals: the accuracy with which the data set is modeled, the
simplicity of the model, and the consistency of the clustering with side
information. Experiments demonstrate that CEC-IB has a performance comparable
to Gaussian mixture models (GMM) in a classical semi-supervised scenario, but
is faster, more robust to noisy labels, automatically determines the optimal
number of clusters, and performs well when not all classes are present in the
side information. Moreover, in contrast to other semi-supervised models, it can
be successfully applied in discovering natural subgroups if the partition-level
side information is derived from the top levels of a hierarchical clustering.
Support vector machines to detect physiological patterns for EEG and EMG-based human-computer interaction: a review
Support vector machines (SVMs) are widely used classifiers for detecting physiological patterns in human-computer interaction (HCI). Their success is due to their versatility, robustness and the wide availability of free dedicated toolboxes. The literature frequently reports insufficient detail about the SVM implementation and/or parameter selection, making it impossible to reproduce the analysis and results of a study. In order to perform an optimized classification and report a proper description of the results, it is necessary to have a comprehensive critical overview of the applications of SVM. The aim of this paper is to provide a review of the usage of SVM in the determination of brain and muscle patterns for HCI, by focusing on electroencephalography (EEG) and electromyography (EMG) techniques. In particular, an overview of the basic principles of SVM theory is outlined, together with a description of several relevant literature implementations. Furthermore, details concerning reviewed papers are listed in tables and statistics of SVM use in the literature are presented. Suitability of SVM for HCI is discussed and critical comparisons with other classifiers are reported.
UNFIS: A Novel Neuro-Fuzzy Inference System with Unstructured Fuzzy Rules for Classification
An important constraint of Fuzzy Inference Systems (FIS) is their structured
rules defined based on evaluating all input variables. Indeed, the length of
all fuzzy rules and the number of input variables are equal. However, in many
decision-making problems evaluating some conditions on a limited set of input
variables is sufficient to decide properly (unstructured rules). Therefore,
this constraint limits the performance, generalization, and interpretability of
the FIS. To address this issue, this paper presents a neuro-fuzzy inference
system for classification applications that can select different sets of input
variables for constructing each fuzzy rule. To realize this capability, a new
fuzzy selector neuron with an adaptive parameter is proposed that can select
input variables in the antecedent part of each fuzzy rule. Moreover, in this
paper, the consequent part of the Takagi-Sugeno-Kang FIS is also changed
properly to consider only the selected set of input variables. To learn the
parameters of the proposed architecture, a trust-region-based learning method
(General quasi-Levenberg-Marquardt (GqLM)) is proposed to minimize
cross-entropy in multiclass problems. The performance of the proposed method is
compared with some related previous approaches in some real-world
classification problems. Based on these comparisons the proposed method has
better or very close performance with a parsimonious structure consisting of
unstructured fuzzy
Disease diagnosis in smart healthcare: Innovation, technologies and applications
To promote sustainable development, the smart city implies a global vision that merges artificial intelligence, big data, decision making, information and communication technology (ICT), and the internet-of-things (IoT). Population ageing is an issue to which researchers, companies and governments should devote effort by developing innovative smart healthcare technologies and applications. In this paper, the topic of disease diagnosis in smart healthcare is reviewed. Typical emerging optimization algorithms and machine learning algorithms are summarized. Evolutionary optimization, stochastic optimization and combinatorial optimization are covered. Owing to the fact that there are plenty of applications in healthcare, four applications in the field of disease diagnosis (which are also listed among the top 10 causes of global death in 2015), namely cardiovascular diseases, diabetes mellitus, Alzheimer’s disease and other forms of dementia, and tuberculosis, are considered. In addition, challenges in the deployment of disease diagnosis in healthcare are discussed.
Spectral-spatial approaches for hyperspectral data classification
Classification of hyperspectral data is very challenging, and mapping of land cover is one of its applications. Classification accuracy and computation time for hyperspectral data were improved by incorporating contextual information in combination with spectral information to correct classification errors along class boundaries and within classes. In the proposed method, the original hyperspectral image was first classified using the Support Vector Machine (SVM) classifier, followed by the Markov Random Field (MRF) approach applied to the boundary areas and the Unsupervised Extraction and Classification of Homogeneous Objects (UnECHO) classifier used for the interior parts of regions to produce the final classification map. In this study, two agricultural (Hyperion and AVIRIS) and one urban (ROSIS) datasets were used. Investigations of the spectral and various contextual approaches, including feature reduction, show that the SVM-MRF method with grid search works best for all of the datasets. The highest overall accuracy of 97.35% was achieved for the urban dataset.
Natural Sciences and Engineering Research Council of Canada (NSERC) and the University of Lethbridge
A survey on online active learning
Online active learning is a paradigm in machine learning that aims to select
the most informative data points to label from a data stream. The problem of
minimizing the cost associated with collecting labeled observations has gained
a lot of attention in recent years, particularly in real-world applications
where data is only available in an unlabeled form. Annotating each observation
can be time-consuming and costly, making it difficult to obtain large amounts
of labeled data. To overcome this issue, many active learning strategies have
been proposed in the last decades, aiming to select the most informative
observations for labeling in order to improve the performance of machine
learning models. These approaches can be broadly divided into two categories:
static pool-based and stream-based active learning. Pool-based active learning
involves selecting a subset of observations from a closed pool of unlabeled
data, and it has been the focus of many surveys and literature reviews.
However, the growing availability of data streams has led to an increase in the
number of approaches that focus on online active learning, which involves
continuously selecting and labeling observations as they arrive in a stream.
This work aims to provide an overview of the most recently proposed approaches
for selecting the most informative observations from data streams in the
context of online active learning. We review the various techniques that have
been proposed and discuss their strengths and limitations, as well as the
challenges and opportunities that exist in this area of research. Our review
aims to provide a comprehensive and up-to-date overview of the field and to
highlight directions for future work.
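As a rough illustration of the stream-based setting described above, the sketch below queries a label only when the current model is uncertain about the arriving observation. It is one simple strategy (uncertainty sampling with an online logistic model), not a method taken from the survey; the function name `stream_uncertainty_sampling`, the `oracle` callback standing in for a human annotator, and the `threshold` and `lr` defaults are all hypothetical choices made for the example.

```python
import math


def stream_uncertainty_sampling(stream, oracle, threshold=0.2, lr=0.5):
    """Process observations one at a time and query labels only when uncertain.

    stream:    iterable of feature lists, visited once in arrival order
    oracle:    callable returning the true label (0 or 1) for an observation
    threshold: query when |p - 0.5| < threshold (assumed uncertainty rule)
    Returns (weights, bias, number_of_queries).
    """
    weights, bias, queried = None, 0.0, 0
    for x in stream:
        if weights is None:
            weights = [0.0] * len(x)

        # Predicted probability of class 1 under the current logistic model.
        z = sum(w * xi for w, xi in zip(weights, x)) + bias
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        p = 1.0 / (1.0 + math.exp(-z))

        # The labeling decision is made immediately and is not revisited.
        if abs(p - 0.5) < threshold:
            y = oracle(x)
            queried += 1
            grad = p - y  # gradient of the log-loss with respect to z
            weights = [w - lr * grad * xi for w, xi in zip(weights, x)]
            bias -= lr * grad
        # Otherwise the observation is discarded without paying for a label.

    return weights, bias, queried
```

Unlike the pool-based setting, the learner here never sees the full set of unlabeled data, so each observation must be accepted or discarded on arrival.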