19,570 research outputs found
Fuzzy Neural Network Models For Multispectral Image Analysis
Fuzzy neural networks (FNNs) provide a new approach for classification of multispectral data and to extract and optimize classification rules. Neural networks deal with issues on a numeric level, whereas fuzzy logic deals with them on a semantic or linguistic level. FNNs synthesize fuzzy logic and neural networks. Recently, there has been growing interest in the research community not only to understand how FNNs arrive at particular decisions but how to decode information stored in the form of connection strengths in the network. In this paper, we propose fuzzy neural network models for classification of pixels in multispectral images and to extract fuzzy classification rules. During the training phase, the connection strengths are updated. After training, classification rules are extracted by backtracking along the weighted paths through the FNN. The extracted rules are then optimized using a fuzzy associative memory (FAM) bank. The data mining system described above is useful in many practical applications such as mapping, monitoring and managing our planet’s resources and health, climate change impacts and assessments, environmental change detection and military reconnaissance
HYEI: A New Hybrid Evolutionary Imperialist Competitive Algorithm for Fuzzy Knowledge Discovery
In recent years, imperialist competitive algorithm (ICA), genetic algorithm (GA), and hybrid fuzzy classification systems have been successfully and effectively employed for classification tasks of data mining. Due to overcoming the gaps related to ineffectiveness of current algorithms for analysing high-dimension independent datasets, a new hybrid approach, named HYEI, is presented to discover generic rule-based systems in this paper. This proposed approach consists of three stages and combines an evolutionary-based fuzzy system with two ICA procedures to generate high-quality fuzzy-classification rules. Initially, the best feature subset is selected by using the embedded ICA feature selection, and then these features are used to generate basic fuzzy-classification rules. Finally, all rules are optimized by using an ICA algorithm to reduce their length or to eliminate some of them. The performance of HYEI has been evaluated by using several benchmark datasets from the UCI machine learning repository. The classification accuracy attained by the proposed algorithm has the highest classification accuracy in 6 out of the 7 dataset problems and is comparative to the classification accuracy of the 5 other test problems, as compared to the best results previously published
Classifying sequences by the optimized dissimilarity space embedding approach: a case study on the solubility analysis of the E. coli proteome
We evaluate a version of the recently-proposed classification system named
Optimized Dissimilarity Space Embedding (ODSE) that operates in the input space
of sequences of generic objects. The ODSE system has been originally presented
as a classification system for patterns represented as labeled graphs. However,
since ODSE is founded on the dissimilarity space representation of the input
data, the classifier can be easily adapted to any input domain where it is
possible to define a meaningful dissimilarity measure. Here we demonstrate the
effectiveness of the ODSE classifier for sequences by considering an
application dealing with the recognition of the solubility degree of the
Escherichia coli proteome. Solubility, or analogously aggregation propensity,
is an important property of protein molecules, which is intimately related to
the mechanisms underlying the chemico-physical process of folding. Each protein
of our dataset is initially associated with a solubility degree and it is
represented as a sequence of symbols, denoting the 20 amino acid residues. The
herein obtained computational results, which we stress that have been achieved
with no context-dependent tuning of the ODSE system, confirm the validity and
generality of the ODSE-based approach for structured data classification.Comment: 10 pages, 49 reference
A survey on utilization of data mining approaches for dermatological (skin) diseases prediction
Due to recent technology advances, large volumes of medical data is obtained. These data contain valuable information. Therefore data mining techniques can be used to extract useful patterns. This paper is intended to introduce data mining and its various techniques and a survey of the available literature on medical data mining. We emphasize mainly on the application of data mining on skin diseases. A categorization has been provided based on the different data mining techniques. The utility of the various data mining methodologies is highlighted. Generally association mining is suitable for extracting rules. It has been used especially in cancer diagnosis. Classification is a robust method in medical mining. In this paper, we have summarized the different uses of classification in dermatology. It is one of the most important methods for diagnosis of erythemato-squamous diseases. There are different methods like Neural Networks, Genetic Algorithms and fuzzy classifiaction in this topic. Clustering is a useful method in medical images mining. The purpose of clustering techniques is to find a structure for the given data by finding similarities between data according to data characteristics. Clustering has some applications in dermatology. Besides introducing different mining methods, we have investigated some challenges which exist in mining skin data
A Survey on Soft Subspace Clustering
Subspace clustering (SC) is a promising clustering technology to identify
clusters based on their associations with subspaces in high dimensional spaces.
SC can be classified into hard subspace clustering (HSC) and soft subspace
clustering (SSC). While HSC algorithms have been extensively studied and well
accepted by the scientific community, SSC algorithms are relatively new but
gaining more attention in recent years due to better adaptability. In the
paper, a comprehensive survey on existing SSC algorithms and the recent
development are presented. The SSC algorithms are classified systematically
into three main categories, namely, conventional SSC (CSSC), independent SSC
(ISSC) and extended SSC (XSSC). The characteristics of these algorithms are
highlighted and the potential future development of SSC is also discussed.Comment: This paper has been published in Information Sciences Journal in 201
QCBA: Postoptimization of Quantitative Attributes in Classifiers based on Association Rules
The need to prediscretize numeric attributes before they can be used in
association rule learning is a source of inefficiencies in the resulting
classifier. This paper describes several new rule tuning steps aiming to
recover information lost in the discretization of numeric (quantitative)
attributes, and a new rule pruning strategy, which further reduces the size of
the classification models. We demonstrate the effectiveness of the proposed
methods on postoptimization of models generated by three state-of-the-art
association rule classification algorithms: Classification based on
Associations (Liu, 1998), Interpretable Decision Sets (Lakkaraju et al, 2016),
and Scalable Bayesian Rule Lists (Yang, 2017). Benchmarks on 22 datasets from
the UCI repository show that the postoptimized models are consistently smaller
-- typically by about 50% -- and have better classification performance on most
datasets
Comparison of different classification algorithms for fault detection and fault isolation in complex systems
Due to the lack of sufficient results seen in literature, feature extraction and classification methods of hydraulic systems appears to be somewhat challenging. This paper compares the performance of three classifiers (namely linear support vector machine (SVM), distance-weighted k-nearest neighbor (WKNN), and decision tree (DT) using data from optimized and non-optimized sensor set solutions. The algorithms are trained with known data and then tested with unknown data for different scenarios characterizing faults with different degrees of severity. This investigation is based solely on a data-driven approach and relies on data sets that are taken from experiments on the fuel system. The system that is used throughout this study is a typical fuel delivery system consisting of standard components such as a filter, pump, valve, nozzle, pipes, and two tanks. Running representative tests on a fuel system are problematic because of the time, cost, and reproduction constraints involved in capturing any significant degradation. Simulating significant degradation requires running over a considerable period; this cannot be reproduced quickly and is costly
- …