8,197 research outputs found
Quality, Frequency and Similarity Based Fuzzy Nearest Neighbor Classification
This paper proposes an approach based on fuzzy rough set theory to improve nearest neighbor based classification. Six measures are introduced to evaluate the quality of the nearest neighbors. This quality is combined with the frequency at which classes occur among the nearest neighbors and the similarity w.r.t. the nearest neighbor, to decide which class to pick among the neighbor's classes. The importance of each aspect is weighted using optimized weights. An experimental study shows that our method, Quality, Frequency and Similarity based Fuzzy Nearest Neighbor (QFSNN), outperforms state-of-the-art nearest neighbor classifiers
A survey on utilization of data mining approaches for dermatological (skin) diseases prediction
Due to recent technological advances, large volumes of medical data are being collected. These data contain valuable information, so data mining techniques can be used to extract useful patterns. This paper introduces data mining and its various techniques and surveys the available literature on medical data mining. We focus mainly on the application of data mining to skin diseases. A categorization is provided based on the different data mining techniques, and the utility of the various methodologies is highlighted. Generally, association mining is suitable for extracting rules and has been used especially in cancer diagnosis. Classification is a robust method in medical mining. In this paper, we summarize the different uses of classification in dermatology. It is one of the most important methods for the diagnosis of erythemato-squamous diseases. Methods applied in this area include neural networks, genetic algorithms and fuzzy classification. Clustering is a useful method in medical image mining. The purpose of clustering techniques is to find structure in the given data by finding similarities between data points according to their characteristics. Clustering has some applications in dermatology. Besides introducing different mining methods, we investigate some challenges that exist in mining skin data.
Adaptive kNN using Expected Accuracy for Classification of Geo-Spatial Data
The k-Nearest Neighbor (kNN) classification approach is conceptually simple -
yet widely applied since it often performs well in practical applications.
However, using a global constant k does not always provide an optimal solution,
e.g., for datasets with an irregular density distribution of data points. This
paper proposes an adaptive kNN classifier where k is chosen dynamically for
each instance (point) to be classified, such that the expected accuracy of
classification is maximized. We define the expected accuracy as the accuracy of
a set of structurally similar observations. An arbitrary similarity function
can be used to find these observations. We introduce and evaluate different
similarity functions. For the evaluation, we use five different classification
tasks based on geo-spatial data. Each classification task consists of (tens of)
thousands of items. We demonstrate that the presented expected accuracy
measures can be a good estimator for kNN performance, and the proposed adaptive
kNN classifier outperforms common kNN and previously introduced adaptive kNN
algorithms. Also, we show that the range of considered k can be significantly
reduced to speed up the algorithm without negative influence on classification
accuracy
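The idea of choosing k per instance can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it assumes that the "structurally similar observations" are simply the `m` training points closest to the query in Euclidean distance, and that expected accuracy is the leave-one-out kNN accuracy on those points. The names `knn_predict`, `adaptive_knn_predict`, `k_values` and `m` are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, query, k, skip=None):
    """Majority vote among the k training points closest to `query`.

    `train` is a list of (point, label) pairs; `skip` optionally excludes
    one training index, which gives a leave-one-out prediction.
    """
    candidates = [i for i in range(len(train)) if i != skip]
    candidates.sort(key=lambda i: math.dist(train[i][0], query))
    votes = Counter(train[i][1] for i in candidates[:k])
    return votes.most_common(1)[0][0]


def adaptive_knn_predict(train, query, k_values=(1, 3, 5), m=5):
    """Pick the k with the best local leave-one-out accuracy, then classify.

    Assumption: the observations "similar" to the query are its m nearest
    training points; the abstract allows an arbitrary similarity function.
    """
    order = sorted(range(len(train)),
                   key=lambda i: math.dist(train[i][0], query))
    similar = order[:m]
    best_k, best_acc = k_values[0], -1.0
    for k in k_values:
        hits = sum(
            knn_predict(train, train[i][0], k, skip=i) == train[i][1]
            for i in similar
        )
        acc = hits / len(similar)
        if acc > best_acc:  # ties keep the smaller, earlier k
            best_k, best_acc = k, acc
    return knn_predict(train, query, best_k), best_k
```

Restricting `k_values` to a short range mirrors the abstract's remark that the set of considered k can be reduced to speed up the algorithm.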
Towards the text compression based feature extraction in high impedance fault detection
High impedance faults of medium voltage overhead lines with covered conductors can be identified by the presence of partial discharges. Although it has been a subject of research for more than 60 years, online partial discharge detection remains a challenge, especially in environments with heavy background noise. In this paper, a new approach for partial discharge pattern recognition is presented. All results were obtained on data acquired from a real 22 kV medium voltage overhead power line with covered conductors. The proposed method is based on a text compression algorithm and serves as a signal similarity estimate, applied for the first time to partial discharge patterns. Its relevance is examined with three different variations of a classification model. The improvement gained on an already deployed model proves its quality. Web of Science, 1211, art. no. 214
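One common way to turn a text compressor into a signal similarity estimate is the normalized compression distance (NCD). The abstract does not name the compression algorithm or the distance it uses, so the sketch below is an assumption: it quantizes a signal into byte symbols and compares two symbol strings with `zlib`. The helper names `quantize`, `compressed_size` and `ncd` are hypothetical.

```python
import zlib


def quantize(signal, levels=16):
    """Map a sequence of floats to byte symbols by uniform binning."""
    lo, hi = min(signal), max(signal)
    span = (hi - lo) or 1.0  # avoid division by zero for constant signals
    return bytes(
        min(levels - 1, int((v - lo) / span * levels)) for v in signal
    )


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed representation."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: small for similar inputs."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)
```

Under these assumptions, distances such as `ncd(quantize(a), quantize(b))` could serve as features for a classifier; which classifier and which features the paper actually uses is not stated in the abstract.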
Observer-biased bearing condition monitoring: from fault detection to multi-fault classification
Bearings are simultaneously a fundamental component and one of the principal causes of failure in rotary machinery. The work focuses on the employment of fuzzy clustering for bearing condition monitoring, i.e., fault detection and classification. The output of a clustering algorithm is a data partition (a set of clusters), which is merely a hypothesis on the structure of the data. This hypothesis requires validation by domain experts. In general, clustering algorithms allow only limited use of domain knowledge in the cluster formation process. In this study, a novel method allowing for interactive clustering in bearing fault diagnosis is proposed. The method resorts to shrinkage to generalize an otherwise unbiased clustering algorithm into a biased one. In this way, the method provides a natural and intuitive way to control the cluster formation process, allowing domain knowledge to be used to guide it. The domain expert can select a desirable level of granularity, ranging from fault detection to classification of a variable number of faults, and can select a specific region of the feature space for detailed analysis. Moreover, experimental results under realistic conditions show that the adopted algorithm outperforms the corresponding unbiased algorithm (fuzzy c-means), which is widely used for this type of problem. (C) 2016 Elsevier Ltd. All rights reserved. Grant number: 145602
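For orientation, the unbiased baseline named in the abstract, fuzzy c-means, alternates between a membership update and a center update. The sketch below shows only that standard baseline, with fuzzifier `m`; it does not implement the shrinkage-based, observer-biased method the paper proposes. The function names and the fixed iteration count are assumptions made for illustration.

```python
import math


def memberships(points, centers, m=2.0):
    """Standard fuzzy c-means membership update; each row sums to 1."""
    u = []
    for p in points:
        d = [math.dist(p, c) for c in centers]
        if 0.0 in d:
            # A point lying on a center belongs fully to that center.
            row = [1.0 if di == 0.0 else 0.0 for di in d]
        else:
            row = [di ** (-2.0 / (m - 1.0)) for di in d]
        total = sum(row)
        u.append([v / total for v in row])
    return u


def update_centers(points, u, m=2.0):
    """Each center is the mean of all points weighted by membership**m."""
    dim = len(points[0])
    centers = []
    for j in range(len(u[0])):
        w = [u[i][j] ** m for i in range(len(points))]
        total = sum(w)
        centers.append(tuple(
            sum(w[i] * points[i][k] for i in range(len(points))) / total
            for k in range(dim)
        ))
    return centers


def fuzzy_c_means(points, centers, m=2.0, iterations=20):
    """Run a fixed number of alternating updates from initial centers."""
    for _ in range(iterations):
        u = memberships(points, centers, m)
        centers = update_centers(points, u, m)
    return centers, memberships(points, centers, m)
```

The membership matrix returned here is the "hypothesis on the structure of the data" that, per the abstract, a domain expert still has to validate.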
Mahalanobis Distance for Class Averaging of Cryo-EM Images
Single particle reconstruction (SPR) from cryo-electron microscopy (EM) is a
technique in which the 3D structure of a molecule needs to be determined from
its contrast transfer function (CTF) affected, noisy 2D projection images taken
at unknown viewing directions. One of the main challenges in cryo-EM is the
typically low signal to noise ratio (SNR) of the acquired images. 2D
classification of images, followed by class averaging, improves the SNR of the
resulting averages, and is used for selecting particles from micrographs and
for inspecting the particle images. We introduce a new affinity measure, akin
to the Mahalanobis distance, to compare cryo-EM images belonging to different
defocus groups. The new similarity measure is employed to detect similar
images, thereby leading to an improved algorithm for class averaging. We
evaluate the performance of the proposed class averaging procedure on synthetic
datasets, obtaining state of the art classification.
Comment: Final version accepted to the 14th IEEE International Symposium on
Biomedical Imaging (ISBI 2017)
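As background for the affinity measure mentioned above, the classical Mahalanobis distance between vectors x and y under covariance Σ is sqrt((x - y)^T Σ^{-1} (x - y)). The sketch below evaluates it for two-dimensional vectors only, using the closed-form inverse of a 2x2 matrix. It illustrates the classical distance, not the paper's cryo-EM measure, and the function name `mahalanobis_2d` is hypothetical.

```python
import math


def mahalanobis_2d(x, y, cov):
    """Classical Mahalanobis distance for 2-D vectors.

    `cov` is a 2x2 covariance matrix given as ((a, b), (c, d)).
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = x[0] - y[0], x[1] - y[1]
    # Inverse of [[a, b], [c, d]] is (1 / det) * [[d, -b], [-c, a]].
    quad = (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det
    return math.sqrt(quad)
```

With the identity matrix as `cov` the result reduces to the Euclidean distance; a larger variance along one axis shrinks distances measured along that axis.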
Multivariate Approaches to Classification in Extragalactic Astronomy
Clustering objects into synthetic groups is a natural activity of any
science. Astrophysics is not an exception and is now facing a deluge of data.
For galaxies, the one-century old Hubble classification and the Hubble tuning
fork are still largely in use, together with numerous mono- or bivariate
classifications most often made by eye. However, a classification must be
driven by the data, and sophisticated multivariate statistical tools are used
more and more often. In this paper we review these different approaches in
order to situate them in the general context of unsupervised and supervised
learning. We insist on the astrophysical outcomes of these studies to show that
multivariate analyses provide an obvious path toward a renewal of our
classification of galaxies and are invaluable tools to investigate the physics
and evolution of galaxies.
Comment: Open Access paper.
http://www.frontiersin.org/milky_way_and_galaxies/10.3389/fspas.2015.00003/abstract
<10.3389/fspas.2015.00003>
Shared Nearest-Neighbor Quantum Game-Based Attribute Reduction with Hierarchical Coevolutionary Spark and Its Application in Consistent Segmentation of Neonatal Cerebral Cortical Surfaces
© 2012 IEEE. The unprecedented increase in data volume has become a severe challenge for conventional patterns of data mining and learning systems tasked with handling big data. The recently introduced Spark platform is a new processing method for big data analysis and related learning systems, which has attracted increasing attention from both the scientific community and industry. In this paper, we propose a shared nearest-neighbor quantum game-based attribute reduction (SNNQGAR) algorithm that incorporates the hierarchical coevolutionary Spark model. We first present a shared coevolutionary nearest-neighbor hierarchy with self-evolving compensation that considers the features of nearest-neighborhood attribute subsets and calculates the similarity between attribute subsets according to the shared neighbor information of attribute sample points. We then present a novel attribute weight tensor model to generate ranking vectors of attributes and apply them to balance the relative contributions of different neighborhood attribute subsets. To optimize the model, we propose an embedded quantum equilibrium game paradigm (QEGP) to ensure that noisy attributes do not degrade the big data reduction results. A combination of the hierarchical coevolutionary Spark model and an improved MapReduce framework is then constructed so that it can better parallelize the SNNQGAR to efficiently determine the preferred reduction solutions of the distributed attribute subsets. The experimental comparisons demonstrate the superior performance of the SNNQGAR, which outperforms most of the state-of-the-art attribute reduction algorithms. Moreover, the results indicate that the SNNQGAR can be successfully applied to segment overlapping and interdependent fuzzy cerebral tissues, and it exhibits a stable and consistent segmentation performance for neonatal cerebral cortical surfaces.
Development of Signal Segmentation Technique and Improved Fuzzy K Nearest Centroid Neighbor (Ifkncn) Classifier for Audio Identification System
abstract is not available