
    A Fuzzy Clustering Algorithm for High Dimensional Streaming Data

    In this paper we propose a dimension-reduced weighted fuzzy clustering algorithm (sWFCM-HD) for high-dimensional datasets with streaming behavior. Such datasets arise in sensor networks, web click streams, and internet traffic flows. These data have two properties that separate them from other datasets: (a) they arrive as a stream and (b) they are high-dimensional. Optimized fuzzy clustering algorithms have already been proposed for datasets that are either streaming or high-dimensional. To our knowledge, however, no optimized fuzzy clustering algorithm has been proposed for datasets that have both properties, i.e., high dimension and continuously arriving data. Experimental analysis shows that the proposed algorithm (sWFCM-HD) improves performance in terms of both memory consumption and execution time.
    Keywords: K-Means, Fuzzy C-Means, Weighted Fuzzy C-Means, Dimension Reduction, Clustering

    Fuzzy C-ordered medoids clustering of interval-valued data

    Fuzzy clustering for interval-valued data helps to find natural vague boundaries in such data. The Fuzzy c-Medoids Clustering (FcMdC) method is one of the most popular clustering methods based on a partitioning-around-medoids approach. However, one of the greatest disadvantages of this method is its sensitivity to outliers in the data. This paper introduces a new robust fuzzy clustering method named Fuzzy c-Ordered-Medoids clustering for interval-valued data (FcOMdC-ID). Huber's M-estimators and Yager's Ordered Weighted Averaging (OWA) operators are used in the proposed method to make it robust to outliers. The described algorithm is compared with the fuzzy c-medoids method in experiments on synthetic data with different types of outliers. A real application of FcOMdC-ID is also provided.
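
    As background for the OWA operators mentioned above, here is a minimal sketch of a generic Ordered Weighted Averaging aggregation. It is not the FcOMdC-ID procedure itself; the function name `owa`, the descending sort order, and the example weights are illustrative assumptions. The key idea is that weights attach to rank positions rather than to particular inputs, so a weight of zero on the top rank discounts the largest value (for example, an outlying distance).

```python
from typing import Sequence


def owa(values: Sequence[float], weights: Sequence[float]) -> float:
    """Generic OWA aggregation (illustrative sketch, not the paper's code).

    Values are sorted in descending order, then combined with weights
    that belong to rank positions: weights[0] multiplies the largest value.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


# Hypothetical distances with one outlier (9.0).
distances = [0.2, 0.3, 9.0]
robust = owa(distances, [0.0, 0.5, 0.5])   # zero weight on the largest value
largest = owa(distances, [1.0, 0.0, 0.0])  # all weight on the top rank gives the maximum
```

    With equal weights the operator reduces to the arithmetic mean, which is why the choice of weight vector controls how strongly extreme values are discounted.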

    Weighted-covariance factor fuzzy C-means clustering

    In this paper, we propose a factor-weighted fuzzy c-means clustering algorithm. The weighting factor is based on the inverse of a covariance factor, which assesses the collinearity between the centers and the samples, and it also takes into account the compactness of the samples within clusters. The proposed algorithm can classify both spherical and non-spherical cluster structures, unlike the classical fuzzy c-means algorithm, which is suited only to spherical clusters. Compared with other algorithms designed for non-spherical clusters, such as the Gustafson-Kessel, Gath-Geva, or adaptive Mahalanobis distance-based fuzzy c-means algorithms, the proposed algorithm gives better numerical results on well-known artificial and real datasets. Moreover, the algorithm can be used for high-dimensional data, unlike other algorithms that require computing determinants of large matrices. An application to mid-infrared spectra acquired on maize roots and aerial parts of Miscanthus, for the classification of vegetal biomass, shows that the algorithm can be applied successfully to high-dimensional data.

    Comments on "Iteratively Re-weighted Algorithm for Fuzzy c-Means"

    In this comment, we present a simple alternative derivation of the IRW-FCM algorithm presented in "Iteratively Re-weighted Algorithm for Fuzzy c-Means" for the Fuzzy c-Means problem. We show that the iterative steps derived for the IRW-FCM algorithm are nothing but steps of the popular Majorization Minimization (MM) algorithm. The derivation presented in this note is simpler and more straightforward and, unlike the derivation of IRW-FCM, does not introduce any auxiliary variable. Moreover, by showing that the steps of IRW-FCM are those of the MM algorithm, the inner loop of IRW-FCM can be eliminated and the algorithm can effectively be run as a "single loop" algorithm. More precisely, the new MM-based derivation shows that a single inner loop of IRW-FCM is sufficient to decrease the Fuzzy c-Means objective function, which speeds up the IRW-FCM algorithm.

    Circular Pythagorean fuzzy sets and applications to multi-criteria decision making

    In this paper, we introduce the concept of circular Pythagorean fuzzy set (value) (C-PFS(V)) as a new generalization of both circular intuitionistic fuzzy sets (C-IFSs) proposed by Atanassov and Pythagorean fuzzy sets (PFSs) proposed by Yager. A circular Pythagorean fuzzy set is represented by a circle that represents the membership degree and the non-membership degree and whose center consists of non-negative real numbers $\mu$ and $\nu$ with the condition $\mu^2+\nu^2\leq 1$. A C-PFS models the fuzziness of uncertain information more properly thanks to its structure, which allows the information to be modelled with the points of a circle of a certain center and radius. Therefore, a C-PFS lets decision makers evaluate objects in a larger and more flexible region, and thus more sensitive decisions can be made. After defining the concept of C-PFS, we define some fundamental set operations between C-PFSs and propose some algebraic operations between C-PFVs via general $t$-norms and $t$-conorms. Using these algebraic operations, we introduce some weighted aggregation operators to transform input values represented by C-PFVs into a single output value. Then, to determine the degree of similarity between C-PFVs, we define a cosine similarity measure based on radius. Furthermore, we develop a method to transform a collection of Pythagorean fuzzy values into a PFS. Finally, a method is given to solve multi-criteria decision making problems in a circular Pythagorean fuzzy environment, and the proposed method is applied to a problem from the literature about selecting the best photovoltaic cell. We also study the comparison analysis and time complexity of the proposed method.

    Data mining using intelligent systems: an optimized weighted fuzzy decision tree approach

    Data mining aims to analyze observational datasets to find relationships and to present the data in ways that are both understandable and useful. In this thesis, existing intelligent systems techniques such as the Self-Organizing Map, Fuzzy C-Means and decision trees are used to analyze several datasets. The techniques provide flexible information processing capability for handling real-life situations. This thesis is concerned with the design, implementation, testing and application of these techniques to those datasets. The thesis also introduces a hybrid intelligent systems technique, the Optimized Weighted Fuzzy Decision Tree (OWFDT), with the aim of improving Fuzzy Decision Trees (FDT) and solving practical problems. The thesis first proposes an optimized weighted fuzzy decision tree that uses Fuzzy C-Means to fuzzify the input instances while keeping the expected labels crisp. This leads to a different output layer activation function and weight connection in the neural network (NN) structure obtained by mapping the FDT to the NN. A momentum term is also introduced into the learning process to train the weight connections while avoiding oscillation or divergence. A new reasoning mechanism is also proposed to combine the constructed tree with the weights optimized in the learning process. The thesis also compares the OWFDT with two benchmark algorithms, Fuzzy ID3 and weighted FDT. Six datasets ranging from material science to medical and civil engineering are introduced as case study applications. These datasets involve classification of composite material failure mechanisms, classification of electrocorticography (ECoG)/electroencephalogram (EEG) signals, eye bacteria prediction and wave overtopping prediction.
    Different intelligent systems techniques were used to cluster the patterns and predict the classes, although OWFDT was used to design classifiers for all the datasets. In the material dataset, the Self-Organizing Map and Fuzzy C-Means were used to cluster the acoustic event signals and assign those events to different failure mechanisms; after this step, OWFDT was introduced to design a classifier for the acoustic event signals. For the eye bacteria dataset, the bagging technique was used to improve the classification accuracy of Multilayer Perceptrons and Decision Trees. Applying bootstrap aggregating (bagging) to Decision Trees also helped to select the most important sensors (features) so that the dimension of the data could be reduced. The most important features were then used to grow the OWFDT, so that the curse of dimensionality could be overcome. The last dataset, which is concerned with wave overtopping, was used to benchmark OWFDT against other intelligent systems techniques, such as the Adaptive Neuro-Fuzzy Inference System (ANFIS), Evolving Fuzzy Neural Network (EFuNN), Genetic Neural Mathematical Method (GNMM) and Fuzzy ARTMAP. The analysis of these datasets shows that patterns and classes can be found, or data classified, by combining these techniques. OWFDT also demonstrated its efficiency and effectiveness compared with a conventional fuzzy Decision Tree and a weighted fuzzy Decision Tree.

    Cluster Analysis Based on Bipartite Network

    Clustering data has a wide range of applications and has attracted considerable attention in data mining and artificial intelligence. However, it is difficult to find a set of clusters that best fits the natural partitions without any class information. In this paper, a method for detecting the optimal cluster number is proposed. The method obtains the optimal cluster number while partitioning the data into clusters with the FCM (Fuzzy c-means) algorithm. This overcomes a drawback of FCM, which requires the cluster number c to be defined in advance. The method works by converting the fuzzy clustering result into a weighted bipartite network; the optimal cluster number is then detected using the improved bipartite modularity. Experimental results on artificial and real datasets show the validity of the proposed method.
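
    For context on the drawback noted above, the following is a minimal sketch of the textbook FCM iteration, in which the cluster number `c` must be supplied by the caller. It does not implement the bipartite-network method; the function names, the random membership initialisation, and the defaults (fuzzifier `m=2.0`, tolerance, seed) are illustrative assumptions.

```python
import random
from typing import List, Sequence, Tuple


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fuzzy_c_means(
    points: Sequence[Sequence[float]],
    c: int,                 # number of clusters: must be chosen in advance
    m: float = 2.0,         # fuzzifier, m > 1
    n_iter: int = 100,
    tol: float = 1e-6,
    seed: int = 0,
) -> Tuple[List[List[float]], List[List[float]]]:
    """Textbook FCM sketch: returns (memberships U, cluster centers)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships; each row is normalised to sum to 1.
    U = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        U.append([u / total for u in row])

    centers: List[List[float]] = []
    for _ in range(n_iter):
        # Center update: mean of the points weighted by u_ik ** m.
        centers = []
        for k in range(c):
            w = [U[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )

        # Membership update: u_ik proportional to d_ik ** (-2 / (m - 1)).
        shift = 0.0
        for i in range(n):
            inv = [
                max(sq_dist(points[i], centers[k]), 1e-12) ** (-1.0 / (m - 1.0))
                for k in range(c)
            ]
            total = sum(inv)
            new_row = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, U[i])))
            U[i] = new_row

        if shift < tol:  # memberships have stopped changing
            break
    return U, centers


# Two hypothetical, well-separated groups of 2-D points.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
U, centers = fuzzy_c_means(data, c=2)
labels = [row.index(max(row)) for row in U]  # hard labels from fuzzy memberships
```

    Running the sketch with different values of `c` yields different partitions of the same data, which is the situation a cluster-number selection criterion is meant to resolve.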