
    Classification of infectious diseases via hybrid k-means clustering technique

    Identifying groups of objects that are similar to each other but different from objects in other groups can be intellectually satisfying, profitable, or sometimes both. K-means clustering is one of the best-known partitioning algorithms, but the basic K-means method is insufficient for extracting meaningful information, and its output is very sensitive to the initial positions of the cluster centers. In this paper, data on infectious diseases are analyzed with a hybrid K-means clustering technique. The method preprocesses the dataset before it is used in the K-means clustering problem; specifically, K-means clustering is performed on the preprocessed dataset instead of the raw dataset, which removes the impact of irrelevant features and supports the selection of good initial centers. The experimental results revealed that all three water-related diseases are grouped together in one cluster for both the KGHK and FMCK data sets, and that they show a high prevalence compared with the airborne-particle-related diseases in the other group. The study concludes that the K-means clustering method provides a suitable tool for assessing the level of infectious diseases.
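
    A minimal sketch of the general "preprocess, then cluster" pipeline described above, using scikit-learn; the concrete preprocessing steps (dropping near-constant features and scaling) and the use of k-means++ seeding as a stand-in for selecting good initial centers are assumptions for illustration, not the paper's hybrid technique.

    # Illustrative sketch: preprocess a dataset before K-means so that
    # irrelevant features and poor initial centers have less impact.
    # The preprocessing choices and k-means++ seeding are assumptions,
    # not the hybrid method from the paper.
    import numpy as np
    from sklearn.feature_selection import VarianceThreshold
    from sklearn.preprocessing import StandardScaler
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    X_raw = rng.normal(size=(200, 6))      # placeholder data standing in for disease counts
    X_raw[:, 5] = 0.0                      # an uninformative, constant feature

    # Step 1: preprocessing -- drop near-constant features, then scale the rest.
    X = VarianceThreshold(threshold=1e-3).fit_transform(X_raw)
    X = StandardScaler().fit_transform(X)

    # Step 2: K-means on the preprocessed data; k-means++ stands in for the
    # paper's selection of good initial centers.
    labels = KMeans(n_clusters=2, init="k-means++", n_init=10, random_state=0).fit_predict(X)
    print(np.bincount(labels))             # cluster sizes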

    A Survey on Soft Subspace Clustering

    Subspace clustering (SC) is a promising clustering technology that identifies clusters based on their associations with subspaces in high-dimensional spaces. SC can be classified into hard subspace clustering (HSC) and soft subspace clustering (SSC). While HSC algorithms have been extensively studied and well accepted by the scientific community, SSC algorithms are relatively new but have been gaining attention in recent years due to their better adaptability. In this paper, a comprehensive survey of existing SSC algorithms and recent developments is presented. The SSC algorithms are classified systematically into three main categories, namely conventional SSC (CSSC), independent SSC (ISSC), and extended SSC (XSSC). The characteristics of these algorithms are highlighted, and the potential future development of SSC is also discussed. Comment: This paper has been published in Information Sciences Journal in 201
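
    As background for the HSC/SSC distinction surveyed above, the toy sketch below shows the core mechanism SSC methods share: each cluster carries its own soft feature weights, and distances are evaluated in that weighted subspace. It is a generic illustration, not any particular algorithm from the survey; the uniform weight matrix, the exponent beta, and the random data are all assumptions.

    # Toy illustration of soft subspace clustering's core idea: per-cluster
    # soft feature weights reshape the distance each cluster uses.
    # Not an algorithm from the survey; the weights here are fixed and uniform,
    # whereas SSC methods learn them.
    import numpy as np

    def weighted_assign(X, centers, weights, beta=2.0):
        # dist[i, c] = sum_j weights[c, j]**beta * (X[i, j] - centers[c, j])**2
        diff = X[:, None, :] - centers[None, :, :]
        dist = np.einsum("icj,cj->ic", diff ** 2, weights ** beta)
        return dist.argmin(axis=1)

    rng = np.random.default_rng(1)
    X = rng.normal(size=(100, 4))
    centers = X[rng.choice(len(X), size=2, replace=False)]
    weights = np.full((2, 4), 0.25)        # uniform soft weights for both clusters
    print(weighted_assign(X, centers, weights)[:10])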

    Computing Adaptive Feature Weights with PSO to Improve Android Malware Detection

    © 2017 Yanping Xu et al. Android malware detection is a complex and crucial issue. In this paper, we propose a malware detection model using a support vector machine (SVM) based on feature weights that are computed by information gain (IG) and particle swarm optimization (PSO) algorithms. The IG weights are evaluated from the relevance between features and class labels, and the PSO weights are adaptively calculated to achieve the best fitness (the performance of the SVM classification model). Moreover, to overcome the defects of basic PSO, we propose a new adaptive inertia weight method, called fitness-based and chaotic adaptive inertia weight PSO (FCAIW-PSO), which improves on basic PSO and is based on the fitness and a chaotic term. The goal is to assign suitable weights to the features so as to ensure the best Android malware detection performance. The experimental results indicate that both the IG weights and the PSO weights improve the performance of the SVM, and that the PSO weights perform better than the IG weights.
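
    A rough sketch of the information-gain side of this approach, assuming scikit-learn's mutual-information estimator as a stand-in for IG and synthetic data in place of Android permission/API features; the PSO and FCAIW-PSO weighting described in the paper is not reproduced here.

    # Sketch: score features by mutual information with the labels, use the
    # scores as weights on the inputs, and train an SVM on the weighted features.
    # The data and the mutual-information stand-in for IG are assumptions.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.feature_selection import mutual_info_classif
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    # Placeholder data standing in for Android malware features.
    X, y = make_classification(n_samples=400, n_features=20, n_informative=6, random_state=0)

    ig = mutual_info_classif(X, y, random_state=0)   # information-gain-style relevance scores
    w = ig / ig.sum()                                # normalize to feature weights
    X_weighted = X * w                               # broadcast weights over columns

    print("plain SVM  :", cross_val_score(SVC(), X, y, cv=5).mean().round(3))
    print("IG-weighted:", cross_val_score(SVC(), X_weighted, y, cv=5).mean().round(3))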

    Clustering of Steel Strip Sectional Profiles Based on Robust Adaptive Fuzzy Clustering Algorithm

    In this paper, intelligent techniques are applied to improve the precision of quality control in steel strip cold rolling production. First, a new control scheme is proposed, in which establishing a classifier of steel strip cross-sectional profiles is the core of the system; a fuzzy clustering algorithm is used to build this classifier. Second, a novel fuzzy clustering algorithm is proposed and applied in the real application. Compared with the results obtained by the conventional fuzzy clustering algorithm, the new algorithm is robust and efficient: it not only yields better clustering prototypes, which are used as the classifier, but also detects outliers easily and effectively, which greatly helps to improve the performance of the new system. Finally, it is pointed out that the new algorithm's efficiency is mainly due to the introduction of a set of adaptive operators that account for the different influences of data objects on the clustering operations; in nature, the new fuzzy algorithm is a generalized version of the existing fuzzy clustering algorithm.
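
    A toy sketch of the general mechanism of down-weighting outliers in fuzzy c-means through per-point adaptive weights; the exponential weighting rule and the synthetic data are assumptions, and the paper's actual adaptive operators are not reproduced.

    # Toy fuzzy c-means with a per-point robustness weight that reduces the
    # influence of points far from every prototype (likely outliers).
    # The weighting rule is an assumption, not the paper's adaptive operators.
    import numpy as np

    def robust_fcm(X, c=2, m=2.0, iters=50, seed=0):
        rng = np.random.default_rng(seed)
        U = rng.dirichlet(np.ones(c), size=len(X))   # fuzzy memberships, rows sum to 1
        w = np.ones(len(X))                          # per-point robustness weights
        for _ in range(iters):
            # Prototype update: memberships and robustness weights both scale
            # each point's contribution (plain FCM would use memberships alone).
            Um = (U ** m) * w[:, None]
            V = (Um.T @ X) / Um.sum(axis=0)[:, None]
            # Squared distances to the prototypes.
            d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2) + 1e-12
            # Down-weight points that are far from every prototype.
            nearest = d2.min(axis=1)
            w = np.exp(-nearest / nearest.mean())
            # Standard fuzzy c-means membership update.
            U = 1.0 / d2 ** (1.0 / (m - 1.0))
            U /= U.sum(axis=1, keepdims=True)
        return V, U

    rng = np.random.default_rng(2)
    X = np.vstack([rng.normal((0, 0), 0.3, (60, 2)),
                   rng.normal((3, 3), 0.3, (60, 2)),
                   rng.normal((10, -10), 0.3, (5, 2))])   # last block acts as outliers
    V, U = robust_fcm(X, c=2)
    print(np.round(V, 2))   # prototypes should land near the two dense groups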

    Adaptive Initialization Method Based on Spatial Local Information for k-Means Algorithm

    The k-means algorithm is a widely used clustering algorithm in the data mining and machine learning communities. However, the initial guess of the cluster centers seriously affects the clustering result: improper initialization may not lead to a desirable clustering result. How to choose suitable initial centers is therefore an important research issue for the k-means algorithm. In this paper, we propose an adaptive initialization framework based on spatial local information (AIF-SLI), which takes advantage of the local density of the data distribution. As it is difficult to estimate density exactly, we develop two approximate estimates, density by t-nearest neighborhoods (t-NN) and density by ϵ-neighborhoods (ϵ-Ball), leading to two implementations of the proposed framework. Our empirical study on more than 20 datasets shows the promising performance of the proposed framework and indicates that it has several advantages: (1) it can find reasonable candidates for the initial centers effectively; (2) it can significantly reduce the number of iterations of k-means methods; (3) it is robust to outliers; and (4) it is easy to implement.
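
    A minimal sketch in the spirit of the t-NN density estimate mentioned above: local density is approximated from the distance to the t-th nearest neighbor, and initial centers are picked greedily from dense points far from the centers chosen so far. This is a generic density-guided seeding illustration, not the AIF-SLI framework; the greedy selection rule and parameter values are assumptions.

    # Sketch of density-guided seeding for k-means. The density estimate uses
    # the distance to the t-th nearest neighbor; the greedy "dense and far from
    # existing seeds" rule is an assumption, not the paper's framework.
    import numpy as np
    from sklearn.neighbors import NearestNeighbors
    from sklearn.cluster import KMeans

    def density_seeds(X, k, t=10):
        # Distance to the t-th nearest neighbor; smaller distance = denser region.
        d_t = NearestNeighbors(n_neighbors=t + 1).fit(X).kneighbors(X)[0][:, -1]
        density = 1.0 / (d_t + 1e-12)
        seeds = [int(np.argmax(density))]            # densest point first
        for _ in range(k - 1):
            # Favor points that are both dense and far from the existing seeds.
            d_to_seeds = np.linalg.norm(X[:, None, :] - X[seeds][None, :, :], axis=2).min(axis=1)
            seeds.append(int(np.argmax(density * d_to_seeds)))
        return X[seeds]

    rng = np.random.default_rng(3)
    X = np.vstack([rng.normal(c, 0.4, (80, 2)) for c in ((0, 0), (4, 0), (2, 3))])
    init = density_seeds(X, k=3)
    labels = KMeans(n_clusters=3, init=init, n_init=1).fit_predict(X)
    print(np.bincount(labels))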

    Developing a feature weight self-adjustment mechanism for a K-means clustering algorithm

    K-means is one of the most popular and widespread partitioning clustering algorithms due to its superior scalability and efficiency. Typically, the K-means algorithm treats all features equally and sets the weights of all features to the same value when evaluating dissimilarity. However, a meaningful clustering structure often occurs in a subspace defined by a specific subset of all features. To address this issue, this paper proposes a novel feature weight self-adjustment (FWSA) mechanism embedded into K-means in order to improve its clustering quality. In the FWSA mechanism, finding feature weights is modeled as an optimization problem that simultaneously minimizes the separations within clusters and maximizes the separations between clusters. With this objective, the adjustment margin of a feature weight can be derived from the importance of the feature to the clustering quality. At each iteration of K-means, all feature weights are adaptively updated by adding their respective adjustment margins. A number of synthetic and real datasets are used in experiments to show the benefits of the proposed FWSA mechanism. In addition, when compared with a recent similar feature-weighting work, the proposed mechanism shows several advantages in both the theoretical and experimental results.
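
    A simplified sketch of the self-adjustment idea: after each weighted k-means iteration, every feature's weight is nudged according to how well that feature separates the clusters (between-cluster versus within-cluster dispersion). The ratio-based update and the 0.5 mixing step are assumptions, not the adjustment-margin derivation from the paper.

    # Weighted k-means with a per-iteration feature-weight update based on
    # between-cluster vs. within-cluster dispersion. The specific update rule
    # is an assumption, not the FWSA derivation.
    import numpy as np

    def fwsa_kmeans(X, k, iters=20, seed=0):
        rng = np.random.default_rng(seed)
        centers = X[rng.choice(len(X), size=k, replace=False)]
        w = np.full(X.shape[1], 1.0 / X.shape[1])    # feature weights sum to 1
        for _ in range(iters):
            # Assignment with feature-weighted squared Euclidean distance.
            d2 = ((X[:, None, :] - centers[None, :, :]) ** 2 * w).sum(axis=2)
            labels = d2.argmin(axis=1)
            # Center update (keep the old center if a cluster goes empty).
            centers = np.array([X[labels == c].mean(axis=0) if np.any(labels == c)
                                else centers[c] for c in range(k)])
            # Per-feature between-cluster (b) and within-cluster (a) dispersion.
            mean = X.mean(axis=0)
            b = np.array([(labels == c).sum() * (centers[c] - mean) ** 2
                          for c in range(k)]).sum(axis=0)
            a = np.array([((X[labels == c] - centers[c]) ** 2).sum(axis=0)
                          for c in range(k)]).sum(axis=0)
            # Self-adjustment: move each weight toward the feature's share of b/a.
            ratio = b / (a + 1e-12)
            w = 0.5 * w + 0.5 * ratio / ratio.sum()
        return labels, w

    rng = np.random.default_rng(4)
    informative = np.vstack([rng.normal(c, 0.3, (70, 2)) for c in ((0, 0), (3, 3))])
    noise = rng.normal(size=(140, 3))                # irrelevant features
    labels, w = fwsa_kmeans(np.hstack([informative, noise]), k=2)
    print(np.round(w, 3))   # the two informative features should receive larger weights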