33 research outputs found

    A novel artificial bee colony based clustering algorithm for categorical data

    Funding: This work was supported by the National Natural Science Foundation of China (NSFC) under Grant Nos. 21127010 and 61202309 (http://www.nsfc.gov.cn/), the China Postdoctoral Science Foundation under Grant No. 2013M530956 (http://res.chinapostdoctor.org.cn), the UK Economic & Social Research Council (ESRC) under award reference ES/M001628/1 (http://www.esrc.ac.uk/), the Science and Technology Development Plan of Jilin Province under Grant No. 20140520068JH (http://www.jlkjt.gov.cn), the Fundamental Research Funds for the Central Universities under No. 14QNJJ028 (http://www.nenu.edu.cn), and the open project program of the Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, under Grant No. 93K172014K07 (http://www.jlu.edu.cn). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Peer reviewed.

    Clustering Mixed Numeric and Categorical Data with Cuckoo Search

    An initialization method for clustering mixed numeric and categorical data based on the density and distance

    Most initialization approaches are designed for partitional clustering algorithms that process only categorical or only numerical data. However, in real-world applications, data objects with both numeric and categorical features are ubiquitous. The coexistence of categorical and numerical attributes makes initialization methods designed for single-type data inapplicable to mixed-type data. Furthermore, to the best of our knowledge, the existing partitional clustering algorithms designed for mixed-type data determine their initial cluster centers randomly. In this paper, we propose a novel initialization method for mixed data clustering. In the proposed method, distance and density are exploited together to determine the initial cluster centers. The performance of the proposed method is demonstrated by a series of experiments on three real-world datasets in comparison with traditional initialization methods.

    A Novel Cluster Center Initialization Method for the k-Prototypes Algorithms using Centrality and Distance

    The k-prototypes algorithms are well known for their efficiency in clustering mixed numeric and categorical data. In k-prototypes-type algorithms, the initial cluster centers are often determined randomly. It is acknowledged that the initial placement of cluster centers has a direct impact on the performance of the k-prototypes algorithms. However, most existing initialization approaches are designed for the k-means or k-modes algorithms, which can handle either purely numeric or purely categorical data, but not a mixture of both. In this paper, we propose a novel cluster center initialization method for the k-prototypes algorithms to address this issue. In the proposed method, the centrality of data objects is introduced based on the concept of the neighbor-set, and then centrality and distance are exploited together to determine the initial cluster centers. The performance of the proposed method is demonstrated by a series of experiments in comparison with the traditional random initialization method.

    A Multi-View Clustering Algorithm for Mixed Numeric and Categorical Data

    DeepMRMP: A new predictor for multiple types of RNA modification sites using deep learning

    Optimization of Micellar Electrokinetic Chromatography Method for the Simultaneous Determination of Seven Hydrophilic and Four Lipophilic Bioactive Components in Three Salvia Species

    A micellar electrokinetic chromatography (MEKC) method was developed for the simultaneous determination of seven hydrophilic phenolic acids and four lipophilic tanshinones in three Salvia species. In normal MEKC mode using SDS as the surfactant, the 11 investigated compounds could not be well separated. Therefore, several buffer modifiers, including β-cyclodextrin (β-CD), the ionic liquid 1-butyl-3-methylimidazolium tetrafluoroborate ([bmim]BF4), and organic solvents, were added to the buffer solution to improve the separation selectivity. Under the optimized conditions (BGE, 15 mM sodium tetraborate with 10 mM SDS, 5 mM β-CD, 10 mM [bmim]BF4 and 15% ACN (v/v) as additives; buffer pH, 9.8; voltage, 20 kV; temperature, 25 °C), the 11 analytes achieved baseline separation within 34 min. The proposed MEKC method was further validated by evaluating the linearity (R² ≥ 0.9965), LODs (0.27–1.39 μg·mL⁻¹), and recovery (94.26%–105.17%), demonstrating that the method is reproducible, accurate and reliable. Moreover, the contents of the 11 compounds in three Salvia species, S. miltiorrhiza, S. przewalskii and S. castanea, were analyzed. The results showed that the established MEKC method is simple and practical for the simultaneous determination of the hydrophilic and lipophilic bioactive components in Salvia species, and that it could be used to effectively evaluate the quality of these valued medicinal plants.

    The AC of the four algorithms on the Zoo dataset.
