    Clustering Based on Classification Quality

    Clustering a set of objects into homogeneous classes is a fundamental operation in data mining. Categorical data clustering based on rough set theory has been an active research area in machine learning. However, pure rough set theory is not well suited to analyzing noisy information systems. This paper proposes an alternative technique for categorical data clustering using the Variable Precision Rough Set model, based on that model's notion of classification quality. The technique is implemented in MATLAB. Experimental results on three benchmark UCI datasets indicate that the technique can be successfully used to analyze grouped categorical data because it produces better clustering results. Keywords: clustering; rough set; variable precision rough set model; classification quality
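The abstract does not spell out how classification quality is computed, so the following is only a minimal sketch of the textbook variable precision rough set definition, not the authors' MATLAB implementation. It assumes an error tolerance `beta` in [0, 0.5): an equivalence class of the condition attributes belongs to the positive region when its misclassification error with respect to some decision class is at most `beta`, and the quality is the share of objects in that region. The toy `rows` table and all function names are hypothetical.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(set)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].add(i)
    return list(blocks.values())


def classification_quality(rows, cond_attrs, dec_attr, beta=0.0):
    """Fraction of objects lying in the beta-positive region.

    A condition block E is counted when some decision class X satisfies
    1 - |E & X| / |E| <= beta. With beta < 0.5 at most one decision
    class can qualify per block, so no object is counted twice.
    """
    if not 0.0 <= beta < 0.5:
        raise ValueError("beta must satisfy 0 <= beta < 0.5")
    cond_blocks = partition(rows, cond_attrs)
    dec_blocks = partition(rows, [dec_attr])
    positive = 0
    for block in cond_blocks:
        # Largest overlap between this block and any decision class.
        best = max(len(block & dec) for dec in dec_blocks)
        if 1 - best / len(block) <= beta:
            positive += len(block)
    return positive / len(rows)


# Hypothetical toy table: one noisy object (index 3) in the red/big block.
rows = [
    {"color": "red", "size": "big", "label": "A"},
    {"color": "red", "size": "big", "label": "A"},
    {"color": "red", "size": "big", "label": "A"},
    {"color": "red", "size": "big", "label": "B"},
    {"color": "blue", "size": "small", "label": "B"},
    {"color": "blue", "size": "small", "label": "B"},
]

strict = classification_quality(rows, ["color", "size"], "label", beta=0.0)
tolerant = classification_quality(rows, ["color", "size"], "label", beta=0.3)
print(strict, tolerant)
```

With `beta = 0` only the pure blue/small block counts (2 of 6 objects). With `beta = 0.3` the red/big block, whose error is 0.25, is admitted as well, which illustrates why the variable precision model tolerates noise that classical rough sets reject.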

    Variable precision rough set model for attribute selection on environment impact dataset

    Investigating environmental impact plays an important role in the development of a city. Artificial intelligence, in the form of computational models, can be used to analyze the relevant data; rough set theory is one such model. Data clustering, as a part of rough set theory, can contribute meaningfully to the decision-making process, for example by selecting environmental impact attributes. This paper examines the application of the variable precision rough set model to selecting environmental impact attributes. An approach based on minimum classification error is applied to a survey dataset by exploiting the variable precision of attributes. The paper demonstrates the use of the variable precision rough set model to select the most important impacts of regional development. In the experiment, the availability of public open space, social organization and culture, migration, and rate of employment are selected as the dominant attributes. These findings can contribute to the policy design process by helping to formulate appropriate interventions for enhancing the quality of the social environment.

    Location optimization of biodiesel processing plant based on rough set and clustering algorithm - a case study in China

    Biofuel plays an important role in alleviating environmental pollution, and optimization of the biofuel supply chain has received growing attention in recent years. In this paper, a scientific, rational and practical location for a biodiesel processing plant using waste oil as the raw material is proposed, in order to provide a theoretical basis for guiding the planning and management of restaurants, waste oil collection points, and processing plants. Considering the merits and demerits of subjective and objective weighting methods, the paper proposes a new weighting method that combines rough set theory with a clustering algorithm. It then verifies the location results using plant carbon emissions. Finally, the paper analyzes the location of a biodiesel processing plant in the Yangtze River Delta of China; comparing the RMSE and R² of the Delphi method with those of the improved rough set theory shows that precision is greatly improved with the new method. With this method, the weights of the factors influencing biodiesel processing plants are: waste oil supply 0.143, fixed construction cost factor 0.343, biodiesel demand 0.143, and location convenience 0.371. A comparison between the robust optimization method and the improved rough set theory shows that the final location results are the same, both being Jiaxing City. However, the calculation process of the improved rough set theory is much simpler than that of the robust optimization algorithm.

    A Short Survey on Data Clustering Algorithms

    With rapidly increasing data, clustering algorithms are important tools for data analytics in modern research. They have been successfully applied to a wide range of domains, for instance bioinformatics, speech recognition, and financial analysis. Formally speaking, given a set of data instances, a clustering algorithm is expected to divide the set into subsets that maximize intra-subset similarity and inter-subset dissimilarity, where a similarity measure is defined beforehand. In this work, state-of-the-art clustering algorithms are reviewed from design concept to methodology. Different clustering paradigms are discussed, as are advanced clustering algorithms. After that, the existing clustering evaluation metrics are reviewed. A summary with future insights is provided at the end.
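The formal objective stated in this abstract (high similarity within subsets, low similarity between them) can be made concrete with a small sketch. This is only an illustration under stated assumptions, not a method taken from the survey: similarity is assumed to be negative Euclidean distance, and the helper `intra_inter` and the toy points are hypothetical.

```python
import math
from itertools import combinations


def similarity(a, b):
    """Assumed similarity measure: negative Euclidean distance."""
    return -math.dist(a, b)


def intra_inter(points, labels, sim=similarity):
    """Return (mean intra-cluster similarity, mean inter-cluster similarity)."""
    intra, inter = [], []
    for i, j in combinations(range(len(points)), 2):
        bucket = intra if labels[i] == labels[j] else inter
        bucket.append(sim(points[i], points[j]))

    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    return mean(intra), mean(inter)


# Hypothetical toy data: two well-separated pairs of points.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
good_intra, good_inter = intra_inter(points, [0, 0, 1, 1])  # natural grouping
bad_intra, bad_inter = intra_inter(points, [0, 1, 0, 1])    # mixed grouping
print(good_intra, good_inter)
print(bad_intra, bad_inter)
```

The natural grouping scores higher on intra-cluster similarity and lower on inter-cluster similarity than the mixed grouping, which is the behavior the formal definition asks a clustering algorithm to produce.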