3,528 research outputs found

    Clustering human perception of environment impact using Rough Set Theory

    Rough set theory has been applied in many areas, including data mining. Feature selection and clustering, both common data mining tasks, can contribute to decision support. This paper investigates the application of rough set theory to selecting attributes and clustering environmental impacts. The Maximum Dependency Attribute (MDA) method is used to select the most important impacts, and a fuzzy partition based on the indiscernibility relation is used to cluster the objects on the selected attributes. The data were collected in a field survey identifying the environmental impacts experienced by several communities in Yogyakarta, Indonesia. The results show that water quality is the most important attribute among the physical and chemical aspects. On the economic aspect, the highest-ranked attributes are immigration and employee absorption. The recommended number of clusters is 9, based on the silhouette coefficient, which rises to 0.9. This paper can be used to make recommendations for improving the quality of the social environment.
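
The rough set dependency degree that attribute-ranking methods such as MDA build on can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the attribute names, and the toy survey table are all hypothetical. The dependency degree of a decision attribute set on a condition attribute set is the fraction of objects whose indiscernibility class under the condition attributes falls entirely inside one class of the decision attributes.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes of the
    indiscernibility relation induced by `attrs`."""
    classes = defaultdict(set)
    for i, row in enumerate(rows):
        classes[tuple(row[a] for a in attrs)].add(i)
    return list(classes.values())


def dependency_degree(rows, cond_attrs, dec_attrs):
    """Return |POS| / |U|, where POS collects every condition class
    that is a subset of a single decision class."""
    dec_classes = partition(rows, dec_attrs)
    positive = set()
    for block in partition(rows, cond_attrs):
        if any(block <= d for d in dec_classes):
            positive |= block
    return len(positive) / len(rows)


# Hypothetical toy table (assumed for illustration, not survey data).
rows = [
    {"water": "poor", "noise": "high", "jobs": "few"},
    {"water": "poor", "noise": "low", "jobs": "few"},
    {"water": "good", "noise": "high", "jobs": "many"},
    {"water": "good", "noise": "low", "jobs": "few"},
]

print(dependency_degree(rows, ["water"], ["jobs"]))           # 0.5
print(dependency_degree(rows, ["water", "noise"], ["jobs"]))  # 1.0
```

A ranking in the spirit of MDA would compute such degrees between attributes and prefer the attribute with the highest value; the exact tie-breaking and ranking rules are not given in the abstract.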

    Multiobjective optimization of cluster measures in Microarray Cancer data using Genetic Algorithm Based Fuzzy Clustering

    Biological and biomedical research has changed rapidly with the invention of microarray technology, which facilitates simultaneous monitoring of a large number of genes across different experimental conditions. This report proposes an approach based on a multiobjective genetic algorithm, the Non-Dominated Sorting Genetic Algorithm II (NSGA-II), for fuzzy clustering of microarray cancer expression datasets. The approach encodes the cluster modes and simultaneously optimizes two measures: the fuzzy compactness and the fuzzy separation of the clusters. The multiobjective technique produces a set of non-dominated solutions. The approach then identifies the solution, i.e. the individual chromosome, that gives the optimal value of the parameters.
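
The notion of a non-dominated solution set can be illustrated with a short sketch. It shows only the Pareto filter, not NSGA-II itself (no ranking into fronts, crowding distance, or genetic operators), and the objective values are hypothetical. It assumes compactness is minimized and separation is maximized, so separation is negated to make both objectives minimization targets.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` in every objective and
    strictly better in at least one (all objectives minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (compactness, -separation) pairs for five candidate
# clusterings; the values are assumed for illustration only.
candidates = [
    (1.0, -3.0),
    (2.0, -5.0),
    (2.5, -4.0),  # dominated by (2.0, -5.0)
    (3.0, -6.0),
    (1.5, -2.0),  # dominated by (1.0, -3.0)
]

pareto = non_dominated(candidates)
print(pareto)  # [(1.0, -3.0), (2.0, -5.0), (3.0, -6.0)]
```

Each surviving pair trades compactness against separation, which is why a further selection step is needed to pick a single clustering from the set.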

    Soft Set Theory for Data Reduction

    The recent changes in utility structures, development in renewable technologies and increased. Large amounts of data are stored on computers and over the internet, and more data are stored every day. This poses a problem when we need to use the data, because the data are too numerous and are scattered across the internet. Techniques are therefore required, and have been introduced, to overcome this problem. The discussion covers Knowledge Discovery in Databases, and the technique used is multi-soft sets. The dataset is a set of multi-valued data. By using multi-soft sets, we can reduce the data based on the theory of soft sets.

    Visualization and analytics of codicological data of Hebrew books

    The present dissertation aims to retrieve information and structure from Hebrew manuscripts collected by codicologists. These manuscripts reflect the book production of a specific region, namely the "Sefarad" region, between the 10th and 16th centuries. The goal is to provide a proper data model, using a common vocabulary, to decrease the heterogeneous nature of these datasets as well as their inherent uncertainty, which is caused by the descriptive nature of the field of Codicology. This research project was developed with the goal of applying data visualization and data mining techniques to the field of Codicology and Digital Humanities. Using Hebrew manuscript data as a starting point, this dissertation proposes an environment for exploratory analysis to be used by Humanities experts to deepen their understanding of codicological data, to formulate new research hypotheses or verify existing ones, and to communicate their findings in a richer way. To improve the scope of visualizations and knowledge discovery, we will try to use data mining methods such as Association Rule Mining and Formal Concept Analysis.

    Networks of Emotion Concepts

    The aim of this work was to study the similarity network and hierarchical clustering of Finnish emotion concepts. Native speakers of Finnish evaluated similarity between the 50 most frequently used Finnish words describing emotional experiences. We hypothesized that methods developed within network theory, such as identifying clusters and specific local network structures, can reveal structures that would be difficult to discover using traditional methods such as multidimensional scaling (MDS) and ordinary cluster analysis. The concepts divided into three main clusters, which can be described as negative, positive, and surprise. Negative and positive clusters divided further into meaningful sub-clusters, corresponding to those found in previous studies. Importantly, this method allowed the same concept to be a member in more than one cluster. Our results suggest that studying particular network structures that do not fit into a low-dimensional description can shed additional light on why subjects evaluate certain concepts as similar. To encourage the use of network methods in analyzing similarity data, we provide the analysis software for free use (http://www.becs.tkk.fi/similaritynets/)

    INTEGRATING KANO MODEL WITH DATA MINING TECHNIQUES TO ENHANCE CUSTOMER SATISFACTION

    The business world is becoming steadily more competitive, so businesses are forced to improve their strategies in every aspect. Determining the elements that contribute to clients' contentment is therefore one of the critical needs of businesses that want to develop successful products. The Kano model is one of the models that help determine which features must be included in a product or service to improve customer satisfaction. The model focuses on highlighting the most relevant attributes of a product or service, along with customers' estimation of how these attributes can be used to predict satisfaction with specific services or products. This research aims at developing a method to integrate the Kano model and data mining approaches to select relevant attributes that drive customer satisfaction, with a specific focus on higher education. The significant contribution of this research is to improve the quality of the academic support and development services that United Arab Emirates University provides to its students. It does so by solving the problem of selecting features that are not methodically correlated to customer satisfaction, which could reduce the risk of investing in features that are ultimately irrelevant to enhancing customer satisfaction. Questionnaire data were collected from 646 students at United Arab Emirates University. The experiment suggests that Extreme Gradient Boosting Regression can produce the best results for this kind of problem. Based on the integration of the Kano model and the feature selection method, the number of features used to predict customer satisfaction is reduced to four. It was found that integrating either the Chi-Square or the Analysis of Variance (ANOVA) feature selection model with the Kano model gives higher values of the Pearson correlation coefficient and R². Moreover, prediction using the union of the Kano model's most important features and the most frequent features among 8 clusters shows high-performance results.