39,262 research outputs found

    Association rules in data mining

    The problem of building models based on association rules is considered. The process of mining association rules is analyzed. Various types of association rules (negative, quantitative, generalized, temporal, and fuzzy association rules) are investigated for solving data mining problems.

    A fuzzy approach for mining quantitative association rules

    During the last ten years, data mining, also known as knowledge discovery in databases, has established its position as a prominent and important research area. Mining association rules is one of the important research problems in data mining. Many algorithms have been proposed to find association rules in databases with quantitative attributes. The algorithms usually discretize the attribute domains into sharp intervals, and then apply simpler algorithms developed for boolean attributes. An example of a quantitative association rule might be "10% of married people between age 50 and 70 have at least 2 cars". Recently, fuzzy sets were suggested to represent intervals with non-sharp boundaries. Using the fuzzy concept, the above example could be rephrased e.g. "10% of married old people have several cars". However, if the fuzzy sets are not well chosen, anomalies may occur. In this paper we tackle this problem by introducing an additional fuzzy normalization process. Then we present the definition of quantitative association rules based on fuzzy set theory and propose a new algorithm for mining fuzzy association rules. The algorithm uses generalized definitions for interest measures. Experimental results show the efficiency of the algorithm for large databases
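
    The rephrased rule "married old people have several cars" can be illustrated with a minimal sketch of fuzzy support and confidence. Everything below is an assumption made for illustration, not the paper's algorithm: the linear `ramp` membership functions, their breakpoints, the use of `min` to combine membership degrees, and the four sample records. The normalization step mentioned in the abstract is not modelled.

```python
def ramp(x, lo, hi):
    """Membership degree rising linearly from 0 at lo to 1 at hi (assumed shape)."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)

# Hypothetical fuzzy sets; the breakpoints are illustrative only.
def married(r):
    return float(r["married"])          # crisp attribute: 0 or 1

def old(r):
    return ramp(r["age"], 50, 70)

def several_cars(r):
    return ramp(r["cars"], 1, 3)

def fuzzy_support(records, memberships):
    """Average, over all records, of the weakest membership degree (min t-norm)."""
    total = sum(min(m(r) for m in memberships) for r in records)
    return total / len(records)

def fuzzy_confidence(records, antecedent, consequent):
    """Support of antecedent plus consequent, divided by support of the antecedent."""
    s_ante = fuzzy_support(records, antecedent)
    if s_ante == 0:
        return 0.0
    return fuzzy_support(records, antecedent + consequent) / s_ante

# Made-up sample data.
records = [
    {"married": 1, "age": 72, "cars": 3},
    {"married": 1, "age": 55, "cars": 1},
    {"married": 0, "age": 65, "cars": 2},
    {"married": 1, "age": 60, "cars": 2},
]

support = fuzzy_support(records, [married, old, several_cars])
confidence = fuzzy_confidence(records, [married, old], [several_cars])
print(f"support={support:.3f} confidence={confidence:.3f}")
```

    Unlike sharp intervals, a 55-year-old here contributes partially (degree 0.25) to "old" instead of being counted fully or not at all.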

    Data mining and association rules to determine twitter trends

    Opinion mining has been widely studied in the last decade because of its research interest and its countless real-world applications. This research proposes a system that combines association rules, generalization of rules, and sentiment analysis to catalog and discover opinion trends in Twitter [1]. The sentiment analysis is used to favor the generalization of the association rules. To this end, an initial set of 1.6 million tweets, captured in an undirected way, is first reduced through text mining to an input set of 158,354 tweets for the rule-mining and sentiment-analysis algorithms. From this reduced set, easily interpretable standard and generalized sets of rules are obtained about characters, which emerged as an interesting result of the system.

    Digging deep into weighted patient data through multiple-level patterns

    Large data volumes have been collected by healthcare organizations at an unprecedented rate. Today both physicians and healthcare system managers are very interested in extracting value from such data. Nevertheless, the increasing data complexity and heterogeneity prompt the need for new, efficient, and effective data mining approaches to analyzing large patient datasets. Generalized association rule mining algorithms can be exploited to automatically extract hidden multiple-level associations among patient data items (e.g., examinations, drugs) from large datasets equipped with taxonomies. However, current approaches assume that all data items are equally relevant within each transaction, an assumption that rarely holds. This paper presents a new data mining environment targeted at patient data analysis. It tackles the issue of extracting generalized rules from weighted patient data, where items may be weighted differently according to their importance within each transaction. To this aim, it proposes a novel type of association rule, namely the Weighted Generalized Association Rule (W-GAR). The usefulness of the proposed pattern has been evaluated on real patient datasets equipped with a taxonomy built over examinations and drugs. The achieved results demonstrate the effectiveness of the proposed approach in mining interesting and actionable knowledge in a real medical care scenario.

    User-Driven Pattern Mining on knowledge graphs: an Archaeological Case Study

    In recent years, there has been a growing interest from the Digital Humanities in knowledge graphs as a data modelling paradigm. Already, many data sets have been published as such and are available in the Linked Open Data cloud. With it, the nature of these data has shifted from unstructured to structured. This presents new opportunities for data mining. In this work, we investigate to what extent data mining can contribute to the understanding of archaeological knowledge, expressed as a knowledge graph, and which form would best meet the communities' needs. A case study was held which involved the user-driven mining of generalized association rules. Experiments have shown that the approach yielded mostly plausible patterns, some of which were seen as highly relevant by domain experts.

    Interestingness measure on privacy preserved data with horizontal partitioning

    Association rule mining is the process of finding frequent item sets based on an interestingness measure. A major challenge arises when mining associations from data where privacy preservation is emphasized. The actual transaction data provides the evidence needed to calculate the parameters that define the association rules. In this paper, a solution is proposed for finding one such parameter, the support count of item sets, on non-transparent data, that is, data whose transactions are not disclosed. Privacy preservation is ensured by transferring x-anonymous records for every transaction record. Each anonymous set of an actual transaction record carries highly generalized values. The clients process the anonymous set of every transaction record to arrive at highly abstract values, and these generalized values are used for the support calculation. The more anonymous records there are, the stronger the privacy of the data. Experimental results show that privacy is ensured with a larger number of formatted transactions.

    SCARF: A Biomedical Association Rule Finding Webserver

    The analysis of enormous datasets with missing data entries is a standard task in biological and medical data processing. Large-scale, multi-institution clinical studies are typical examples of such datasets. These sets make it possible to search for multi-parametric relations, since with so much data one is likely to find a sufficient number of subjects with the required parameter ensembles. Specifically, finding combinatorial biomarkers for a given condition also requires a very large dataset to analyze. For fast and automatic multi-parametric relation discovery, association-rule finding tools have been used for more than two decades in the data-mining community. Here we present the SCARF webserver for generalized association rule mining. Association rules are of the form: a AND b AND … AND x → y, meaning that the presence of properties a AND b AND … AND x implies property y; our algorithm finds generalized association rules, since it also finds logical disjunctions (i.e., ORs) on the left-hand side, allowing the discovery of more complex rules in a more compressed form in the database. This feature also helps reduce the typically very large result tables of such studies, since allowing ORs on the left-hand side of a single rule could cover dozens of classical rules. The capabilities of the SCARF algorithm were demonstrated in mining the Alzheimer’s database of the Coalition Against Major Diseases (CAMD) in our recent publication (Archives of Gerontology and Geriatrics Vol. 73, pp. 300–307, 2017). Here we describe the webserver implementation of the algorithm.
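
    A rule with ORs on the left-hand side can be evaluated as in the sketch below. This is not the SCARF algorithm: it assumes, for illustration only, that the left-hand side is a conjunction of OR-clauses, that each record is a set of present properties, and that missing entries are ignored. The names `lhs_holds` and `rule_stats` and the sample records are hypothetical.

```python
def lhs_holds(record, clauses):
    """True if every clause has at least one of its properties present in the record."""
    return all(any(prop in record for prop in clause) for clause in clauses)

def rule_stats(records, clauses, consequent):
    """Return (support, confidence) of the rule: clauses -> consequent."""
    matching = [r for r in records if lhs_holds(r, clauses)]
    hits = sum(1 for r in matching if consequent in r)
    support = hits / len(records)
    confidence = hits / len(matching) if matching else 0.0
    return support, confidence

# Made-up records: each set lists the properties observed for one subject.
records = [
    {"a", "c", "y"},
    {"b", "c", "y"},
    {"b", "c"},
    {"a"},
    {"c", "y"},
]

# (a OR b) AND c -> y, which stands in for the two classical rules
# a AND c -> y and b AND c -> y.
clauses = [{"a", "b"}, {"c"}]
support, confidence = rule_stats(records, clauses, "y")
print(f"support={support:.2f} confidence={confidence:.2f}")
```

    The single disjunctive rule here replaces two conjunctive ones; with more alternatives per clause the saving grows multiplicatively, which is consistent with the abstract's remark about compressing result tables.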