7 research outputs found

    Finding Exception For Association Rules Via SQL Queries

    Finding association rules is mainly based on generating larger and larger frequent set candidates, starting from frequent attributes in the database. The frequent sets can be organised as a part of a lattice of concepts according to the Formal Concept Analysis approach. Since the lattice construction is database contents-dependent, the pseudo-intents (see Formal Concept Analysis) are avoided. Association rules between concept intents (closed sets) A=>B are partial implication rules, meaning that there is some data supporting A and (not B); fully explaining the data requires finding exceptions for the association rules. The approach applies to Oracle databases, via SQL queries
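    The idea of a partial implication A=>B and its exceptions can be sketched with plain SQL counts. The example below is an illustration only, not the paper's method: it uses SQLite through Python's `sqlite3` module as a stand-in for Oracle, and the table `baskets`, the 0/1 columns `bread` and `butter`, and the helper `rule_stats` are all hypothetical names.

```python
import sqlite3

# Toy data (hypothetical): one row per record, attributes stored as 0/1 flags.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE baskets (id INTEGER PRIMARY KEY, bread INTEGER, butter INTEGER)"
)
conn.executemany(
    "INSERT INTO baskets VALUES (?, ?, ?)",
    [(1, 1, 1), (2, 1, 1), (3, 1, 0), (4, 0, 1), (5, 1, 1)],
)


def rule_stats(conn, table, a, b):
    """Return (support, confidence, exception_ids) for the rule a => b.

    Table and column names are interpolated into the SQL text, so they
    must come from trusted code, never from user input.
    """
    total = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    n_a = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {a} = 1"
    ).fetchone()[0]
    n_ab = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {a} = 1 AND {b} = 1"
    ).fetchone()[0]
    # Exceptions are the rows that support A and (not B).
    rows = conn.execute(
        f"SELECT id FROM {table} WHERE {a} = 1 AND {b} = 0 ORDER BY id"
    ).fetchall()
    support = n_ab / total if total else 0.0
    confidence = n_ab / n_a if n_a else 0.0
    return support, confidence, [r[0] for r in rows]


support, confidence, exceptions = rule_stats(conn, "baskets", "bread", "butter")
print(support, confidence, exceptions)
```

    A confidence below 1 means the rule is only a partial implication; the returned ids are the records that would need a separate explanation.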

    Mineração de dados para modelagem de dependência usando algoritmos genéticos (Data mining for dependency modeling using genetic algorithms)

    Dissertation (master's) - Universidade Federal de Santa Catarina, Centro Tecnológico. Programa de Pós-Graduação em Ciência da Computação. The challenge in the field of Knowledge Discovery in Databases (KDD) is to analyze the large mass of available information efficiently and automatically, extracting useful knowledge. In this work, we present GenMiner, a data mining tool for the task of dependency modeling. A genetic algorithm, an optimization method from evolutionary computation, was developed to discover interesting rules in relational databases. Rules are evaluated individually, favoring rules that have high accuracy and, preferably, are surprising. Integration with relational databases was made possible by encoding the chromosomes as expressions in the SQL language. GenMiner was evaluated using a public-domain database containing information about various countries and their flags
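    One way to picture chromosomes encoded as SQL expressions is sketched below. This is not GenMiner's actual encoding, which the abstract does not describe: the gene layout `(column, operator, value)`, the helpers `to_sql` and `accuracy`, and the toy `flags` table with made-up rows are all assumptions for illustration. Only the fitness side (rule accuracy) is shown, not the genetic operators.

```python
import sqlite3

# Toy table (hypothetical rows, not the real flags data set).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flags (country TEXT, red INTEGER, stripes INTEGER, continent TEXT)"
)
conn.executemany(
    "INSERT INTO flags VALUES (?, ?, ?, ?)",
    [
        ("c1", 1, 3, "Europe"),
        ("c2", 1, 0, "Europe"),
        ("c3", 0, 2, "Asia"),
        ("c4", 1, 5, "Asia"),
    ],
)


def to_sql(table, antecedent, consequent=None):
    """Render genes (column, operator, value) as a parameterised COUNT query.

    Assumes a non-empty antecedent and trusted column/operator strings;
    only the values are passed as bound parameters.
    """
    genes = list(antecedent) + ([consequent] if consequent else [])
    where = " AND ".join(f"{col} {op} ?" for col, op, _ in genes)
    sql = f"SELECT COUNT(*) FROM {table} WHERE {where}"
    return sql, [value for _, _, value in genes]


def accuracy(conn, table, antecedent, consequent):
    """Fraction of rows matching the antecedent that also match the consequent."""
    sql_a, params_a = to_sql(table, antecedent)
    sql_ac, params_ac = to_sql(table, antecedent, consequent)
    covered = conn.execute(sql_a, params_a).fetchone()[0]
    correct = conn.execute(sql_ac, params_ac).fetchone()[0]
    return correct / covered if covered else 0.0


# Chromosome for the rule: red = 1 AND stripes > 0  =>  continent = 'Europe'
antecedent = [("red", "=", 1), ("stripes", ">", 0)]
consequent = ("continent", "=", "Europe")
rule_sql, rule_params = to_sql("flags", antecedent, consequent)
rule_accuracy = accuracy(conn, "flags", antecedent, consequent)
print(rule_sql, rule_params, rule_accuracy)
```

    Because each chromosome maps directly to a query, the database itself does the counting needed to score a rule, which is one plausible reading of the integration the abstract mentions.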

    From purchase, usage, to upgrade — Consumer analytics using large scale transactional data

    The amount of data businesses are collecting about their customers is staggering. Firms can now easily track and record past purchases, product usage patterns, and customers’ responses to marketing campaigns and promotion programs. If fully analyzed, such rich transaction data offers companies the opportunity to understand what drives customers’ purchase decisions, how to improve their shopping experience, and how to develop and retain loyal customers. My dissertation addresses these issues by applying consumer analytics, including association rule mining, survival analysis, econometrics, and optimization, on large-scale transactional data to help companies better understand, predict, and subsequently influence the consumption behavior of their customers. My dissertation comprises three essays. The first essay utilizes multi-level association rule mining to predict project-oriented purchases. In the second essay, I propose an Expo-Decay proportional hazard model and use customers’ adoptions and usage of previous product generations to predict their upgrade behaviors for the current product generation. In the third essay, a time-based dynamic synchronization policy is applied for the maintenance of consolidated data repository under an infinite planning horizon. In these essays, I apply and extend a variety of business analytics tools including data mining (association rule mining and collaborative filtering), survival analysis, dynamic programming, simulation, and econometric models. These essays contribute to the consumer analytics literature and can help firms maintain high-quality data assets and make informed decisions on cross-generation product development, product promotion and recommendation, and customer retention

    Mining Generalized Association Rules and Sequential Patterns Using SQL Queries

    Database integration of mining is becoming increasingly important with the installation of larger and larger data warehouses built around relational database technology. Most of the commercially available mining systems integrate loosely (typically, through an ODBC or SQL cursor interface) with data stored in DBMSs. In cases where the mining algorithm makes multiple passes over the data, it is also possible to cache the data in flat files rather than retrieve multiple times from the DBMS, to achieve better performance. Recent studies have found that for association rule mining, with carefully tuned SQL formulations it is possible to achieve performance comparable to systems that cache the data in files outside the DBMS. The SQL implementation has potential for offering other qualitative advantages like automatic parallelization, development ease, portability and inter-operability with relational operators. In this paper, we present several alternatives for formulating as ..
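    As a rough illustration of what an SQL formulation of support counting can look like, the sketch below counts frequent item pairs with a self-join. It is a generic textbook-style query, not one of the formulations from the paper (the abstract is truncated before it names them). The `sales(tid, item)` table, its rows, and `frequent_pairs` are hypothetical, and each `(tid, item)` pair is assumed to occur at most once.

```python
import sqlite3

# Toy transactions in (transaction id, item) form; rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (tid INTEGER, item TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [
        (1, "beer"), (1, "chips"),
        (2, "beer"), (2, "chips"), (2, "salsa"),
        (3, "chips"), (3, "salsa"),
        (4, "beer"),
    ],
)

# Self-join on tid pairs up items bought together; a.item < b.item keeps
# each unordered pair once. HAVING applies the minimum support threshold.
PAIR_QUERY = """
SELECT a.item AS item1, b.item AS item2, COUNT(*) AS support
FROM sales AS a
JOIN sales AS b ON a.tid = b.tid AND a.item < b.item
GROUP BY a.item, b.item
HAVING COUNT(*) >= ?
ORDER BY item1, item2
"""


def frequent_pairs(conn, min_support):
    """Return [(item1, item2, support)] for pairs meeting min_support."""
    return conn.execute(PAIR_QUERY, (min_support,)).fetchall()


pairs = frequent_pairs(conn, 2)
print(pairs)
```

    Here the whole counting pass runs inside the database engine, so no transaction data has to be copied out through a cursor.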

    Data mining query language design and implementation.

    Xiaolei Yuan. Thesis submitted in: December 2003. Thesis (M.Phil.)--Chinese University of Hong Kong, 2004. Includes bibliographical references (leaves 95-101). Abstracts in English and Chinese.
    Chapter 1: Introduction
        1.1 Background
            1.1.1 Data Mining: A New Wave of Database Applications
            1.1.2 Association Rule Mining
        1.2 Motivation
        1.3 Main Contribution
        1.4 Thesis Organization
    Chapter 2: Literature Review
        2.1 Data mining and association rule mining
        2.2 Integration data mining with DBMS
        2.3 Query language design for association rule mining
        2.4 Unified data mining models
        2.5 Other topics
    Chapter 3: A New Data Mining Query Language M2MQL
        3.1 Simple item-based association rule
            3.1.1 One rule set
            3.1.2 Rule set and Source data set
            3.1.3 New rule sets from existing ones
        3.2 Generalized item-based association rules
        3.3 CREATE RULE and SELECT RULE Primitive
    Chapter 4: The Algebra in M2MQL
        4.1 Review of nested relations
            4.1.1 Concepts of nested relation
            4.1.2 Nested relation and association rule mining
        4.2 Nested relational algebra
        4.3 Specific data mining algebra
            4.3.1 POWERSET p
            4.3.2 SET-CONTAINMENT-JOIN xc
            4.3.3 Functional operators
    Chapter 5: Mining On Top of M2MQL
        5.1 Problem statement
        5.2 Frequency Counting Phase
        5.3 Frequent Itemset Generation Phase
        5.4 Rule Generation Phase
        5.5 Summary
    Chapter 6: Conclusions and Future Work
        6.1 What we have achieved
        6.2 What is ahead
            6.2.1 Issues of Query Optimization
            6.2.2 Issues of Expanding Table Forms
    Chapter A: General Syntax of M2MQL
    Chapter B: Syntax and Example for MSQL
        B.1 Syntax of MSQL
        B.2 Example
    Chapter C: Syntax and Example for MINE RULE
        C.1 syntax of MINE RULE
        C.2 Example
            C.2.1 Counting Groups
            C.2.2 Making Couples of Clusters
            C.2.3 Extracting Bodies
            C.2.4 Extracting Rules
    Bibliography

    Mineração de dados de um plano de saúde para obter regras de associação (Mining health plan data to obtain association rules)

    Dissertation (master's) - Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Engenharia de Produção, Florianópolis, 2000. Organizations are investing more and more in exploiting the information and knowledge contained in the data from their activities. Data mining is a set of techniques for obtaining information that cannot be obtained through conventional queries. One of these techniques is called association rule mining. Association rules are expressions that indicate affinity or correlation among data. This work evaluates the potential usefulness of the APRIORI algorithm, an association rule inducer, by applying it to data from a health plan, presenting the results obtained and analyzing their meaning
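    For readers unfamiliar with APRIORI, the level-wise search for frequent itemsets can be sketched as follows. This is a minimal textbook-style version, not the implementation evaluated in the dissertation; the function name `apriori`, the absolute-count threshold `min_support`, and the toy transactions (item names such as "consult" and "xray") are assumptions for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: count} for itemsets occurring in >= min_support transactions."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        prev = list(current)
        candidates = set()
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        for i in range(len(prev)):
            for j in range(i + 1, len(prev)):
                union = prev[i] | prev[j]
                # Prune step: every (k-1)-subset must itself be frequent.
                if len(union) == k and all(
                    frozenset(sub) in current for sub in combinations(union, k - 1)
                ):
                    candidates.add(union)
        # Count step: one pass over the data per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Toy transactions (made up): each set lists the services in one record.
records = [
    {"consult", "xray"},
    {"consult", "lab"},
    {"consult", "xray", "lab"},
    {"xray", "lab"},
    {"consult", "xray"},
]
frequent = apriori(records, min_support=3)
print(frequent)
```

    Rules such as consult => xray would then be derived from the frequent itemsets by comparing the count of the pair with the count of its antecedent.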