
2 research outputs found

Three-step method for tightly integrating data mining tasks into a relational database system (Método Tres-Pasos para integrar fuertemente tareas de minería de datos en un sistema de base de datos relacional)

Author: Timarán-Pereira Ricardo
Publication date: 28/03/2014

This paper presents one result of a research project that aimed to define new algebraic operators and new SQL primitives for knowledge discovery in an architecture tightly coupled with a Relational Database Management System (RDBMS). To facilitate tight coupling and to support data mining tasks inside the RDBMS engine, a three-step method is proposed. In the first step, relational algebra is extended with new algebraic operators that support the most computationally expensive processes of data mining tasks. In the second step, so that SQL remains relationally complete, these operators are defined as new SQL primitives in the SELECT clause. In the last step, these primitives are unified into a new SQL operator that runs a specific data mining task. Applying this method, new algebraic operators, new SQL primitives, and new SQL operators for the association and classification tasks were defined and implemented inside the PostgreSQL DBMS engine, giving it the capacity to discover association and classification rules efficiently.
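As a rough illustration of the association task named in the abstract, the following Python sketch counts itemset support by brute force and derives rules above a confidence threshold. It does not reproduce the paper's algebraic operators or SQL primitives, which this record does not define; the function names, thresholds, and sample transactions are all assumptions made for the example.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for every itemset seen in at least
    min_support transactions (brute-force enumeration, small inputs only)."""
    counts = {}
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return {s: c for s, c in counts.items() if c >= min_support}

def association_rules(frequent, min_confidence):
    """Derive (antecedent, consequent, confidence) triples from frequent itemsets.
    Every subset of a frequent itemset is itself frequent, so frequent[a] exists."""
    rules = []
    for itemset, count in frequent.items():
        if len(itemset) < 2:
            continue
        for k in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), k):
                a = frozenset(antecedent)
                confidence = count / frequent[a]
                if confidence >= min_confidence:
                    rules.append((a, itemset - a, confidence))
    return rules

# Hypothetical sample data, not taken from the paper.
transactions = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]
frequent = frequent_itemsets(transactions, min_support=2)
rules = association_rules(frequent, min_confidence=0.9)
```

With this sample, the only rule that reaches the 0.9 confidence threshold is butter implies bread, since both transactions containing butter also contain bread.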

Biblioteca Digital de la Universidad del Valle

Data mining query language design and implementation.

Author: Xiaolei Yuan
Publication date: 01/01/2004

Xiaolei Yuan. Thesis submitted in: December 2003. Thesis (M.Phil.)--Chinese University of Hong Kong, 2004. Includes bibliographical references (leaves 95-101). Abstracts in English and Chinese.

Chapter 1 --- Introduction
  1.1 --- Background
    1.1.1 --- Data Mining: A New Wave of Database Applications
    1.1.2 --- Association Rule Mining
  1.2 --- Motivation
  1.3 --- Main Contribution
  1.4 --- Thesis Organization
Chapter 2 --- Literature Review
  2.1 --- Data mining and association rule mining
  2.2 --- Integration data mining with DBMS
  2.3 --- Query language design for association rule mining
  2.4 --- Unified data mining models
  2.5 --- Other topics
Chapter 3 --- A New Data Mining Query Language M2MQL
  3.1 --- Simple item-based association rule
    3.1.1 --- One rule set
    3.1.2 --- Rule set and Source data set
    3.1.3 --- New rule sets from existing ones
  3.2 --- Generalized item-based association rules
  3.3 --- CREATE RULE and SELECT RULE Primitive
Chapter 4 --- The Algebra in M2MQL
  4.1 --- Review of nested relations
    4.1.1 --- Concepts of nested relation
    4.1.2 --- Nested relation and association rule mining
  4.2 --- Nested relational algebra
  4.3 --- Specific data mining algebra
    4.3.1 --- POWERSET p
    4.3.2 --- SET-CONTAINMENT-JOIN xc
    4.3.3 --- Functional operators
Chapter 5 --- Mining On Top of M2MQL
  5.1 --- Problem statement
  5.2 --- Frequency Counting Phase
  5.3 --- Frequent Itemset Generation Phase
  5.4 --- Rule Generation Phase
  5.5 --- Summary
Chapter 6 --- Conclusions and Future Work
  6.1 --- What we have achieved
  6.2 --- What is ahead
    6.2.1 --- Issues of Query Optimization
    6.2.2 --- Issues of Expanding Table Forms
Chapter A --- General Syntax of M2MQL
Chapter B --- Syntax and Example for MSQL
  B.1 --- Syntax of MSQL
  B.2 --- Example
Chapter C --- Syntax and Example for MINE RULE
  C.1 --- syntax of MINE RULE
  C.2 --- Example
    C.2.1 --- Counting Groups
    C.2.2 --- Making Couples of Clusters
    C.2.3 --- Extracting Bodies
    C.2.4 --- Extracting Rules
Bibliography
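The outline names two mining-specific operators, POWERSET and SET-CONTAINMENT-JOIN, but this record does not define them. As a hedged sketch under the usual set-theoretic meaning of those terms (not the thesis's own definitions), the power set enumerates every subset of an item set, and a set-containment join pairs each candidate itemset with the transactions whose item set contains it. The function names and sample data below are hypothetical.

```python
from itertools import chain, combinations

def powerset(items):
    """All subsets of items (including the empty set), as frozensets."""
    s = sorted(set(items))
    subsets = chain.from_iterable(combinations(s, k) for k in range(len(s) + 1))
    return [frozenset(c) for c in subsets]

def set_containment_join(candidates, transactions):
    """Pairs (candidate, tid) such that candidate is a subset of the
    item set stored under transaction id tid."""
    return [
        (c, tid)
        for c in candidates
        for tid, items in transactions.items()
        if c <= set(items)
    ]

def support_counts(candidates, transactions):
    """Count, per candidate itemset, how many transactions contain it."""
    counts = {c: 0 for c in candidates}
    for c, _tid in set_containment_join(candidates, transactions):
        counts[c] += 1
    return counts

# Hypothetical nested relation: transaction id -> set of items.
transactions = {1: {"a", "b"}, 2: {"a", "c"}, 3: {"a", "b", "c"}}
candidates = [frozenset({"a", "b"}), frozenset({"c"})]
counts = support_counts(candidates, transactions)
```

Grouping the join result by candidate gives the support of each itemset, which is the kind of quantity a frequency counting phase would need; how the thesis actually composes its operators is not stated in this record.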

CUHK Digital Repository