Search CORE

7 research outputs found

Data Mining in Databases: Languages and Indices

Author: Elena Baralis
Meo Rosa
Silvia Chiusano
Tania Cerquitelli
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018
Field of study

Institutional Research Information System University of Turin

Optimization of Association Rules Extraction Through Exploitation of Context Dependent Constraints

Author: Botta M.
Esposito R.
Gallo A.
Meo R.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2005
Field of study

Institutional Research Information System University of Turin

Query Rewriting in Itemset Mining

Author: Botta Marco
Esposito Roberto
Meo Rosa
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2004
Field of study

Abstract. In recent years, researchers have begun to study inductive databases, a new generation of databases for leveraging decision support applications. In this context, the user interacts with the DBMS using advanced, constraint-based languages for data mining where constraints have been specifically introduced to increase the relevance of the results and, at the same time, to reduce its volume. In this paper we study the problem of mining frequent itemsets using an inductive database 1 . We propose a technique for query answering which consists in rewriting the query in terms of union and intersection of the result sets of other queries, previously executed and materialized. Unfortunately, the exploitation of past queries is not always applicable. We then present sufficient conditions for the optimization to apply and show that these conditions are strictly connected with the presence of functional dependencies between the attributes involved in the queries. We show some experiments on an initial prototype of an optimizer which demonstrates that this approach to query answering is not only viable but in many practical cases absolutely necessary since it reduces drastically the execution time

CiteSeerX

Institutional Research Information System University of Turin

Optimization of Association Rules Extraction Through Exploitation of Context Dependent Constraints

Author: Arianna Gallo
Marco Botta
Roberto Esposito
Rosa Meo
Publication venue
Publication date: 10/04/2020
Field of study

Abstract. In recent years, the KDD process has been advocated to be an iterative and interactive process. It is seldom the case that a user is able to answer immediately with a single query all his questions on data. On the contrary, the workflow of the typical user consists in several steps, in which he/she iteratively refines the extracted knowledge by inspecting previous results and posing new queries. Given this view of the KDD process, it becomes crucial to have KDD systems that are able to exploit past results thus minimizing computational effort. This is expecially true in environments in which the system knowledge base is the result of many discoveries on data made separately by the collaborative effort of different users. In this paper, we consider the problem of mining frequent association rules from database relations. We model a general, constraint-based, mining language for this task and study its properties w.r.t. the problem of re-using past results. In particular, we individuate two class of query constraints, namely "item dependent" and "context dependent" ones, and show that the latter are more difficult than the former ones. Then, we propose two newly developed algorithms which allow the exploitation of past results in the two cases. Finally, we show that the approach is both effective and viable by experimenting on some datasets

CiteSeerX

Query rewriting in itemset mining

Author: Marco Botta
Roberto Esposito
Rosa Meo
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2004
Field of study

CiteSeerX

Exploiting succinct constraints using FP-trees

Author: Agrawal R.
Agrawal R.
Bayardo R. J.
Carson Kai-Sang Leung
Garofalakis M. N.
Laks V. S. Lakshmanan
Pei J.
Raymond T. Ng
Silverstein C.
Srikant R.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date
Field of study

Crossref

New approaches to weighted frequent pattern mining

Author: Yun Unil
Publication venue: Texas A&M University
Publication date: 25/04/2007
Field of study

Researchers have proposed frequent pattern mining algorithms that are more efficient than previous algorithms and generate fewer but more important patterns. Many techniques such as depth first/breadth first search, use of tree/other data structures, top down/bottom up traversal and vertical/horizontal formats for frequent pattern mining have been developed. Most frequent pattern mining algorithms use a support measure to prune the combinatorial search space. However, support-based pruning is not enough when taking into consideration the characteristics of real datasets. Additionally, after mining datasets to obtain the frequent patterns, there is no way to adjust the number of frequent patterns through user feedback, except for changing the minimum support. Alternative measures for mining frequent patterns have been suggested to address these issues. One of the main limitations of the traditional approach for mining frequent patterns is that all items are treated uniformly when, in reality, items have different importance. For this reason, weighted frequent pattern mining algorithms have been suggested that give different weights to items according to their significance. The main focus in weighted frequent pattern mining concerns satisfying the downward closure property. In this research, frequent pattern mining approaches with weight constraints are suggested. Our main approach is to push weight constraints into the pattern growth algorithm while maintaining the downward closure property. We develop WFIM (Weighted Frequent Itemset Mining with a weight range and a minimum weight), WLPMiner (Weighted frequent Pattern Mining with length decreasing constraints), WIP (Weighted Interesting Pattern mining with a strong weight and/or support affinity), WSpan (Weighted Sequential pattern mining with a weight range and a minimum weight) and WIS (Weighted Interesting Sequential pattern mining with a similar level of support and/or weight affinity) The extensive performance analysis shows that suggested approaches are efficient and scalable in weighted frequent pattern mining

Texas A&M Repository