Incremental View Maintenance For Collection Programming
In the context of incremental view maintenance (IVM), delta query derivation
is an essential technique for speeding up the processing of large, dynamic
datasets. The goal is to generate delta queries that, given a small change in
the input, can update the materialized view more efficiently than via
recomputation. In this work we propose the first solution for the efficient
incrementalization of positive nested relational calculus (NRC+) on bags (with
integer multiplicities). More precisely, we model the cost of NRC+ operators
and classify queries as efficiently incrementalizable if their delta has a
strictly lower cost than full re-evaluation. Then, we identify IncNRC+, a large
fragment of NRC+ that is efficiently incrementalizable and we provide a
semantics-preserving translation that takes any NRC+ query to a collection of
IncNRC+ queries. Furthermore, we prove that incremental maintenance for NRC+ is
within the complexity class NC0 and we showcase how recursive IVM, a technique
that has provided significant speedups over traditional IVM in the case of flat
queries [25], can also be applied to IncNRC+.
Comment: 24 pages (12 pages plus appendix)
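The idea of a delta query on bags with integer multiplicities can be illustrated with a minimal sketch. This is not the paper's NRC+ translation. It assumes a flat equi-join view, bags represented as Python dicts from tuples to signed multiplicities, and a change that affects only one input. The helper names `bag_add`, `join`, and `delta_join` are hypothetical.

```python
from collections import defaultdict


def bag_add(a, b):
    """Add two bags; multiplicities sum and zero entries are dropped."""
    out = defaultdict(int)
    for bag in (a, b):
        for item, mult in bag.items():
            out[item] += mult
    return {item: mult for item, mult in out.items() if mult != 0}


def join(r, s):
    """Join bags of (key, value) pairs on key; multiplicities multiply."""
    out = defaultdict(int)
    for (k1, v1), m1 in r.items():
        for (k2, v2), m2 in s.items():
            if k1 == k2:
                out[(k1, v1, v2)] += m1 * m2
    return {item: mult for item, mult in out.items() if mult != 0}


def delta_join(s, dr):
    """Change to join(r, s) when r becomes r + dr and s is unchanged.

    Because the join is linear in r, the delta only touches dr and s,
    never the (possibly large) old contents of r.
    """
    return join(dr, s)


r = {("a", 1): 1, ("b", 2): 2}
s = {("a", 10): 1, ("b", 20): 1}
# Assumed update: insert ("a", 3) once, delete one copy of ("b", 2).
dr = {("a", 3): 1, ("b", 2): -1}

view = join(r, s)                         # materialized view
view = bag_add(view, delta_join(s, dr))   # incremental refresh
assert view == join(bag_add(r, dr), s)    # matches full recomputation
```

Negative multiplicities let deletions be expressed as ordinary bag additions, which is why the refresh step is a single `bag_add`.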
Comparison of dynamic rule mining algorithms
Thesis (Master)--Izmir Institute of Technology, Computer Engineering, Izmir, 2012. Includes bibliographical references (leaves: 43-46). Text in English; Abstract: Turkish and English. x, 59 leaves.
In real life, new data is constantly added to databases while existing data is modified or deleted. The new challenge of association rule mining is the need to maintain meaningful association rules whenever the databases are updated. Many dynamic algorithms that use different techniques have been proposed in the past to deal with this challenge. However, less work has been done on comparing their performance. This study compares two dynamic rule mining algorithms that have not been compared before: Dynamic Matrix Apriori (DMA) and Fast Update 2 (FUP2). The algorithms are tested on three different datasets to determine their execution time under additions, deletions, and different support thresholds. Our findings reveal that DMA performs better on two of the datasets and FUP2 performs better on the third. The difference in performance between the two algorithms is mainly caused by the nature of the datasets.
A study on incremental mining of frequent patterns
Data generated from both offline and online sources is incremental in nature, and this incremental data causes changes in the underlying database. Mining frequent patterns is costly in changing databases, since it requires scanning the database from the start. Thus, mining of growing databases has been a great concern. To mine growing databases, a new data mining technique called Incremental Mining has emerged. Incremental Mining uses previous mining results to obtain the desired knowledge while reducing mining costs in terms of time and space. This state-of-the-art paper focuses on Incremental Mining approaches and identifies those suited to real-world problems.
Keywords: Data Mining, Frequent Pattern, Incremental Mining, Frequent Pattern Mining, High Utility Mining, Constraint Mining
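The reuse of previous mining results can be sketched as follows. This is a simplified illustration, not one of the surveyed algorithms: it assumes support counts for all itemsets up to size two are retained, so only the new transactions are scanned. Practical incremental miners avoid storing every count. The function names and the sample transactions are hypothetical.

```python
from collections import Counter
from itertools import combinations


def count_itemsets(transactions, max_size=2):
    """Count the support of every itemset of up to max_size items."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            counts.update(combinations(items, size))
    return counts


def frequent(counts, min_count):
    """Return the itemsets whose support count reaches min_count."""
    return {itemset for itemset, c in counts.items() if c >= min_count}


old = [["bread", "milk"], ["bread", "eggs"], ["milk", "eggs"]]
new = [["bread", "milk"], ["bread", "milk", "eggs"]]

counts = count_itemsets(old)          # result of the previous mining run
counts.update(count_itemsets(new))    # scan only the increment; counts add up
result = frequent(counts, min_count=3)

# The incremental counts agree with a scan of the whole database.
assert counts == count_itemsets(old + new)
```

The saving comes from `count_itemsets(new)` touching only the added transactions, while the old database is never rescanned.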
IMPLEMENTATION OF DYNAMIC AND FAST MINING ALGORITHMS ON INCREMENTAL DATASETS TO DISCOVER QUALITATIVE RULES
Association Rule Mining is an important field in knowledge mining that yields the association rules needed for decision making. Mining frequent items is difficult on huge datasets: as the dataset gets bigger, uncovering the rules takes more time and effort. In this paper, techniques for reducing overhead and running time with an IPOC (Incremental Pre-ordered code) tree structure are examined. To mine frequent items from a database, those techniques require efficient data structures. FIN (Frequent itemset-Nodeset) employs a node-set, a new data structure, to extract frequent items, and an IPOC tree to store frequent data incrementally. Different methods have been modified to analyze and assess time and memory use on different datasets. The strategies proposed and implemented show improved performance in rule generation in terms of time and efficiency.