Set-Oriented Mining for Association Rules in Relational Databases

Author: Houtsma M.A.W.
Swami A.
Publication venue: IEEE
Publication date: 01/01/1995

We describe set-oriented algorithms for mining association rules. Such algorithms imply performing multiple joins and may appear to be inherently less efficient than special-purpose algorithms. We develop new algorithms that can be expressed as SQL queries, and discuss the optimization of these algorithms. After analytical evaluation, an algorithm named SETM emerges as the algorithm of choice. SETM uses only simple database primitives, viz. sorting and merge-scan join. SETM is simple, fast and stable over the range of parameter values. The major contribution of this paper is that it shows that at least some aspects of data mining can be carried out by using general query languages such as SQL, rather than by developing specialized black-box algorithms. The set-oriented nature of SETM facilitates the development of extension

University of Twente Research Information

Optimization of query evaluation algorithms

Author: Knuth D.
Pecherer R.M.
Rothnie J.B.
S. Bing Yao
Tsichritzis D.
Wong E.
Publication venue: Association for Computing Machinery (ACM)

Crossref

Semantic Query Optimization for Bottom-Up Evaluation

Author: Godfrey Parke
Gryz J.
Minker Jack
Publication date: 15/10/1998

Semantic query optimization uses semantic knowledge in databases (represented in the form of integrity constraints) to rewrite queries and logic programs for the purpose of more efficient query evaluation. Much work has been done to develop various techniques for optimization. Most of it, however, is only applicable to top-down query evaluation strategies. Moreover, little attention has been paid to the cost of the optimization itself. In this paper, we address the issue of semantic query optimization for bottom-up query evaluation strategies with an emphasis on overall efficiency. We restrict our attention to a single optimization technique, join elimination. We discuss various factors that influence the cost of semantic optimization, and present two abstract algorithms for different optimization approaches. The first one pre-processes a query statically before it is evaluated; the second approach combines query evaluation with semantic optimization using heuristics to achieve the largest possible savings. (Also cross-referenced as UMIACS-TR-95-109)

Digital Repository at the University of Maryland

Set-oriented data mining in relational databases

Author: Houtsma Maurice
Swami Arun
Publication venue: North Holland
Publication date: 01/01/1995

Data mining is an important real-life application for businesses. It is critical to find efficient ways of mining large data sets. In order to benefit from the experience with relational databases, a set-oriented approach to mining data is needed. In such an approach, the data mining operations are expressed in terms of relational or set-oriented operations. Query optimization technology can then be used for efficient processing. In this paper, we describe set-oriented algorithms for mining association rules. Such algorithms imply performing multiple joins and thus may appear to be inherently less efficient than special-purpose algorithms. We develop new algorithms that can be expressed as SQL queries, and discuss optimization of these algorithms. After analytical evaluation, an algorithm named SETM emerges as the algorithm of choice. Algorithm SETM uses only simple database primitives, viz., sorting and merge-scan join. Algorithm SETM is simple, fast, and stable over the range of parameter values. It is easily parallelized and we suggest several additional optimizations. The set-oriented nature of Algorithm SETM makes it possible to develop extensions easily and its performance makes it feasible to build interactive data mining tools for large databases.

CiteSeerX

University of Twente Research Information
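
As a rough illustration of the set-oriented approach described in the abstract above, the following Python sketch grows frequent itemsets by repeatedly joining the current patterns with the sales rows on transaction id, then grouping and counting. It is not the paper's SETM algorithm: the function name `frequent_itemsets`, the `(trans_id, item)` row layout, and the `min_support` parameter are assumptions made for illustration, and an in-memory dictionary lookup stands in for the merge-scan join.

```python
from collections import Counter


def frequent_itemsets(sales, min_support):
    """Return {k: {itemset_tuple: count}} for itemsets with count >= min_support.

    sales is an iterable of (trans_id, item) rows (a hypothetical layout).
    """
    unique_rows = sorted(set(sales))

    # Items per transaction, in sorted order; used as the join partner.
    by_tid = {}
    for tid, item in unique_rows:
        by_tid.setdefault(tid, []).append(item)

    # Level 1 relation: one row per (trans_id, single-item pattern).
    rows = [(tid, (item,)) for tid, item in unique_rows]

    result = {}
    k = 1
    while rows:
        # GROUP BY pattern, HAVING COUNT(*) >= min_support.
        counts = Counter(pattern for _, pattern in rows)
        frequent = {p: c for p, c in counts.items() if c >= min_support}
        if not frequent:
            break
        result[k] = frequent

        # Keep only rows whose pattern is frequent, then join with the
        # sales rows on trans_id, extending each pattern by a larger item
        # so that every itemset is generated in one canonical order.
        rows = [
            (tid, pattern + (item,))
            for tid, pattern in rows
            if pattern in frequent
            for item in by_tid[tid]
            if item > pattern[-1]
        ]
        k += 1
    return result
```

Each pass corresponds to one join plus one grouping step, which is the kind of operation a relational engine could evaluate and optimize itself.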

Algorithm Choice For Multiple-Query Evaluation

Author: Dietz Henry G.
Kang Myong H.
Publication venue: Purdue University (bepress)
Publication date: 01/08/1989

Traditional query optimization concentrates on the optimization of the execution of each individual query. More recently, it has been observed that by considering a sequence of multiple queries some additional high-level optimizations can be performed. Once these optimizations have been performed, each operation is translated into executable code. The fundamental insight in this paper is that significant improvements can be gained by careful choice of the algorithm to be used for each operation. This choice is not merely based on efficiency of algorithms for individual operations, but rather on the efficiency of the algorithm choices for the entire multiple-query evaluation. An efficient procedure for automatically optimizing these algorithm choices is given.

Purdue E-Pubs

Query DAGs: A Practical Paradigm for Implementing Belief-Network Inference

Author: Darwiche A.
Provan G.
Publication date: 01/01/1996

We describe a new paradigm for implementing inference in belief networks, which consists of two steps: (1) compiling a belief network into an arithmetic expression called a Query DAG (Q-DAG); and (2) answering queries using a simple evaluation algorithm. Each node of a Q-DAG represents a numeric operation, a number, or a symbol for evidence. Each leaf node of a Q-DAG represents the answer to a network query, that is, the probability of some event of interest. It appears that Q-DAGs can be generated using any of the standard algorithms for exact inference in belief networks (we show how they can be generated using clustering and conditioning algorithms). The time and space complexity of a Q-DAG generation algorithm is no worse than the time complexity of the inference algorithm on which it is based. The complexity of a Q-DAG evaluation algorithm is linear in the size of the Q-DAG, and such inference amounts to a standard evaluation of the arithmetic expression it represents. The intended value of Q-DAGs is in reducing the software and hardware resources required to utilize belief networks in on-line, real-world applications. The proposed framework also facilitates the development of on-line inference on different software and hardware platforms due to the simplicity of the Q-DAG evaluation algorithm. Interestingly enough, Q-DAGs were found to serve other purposes: simple techniques for reducing Q-DAGs tend to subsume relatively complex optimization techniques for belief-network inference, such as network-pruning and computation-caching. Comment: See http://www.jair.org/ for any accompanying file

arXiv.org e-Print Archive

CiteSeerX
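
To make the linear-time evaluation step mentioned in the abstract above more concrete, here is a minimal Python sketch of evaluating an arithmetic DAG in a single pass. The node encoding (`'num'`, `'ev'`, `'+'`, `'*'`), the function name `evaluate_qdag`, and the tiny one-variable example are assumptions made for illustration; they are not taken from the paper.

```python
import math


def evaluate_qdag(nodes, evidence):
    """Evaluate every node of an arithmetic DAG in one pass.

    nodes must be listed in topological order (children before parents).
    Each node is one of (hypothetical encoding):
      ('num', value)           a numeric constant
      ('ev', name)             an evidence symbol, looked up in `evidence`
      ('+', [child indices])   sum of the children
      ('*', [child indices])   product of the children
    Returns the list of node values, in the same order as `nodes`.
    """
    values = []
    for kind, payload in nodes:
        if kind == 'num':
            values.append(float(payload))
        elif kind == 'ev':
            values.append(float(evidence[payload]))
        elif kind == '+':
            values.append(sum(values[i] for i in payload))
        elif kind == '*':
            values.append(math.prod(values[i] for i in payload))
        else:
            raise ValueError(f"unknown node kind: {kind!r}")
    return values


# Hypothetical one-variable example with Pr(a) = 0.3 and Pr(not a) = 0.7.
# The last node computes 0.3 * e_a + 0.7 * e_not_a.
example_nodes = [
    ('num', 0.3),
    ('num', 0.7),
    ('ev', 'a'),
    ('ev', 'not_a'),
    ('*', [0, 2]),
    ('*', [1, 3]),
    ('+', [4, 5]),
]
example_values = evaluate_qdag(example_nodes, {'a': 1, 'not_a': 0})
```

Because each node is visited once and each edge is read once, the work grows linearly with the size of the DAG, which matches the complexity claim in the abstract.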

New Database Architecture for Smart Query Handler of Spatial Database

Author: Boyal Parthasarathi
Chaki Rituparna
Publication venue: Published by Elsevier Ltd.
Publication date: 31/12/2012

A spatial database system is a database system with additional capabilities for handling spatial data. It also supports spatial data types in its implementation, providing spatial indexing and efficient algorithms for spatial join. Retrieving data values from a spatial database involves searching through a huge repository of data at a high cost. Thus query optimization on a spatial database takes more time than on an RDBMS. The current state of the art shows that during the execution of a query in a spatial database management system (SDBMS), the query optimizer creates all possible query evaluation plans. All plans are equivalent in terms of their final output but vary in their execution cost, that is, the amount of time needed to run. Once the data is retrieved, the query and its plans are deleted from memory to free the space for future use. This is repeated for the next query even if that query has already been executed. This leads to increased storage overhead and execution time. In this paper, a new database architecture is proposed, which uses a buffer-based query optimization technique for faster data retrieval.

Elsevier - Publisher Connector

Main Memory Implementations for Binary Grouping

Author: D. Bitton
D.E. Simmen
G. Graefe
J. Bercken Van den
L.M. Haas
T. Corman
Publication date: 01/01/2005

An increasing number of applications depend on efficient storage and analysis features for XML data. Hence, query optimization and efficient evaluation techniques for the emerging XQuery standard become more and more important. Many XQuery queries require nested expressions. Unnesting them often introduces binary grouping. We introduce several algorithms implementing binary grouping and analyze their time and space complexity. Experiments demonstrate their performance.

CiteSeerX

Crossref

MAnnheim DOCument Server