205 research outputs found

Parallel Association Rule Mining by Data De-Clustering to Support Grid Computing

Author: Chen Pey-Yen
Tseng Frank
Publication venue: AIS Electronic Library (AISeL)
Publication date: 31/12/2005

AIS Electronic Library (AISeL)

Materialized views and data warehouses

Author: Adiba M. E.
Amir A.
Astrahan M. M.
Baekgraard Lars
Blakeley J. A.
Blakeley Jose A.
Buchheit M.
Chen C. M.
Chen Chungmin Melvin
Delis A.
Faloutsos C.
Finkelstein S.
Gray J.
Gupta Ashish
Gupta H.
Hanson E.
Hanson Eric N.
Hellerstein J. M.
Jensen C. S.
Jhingran A.
Larson A.
Mumick In Inderpal
Nick Roussopoulos
Papakonstantinou Y.
Roussopoulos N.
Roussopoulos Nick
Sellis T.
Stamenas Antonios G.
Stonebraker M.
Valduriez Patrick
Zhuge Yue
Publication venue: Association for Computing Machinery (ACM)

Crossref

Implementation and Performance Evaluation of a Parallel Transitive Closure Algorithm on PRISMA/DB

Author: Flokstra J.
Houtsma M.A.W.
Wilschut A.N.
Publication venue: Morgan Kaufmann Publishers Inc.
Publication date: 01/01/1992

This paper is one of the first to discuss actual implementation of and experimentation with parallel transitive closure operations on a full-fledged relational database system. It brings two research efforts together: the development of an efficient execution strategy for parallel computation of path problems, called the Disconnection Set Approach, and the development and implementation of a parallel, main-memory DBMS, called PRISMA/DB. First, we report on the implementation of the disconnection set approach on PRISMA/DB, showing how the latter's design allowed us to easily extend the functionality of the system. Second, we investigate the disconnection set approach's parallel behavior and performance by means of extensive experimentation. It is shown that the parallel implementation of the disconnection set approach yields very good performance characteristics, and that (super)linear speedup w.r.t. a special implementation of semi-naive is achieved for regular, so-called linear fragmenta..
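The abstract names semi-naive evaluation as the sequential baseline for the speedup figures. As a rough illustration only, here is a minimal in-memory sketch of semi-naive transitive closure; it is not the paper's PRISMA/DB implementation, and the function name and edge-set representation are assumptions made for this example.

```python
def seminaive_transitive_closure(edges):
    """Return every pair (x, y) such that y is reachable from x.

    `edges` is an iterable of (source, target) pairs. This is a generic
    semi-naive sketch, not the implementation evaluated in the paper.
    """
    edges = set(edges)

    # Index the base relation by source node so that each newly found
    # path can be extended by one edge without scanning all edges.
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, set()).add(dst)

    closure = set(edges)
    delta = set(edges)  # paths discovered in the previous round only
    while delta:
        new_paths = set()
        for x, y in delta:
            for z in successors.get(y, ()):
                if (x, z) not in closure:
                    new_paths.add((x, z))
        closure |= new_paths
        delta = new_paths  # only genuinely new paths are joined next round
    return closure
```

The point of the semi-naive scheme is that each round joins only the paths found in the previous round with the base relation, rather than recomputing the join over the whole closure.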

CiteSeerX

University of Twente Research Information

Parallel evaluation of multi-join queries

Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/1995

Crossref

Fractals for Secondary Key Retrieval

Author: Faloutsos Christos
Roseman Shari
Publication date: 15/10/1998

In this paper we propose the use of fractals, and especially the Hilbert curve, in order to design good distance-preserving mappings. Such mappings improve the performance of secondary-key and spatial access methods, where multi-dimensional points have to be stored on a 1-dimensional medium (e.g., disk). Good clustering reduces the number of disk accesses on retrieval, improving the response time. Our experiments on range queries and nearest neighbor queries showed that the proposed Hilbert curve achieves better clustering than older methods ("bit-shuffling", or Peano curve), for every situation we tried. (Also cross-referenced as UMIACS-TR-89-47)
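To make the idea of a distance-preserving mapping concrete, the sketch below maps a 2-D grid cell to its position along a Hilbert curve using the widely known iterative bit-manipulation formulation. It is an illustration under assumptions (a square grid of side `2**order`, and the function name `hilbert_index`), not code from the report.

```python
def hilbert_index(order, x, y):
    """Map cell (x, y) of a 2**order by 2**order grid to its Hilbert-curve position.

    Standard iterative formulation; the name and signature are chosen
    for this example only.
    """
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        # Each quadrant contributes s*s positions; the quadrant number
        # along the curve is (3 * rx) XOR ry.
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the sub-curve has standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


if __name__ == "__main__":
    # Print the curve positions of a 4 x 4 grid, one row per y value.
    for y in range(4):
        print([hilbert_index(2, x, y) for x in range(4)])
```

Cells with consecutive curve positions are always grid neighbours, which is why sorting records by this index tends to keep spatially close points in nearby disk blocks.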

Digital Repository at the University of Maryland

Spinning Relations: High-Speed Networks for Distributed Join Processing

Author: Frey P.W. (Philip)
Kersten M.L. (Martin)
Pereira Goncalves R.A. (Romulo Antonio)
Teubner J. (Jens)
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2009

By leveraging modern networking hardware (RDMA-enabled network cards), we can shift priorities in distributed database processing significantly. Complex and sophisticated mechanisms to avoid network traffic can be replaced by a scheme that takes advantag

CWI's Institutional Repository

Combining Theory and Practice in Integrity Control: A Declarative Approach to the Specification of a Transaction Modification Subsystem

Author: Grefen Paul W.P.J.
Publication venue: Morgan Kaufmann
Publication date: 01/01/1993

Integrity control is generally considered an important topic in the field of database system research. In the database literature, many proposals for integrity control mechanisms can be found. A large group of proposals has a formal character, and does not cover complete algorithms that can be used in a real-world database system with multi-update transactions. Another group of proposals is system-oriented and often lacks a complete formal background on transactions and integrity control; algorithms are usually described in system terms. This paper combines the essentials of both groups: it presents a declarative specification of a transaction-based integrity control technique that has a solid formal basis and can easily be applied in real-world database systems. The technique, called transaction modification, features simple semantics, full transaction support, and extensibility to parallel data processing. These claims are supported by a prototype implementation of a transaction modification subsystem in the high-performance PRISMA/DB database system. This paper shows that it is well possible for an integrity control technique to combine a formal approach with complete functionality and high performance.

University of Twente Research Information

Optimization of Queries with Conjunction of Predicates

Author: Tudor Nicoleta Liviana
Publication venue: Agora University Press
Publication date: 01/09/2007

One way to optimize access to the objects of a relational database is to optimize the queries. This article presents an approach to the cost model used in the optimization of Select-Project-Join (SPJ) queries with conjunction of predicates and proposes a join optimization algorithm named System RO-H (System Rank Ordering Heuristic). The System RO-H algorithm for optimizing SPJ queries with conjunction of predicates is a System R Dynamic Programming algorithm that extends optimal linear join subplans using a rank-ordering heuristic method as follows: choosing a predicate in ascending order according to the h-metric, where the h-metric depends on the selectivity and the cost per tuple of the predicate, using an expression with heuristic constants. The System Rank-Ordering Heuristic algorithm finds an optimal plan in the space of linear left-deep join trees. The System RO-H algorithm saves not a single plan, but multiple optimal plans for every subset, one for each distinct such order, termed interesting order. In order to build an optimal execution plan for a set S of i relations, the optimal plan for each subset of S consisting of i-1 relations is extended, using a lemma based on an h-metric for predicates. Optimal plans for subsets are stored and reused. The optimization algorithm chooses a plan of least cost from the execution space.
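The abstract does not give the h-metric expression or its heuristic constants, so the sketch below substitutes the classic rank metric from the predicate-ordering literature, rank = (selectivity − 1) / cost per tuple, purely as a stand-in. The function names, the tuple layout `(name, selectivity, cost_per_tuple)`, and the sample numbers are all assumptions made for illustration; this is not the System RO-H algorithm itself.

```python
def rank(selectivity, cost_per_tuple):
    """Classic rank metric, used here as a stand-in for the article's h-metric."""
    return (selectivity - 1.0) / cost_per_tuple


def order_predicates(predicates):
    """Sort (name, selectivity, cost_per_tuple) triples by ascending rank."""
    return sorted(predicates, key=lambda p: rank(p[1], p[2]))


def expected_cost_per_tuple(ordered):
    """Expected evaluation cost of one input tuple for a given predicate order.

    A predicate is only evaluated on tuples that passed all earlier ones,
    so its cost is weighted by the product of the earlier selectivities.
    """
    total, surviving = 0.0, 1.0
    for _name, selectivity, cost in ordered:
        total += surviving * cost
        surviving *= selectivity
    return total


if __name__ == "__main__":
    # Hypothetical predicates: (name, selectivity, cost per tuple).
    preds = [("p1", 0.9, 1.0), ("p2", 0.1, 5.0), ("p3", 0.5, 2.0)]
    plan = order_predicates(preds)
    print([name for name, _, _ in plan], expected_cost_per_tuple(plan))
```

Under this stand-in metric, cheap and highly selective predicates are applied first, which minimizes the expected per-tuple cost for independent predicates.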

Agora University Editing House: Journals

Scalability analysis of declustering methods for multidimensional range queries

Author: Bongki Moon
Joel H. Saltz
Publication date: 01/01/1998

Efficient storage and retrieval of multiattribute data sets has become one of the essential requirements for many data-intensive applications. The Cartesian product file has been known as an effective multiattribute file structure for partial-match and best-match queries. Several heuristic methods have been developed to decluster Cartesian product files across multiple disks to obtain high performance for disk accesses. Although the scalability of the declustering methods becomes increasingly important for systems equipped with a large number of disks, no analytic studies have been done so far. In this paper, we derive formulas describing the scalability of two popular declustering methods (Disk Modulo and Fieldwise Xor) for range queries, which are the most common type of queries. These formulas disclose the limited scalability of the declustering methods, and this is corroborated by extensive simulation experiments. From the practical point of view, the formulas given in this paper provide a simple measure that can be used to predict the response time of a given range query and to guide the selection of a declustering method under various conditions.
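The two declustering methods named in the abstract can be sketched from their usual definitions: Disk Modulo places a bucket on disk (sum of its coordinates) mod M, and Fieldwise Xor on disk (bitwise XOR of its coordinates) mod M. The code below is a small simulation under those definitions, not the paper's analytic formulas; the function names and the inclusive `(low, high)` range encoding are assumptions for this example, and response time is taken as the largest number of buckets any single disk must serve.

```python
from itertools import product


def disk_modulo(cell, num_disks):
    """Disk Modulo: sum of the bucket coordinates, modulo the number of disks."""
    return sum(cell) % num_disks


def fieldwise_xor(cell, num_disks):
    """Fieldwise Xor: bitwise XOR of the bucket coordinates, modulo the number of disks."""
    acc = 0
    for coordinate in cell:
        acc ^= coordinate
    return acc % num_disks


def response_time(assign, ranges, num_disks):
    """Largest per-disk bucket count for a range query.

    `ranges` holds one inclusive (low, high) pair per dimension; `assign`
    is one of the declustering functions above.
    """
    load = [0] * num_disks
    for cell in product(*(range(lo, hi + 1) for lo, hi in ranges)):
        load[assign(cell, num_disks)] += 1
    return max(load)


if __name__ == "__main__":
    query = [(0, 1), (0, 1)]  # a 2 x 2 range query
    for method in (disk_modulo, fieldwise_xor):
        print(method.__name__, response_time(method, query, num_disks=4))
```

For the 2 x 2 query on four disks, both methods put two buckets on one disk, whereas an ideal assignment would need only one bucket per disk; small cases like this hint at the limited scalability the abstract describes.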

CiteSeerX