
    A Framework for Developing Real-Time OLAP algorithm using Multi-core processing and GPU: Heterogeneous Computing

    The rapidly growing volume of stored data has spurred researchers to seek ways of exploiting it, but most approaches run into response-time problems because of the sheer size of the data. Most proposed solutions favour materialization. However, materialization alone cannot deliver Real-Time answers. In this paper we propose a framework that illustrates the barriers to, and suggested solutions for, achieving Real-Time OLAP answers, which are widely used in decision support systems and data warehouses.

    Distributed Database Design: A Case Study

    Data Allocation is an important problem in Distributed Database Design. Generally, evolutionary algorithms are used to determine the assignment of fragments to sites. Data Allocation Algorithms should handle replication, query frequencies, quality of service (QoS), site capacities, table update costs, and selection and projection costs. Most of the algorithms in the literature address only one or a few components of the problem. In this paper, we present a case study that considers all of these features. The proposed model uses Integer Linear Programming to formulate the problem. (C) 2014 The Authors. Published by Elsevier B.V.
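To make the allocation problem in the abstract above concrete, the following is a minimal sketch, not the paper's Integer Linear Programming model. It assumes a non-replicated setting, a single uniform remote-access cost, and only two of the listed features (query frequencies and site capacities); the function name `best_allocation` and all parameters are hypothetical. It enumerates every assignment exhaustively, which is only practical for tiny instances.

```python
from itertools import product

def best_allocation(sizes, capacities, freq, remote_cost):
    """Exhaustively search fragment-to-site assignments (illustrative only).

    sizes[f]      -- storage size of fragment f
    capacities[s] -- storage capacity of site s
    freq[s][f]    -- how often site s accesses fragment f
    remote_cost   -- assumed cost of one remote access (local access is free)
    """
    n_frag, n_site = len(sizes), len(capacities)
    best_cost, best_assign = None, None
    # assign[f] is the site that stores fragment f (no replication assumed).
    for assign in product(range(n_site), repeat=n_frag):
        used = [0] * n_site
        for f, s in enumerate(assign):
            used[s] += sizes[f]
        # Skip assignments that overflow any site's capacity.
        if any(u > c for u, c in zip(used, capacities)):
            continue
        # Every access from a site that does not hold the fragment is remote.
        cost = sum(
            freq[s][f] * remote_cost
            for s in range(n_site)
            for f in range(n_frag)
            if assign[f] != s
        )
        if best_cost is None or cost < best_cost:
            best_cost, best_assign = cost, assign
    return best_assign, best_cost
```

With two unit-size fragments, two unit-capacity sites, and `freq = [[10, 1], [2, 8]]`, each fragment ends up at the site that accesses it most. An ILP solver would replace the enumeration for realistic instance sizes.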

    On hierarchical clustering-based approach for RDDBS design

    Distributed database system (DDBS) design is still an open challenge even after decades of research, especially in a dynamic network setting. To meet the demands of high-speed data gathering and of managing and preserving huge systems, it is important to construct a distributed database for real-time data storage. Fragmentation schemes such as horizontal, vertical, and hybrid are widely used for DDBS design. Data allocation cannot be done without first physically fragmenting the data, because the fragmentation process is the foundation of DDBS design. Extensive research has been conducted to develop effective solutions for DDBS design problems, but the great majority of it barely considers the RDDBS's initial design. Therefore, this work proposes a clustering-based horizontal fragmentation and allocation technique to handle both the early and late stages of DDBS design. To ensure that each operation flows into the next without any increase in complexity, fragmentation and allocation are done simultaneously. The main goals of this approach are to minimize communication expenses, response time, and irrelevant data access. Most importantly, it has been observed that the proposed approach may effectively improve RDDBS performance by simultaneously fragmenting and assigning various relations. Through simulations and experiments on synthetic and real databases, we demonstrate the viability of our strategy and how it considerably lowers communication costs for typical access patterns at both the early and late stages of design.

    Solving Large Scale Instances of the Distribution Design Problem Using Data Mining

    In this paper we address the solution of large instances of the distribution design problem. Traditional approaches do not consider that the instance size can significantly reduce the efficiency of the solution process. We propose a new approach that uses data mining techniques to compress the original instance into a new one. The goal of the transformation is to condense the operation access pattern of the original instance so as to reduce the amount of resources needed to solve it, without significantly reducing the quality of its solution. To validate the approach, we tested two instance compression methods on a new model of the replicated version of the distribution design problem that incorporates generalized database objects. The experimental results show that our approach reduces the computational resources needed for solving large instances by at least 65%, without significantly reducing the quality of the solution. Given the encouraging results, we are currently working on the design and implementation of efficient instance compression methods using other data mining techniques.

    A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing

    Data Grids have been adopted as the platform for scientific communities that need to share, access, transport, process and manage large data collections distributed worldwide. They combine high-end computing technologies with high-performance networking and wide-area storage management techniques. In this paper, we discuss the key concepts behind Data Grids and compare them with other data sharing and distribution paradigms such as content delivery networks, peer-to-peer networks and distributed databases. We then provide comprehensive taxonomies that cover various aspects of architecture, data transportation, data replication, and resource allocation and scheduling. Finally, we map the proposed taxonomy to various Data Grid systems, not only to validate the taxonomy but also to identify areas for future exploration. Through this taxonomy, we aim to categorise existing systems to better understand their goals and their methodology. This would help evaluate their applicability for solving similar problems. This taxonomy also provides a "gap analysis" of this area through which researchers can potentially identify new issues for investigation. Finally, we hope that the proposed taxonomy and mapping also help to provide an easy way for new practitioners to understand this complex area of research. Comment: 46 pages, 16 figures, Technical Report

    A threshold based dynamic data allocation algorithm - a Markov Chain model approach

    In this study, a new dynamic data allocation algorithm for non-replicated Distributed Database Systems (DDS), namely the threshold algorithm, is formulated and proposed. The threshold algorithm reallocates data in response to changing data access patterns. The proposed algorithm is distributed in the sense that each node autonomously decides whether or not to transfer the ownership of a fragment in the DDS to another node. The transfer decision depends on the past accesses of the fragment. Each fragment migrates away from a node when it has not been accessed locally within more than a certain number of past accesses, namely a threshold value. The threshold algorithm is modeled for a fragment of the database as a finite Markov chain with constant node access probabilities. In the model, a special case is analyzed in which all nodes have equal access probabilities except one, which has a different access probability. It is shown that for positive threshold values the fragment tends to remain at the node with the higher access probability. It is also shown that the greater the threshold value, the greater the tendency of the fragment to remain at the node with the higher access probability. The threshold algorithm is especially suitable for a DDS where the data access pattern changes dynamically.
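The migration rule described in the abstract above can be sketched as a small simulation. This is one plausible reading of the abstract, not the authors' implementation: it assumes the owner keeps a counter of consecutive remote accesses, resets it on a local access, and hands ownership to the accessing node once the counter exceeds the threshold. The function name `simulate_threshold`, its parameters, and the choice of the initial owner are hypothetical.

```python
import random

def simulate_threshold(access_probs, threshold, steps, seed=0):
    """Simulate one fragment under an assumed threshold migration rule.

    access_probs -- constant probability that each node issues the next access
    threshold    -- number of consecutive remote accesses tolerated by the owner
    steps        -- number of simulated accesses
    Returns the fraction of steps the fragment spent at each node.
    """
    rng = random.Random(seed)
    nodes = list(range(len(access_probs)))
    owner = 0            # assumption: the fragment starts at node 0
    remote_streak = 0    # consecutive accesses not made by the current owner
    time_at = [0] * len(nodes)
    for _ in range(steps):
        accessor = rng.choices(nodes, weights=access_probs)[0]
        if accessor == owner:
            remote_streak = 0          # a local access resets the counter
        else:
            remote_streak += 1
            if remote_streak > threshold:
                owner = accessor       # ownership moves to the remote accessor
                remote_streak = 0
        time_at[owner] += 1
    return [t / steps for t in time_at]
```

Running the sketch with one node that has a higher access probability than the others, for example `simulate_threshold([0.2, 0.2, 0.6], threshold=3, steps=20000)`, lets the reader check the abstract's qualitative claim that the fragment tends to stay at that node.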

    A Methodology for Vertically Partitioning in a Multi-Relation Database Environment

    Vertical partitioning, in which attributes of a relation are assigned to partitions, is aimed at improving database performance. We extend previous research based on a single relation to a multi-relation database environment by including referential integrity constraints, an access-time-based heuristic, and a comprehensive cost model that considers most transaction types, including updates and joins. The algorithm was applied to a real-world insurance CLAIMS database. Simulation experiments were conducted, and the results show a performance improvement of 36% to 65% over the unpartitioned case. Application of our method to small databases resulted in partitioning schemes that are comparable to optimal. Facultad de Informática
