A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing

Authors: Buyya Rajkumar; Ramamohanarao Kotagiri; Venugopal Srikumar
Publication date: 10/06/2005

Data Grids have been adopted as the platform for scientific communities that need to share, access, transport, process and manage large data collections distributed worldwide. They combine high-end computing technologies with high-performance networking and wide-area storage management techniques. In this paper, we discuss the key concepts behind Data Grids and compare them with other data sharing and distribution paradigms such as content delivery networks, peer-to-peer networks and distributed databases. We then provide comprehensive taxonomies that cover various aspects of architecture, data transportation, data replication and resource allocation and scheduling. Finally, we map the proposed taxonomy to various Data Grid systems not only to validate the taxonomy but also to identify areas for future exploration. Through this taxonomy, we aim to categorise existing systems to better understand their goals and their methodology. This would help evaluate their applicability for solving similar problems. This taxonomy also provides a "gap analysis" of this area through which researchers can potentially identify new issues for investigation. Finally, we hope that the proposed taxonomy and mapping also help to provide an easy way for new practitioners to understand this complex area of research. Comment: 46 pages, 16 figures, Technical Repor

arXiv.org e-Print Archive

CiteSeerX

University of Melbourne Institutional Repository

Heterogeneous Relational Databases for a Grid-enabled Analysis Environment

Authors: Ali Arshad; Anjum Ashiq; Azim Tahir; Bunn Julian; Iqbal Saima; McClatchey Richard; Newman Harvey; Shah S. Yousaf; Solomonides Tony; Steenberg Conrad; Thomas Michael; van Lingen Frank; Willers Ian
Publication date: 01/01/2005

Grid-based systems require a database access mechanism that can provide seamless homogeneous access to the requested data through a virtual data access system, i.e. a system which can take care of tracking the data that is stored in geographically distributed heterogeneous databases. This system should provide an integrated view of the data that is stored in the different repositories by using a virtual data access mechanism, i.e. a mechanism which can hide the heterogeneity of the backend databases from the client applications. This paper focuses on accessing data stored in disparate relational databases through a web service interface, and exploits the features of a Data Warehouse and Data Marts. We present a middleware that enables applications to access data stored in geographically distributed relational databases without being aware of their physical locations and underlying schema. A web service interface is provided to enable applications to access this middleware in a language- and platform-independent way. A prototype implementation was created based on Clarens [4], Unity [7] and POOL [8]. This ability to access the data stored in the distributed relational databases transparently is likely to be a very powerful one for Grid users, especially the scientific community wishing to collate and analyze data distributed over the Grid.

arXiv.org e-Print Archive

Caltech Authors

A study of two transaction-processing architectures for distributed real-time database systems

Author: Ulusoy O.
Publication venue: Elsevier BV
Publication date: 01/11/1995

A real-time database system (RTDBS) is designed to provide timely response to the transactions of data-intensive applications. Processing a transaction in a distributed RTDBS environment presents the design choice of how to provide access to remote data referenced by the transaction. Satisfaction of the timing constraints of transactions should be the primary factor to be considered in scheduling accesses to remote data. In this article, we describe and analyze two alternative approaches to this fundamental design decision. With the first alternative, transaction operations are executed at the sites where required data pages reside. The other alternative is based on transmitting data pages wherever they are needed. Although the latter approach is characterized by large message volumes carrying data pages, it is shown in our experiments to perform better than the other approach under most of the workloads and system configurations tested. The performance metric used in the evaluations is the fraction of transactions that satisfy their timing constraints.

Bilkent University Institutional Repository

Protocols for Integrity Constraint Checking in Federated Databases

Authors: Grefen Paul; Widom Jennifer
Publication venue: Kluwer Academic Publishers
Publication date: 01/01/1996

A federated database is comprised of multiple interconnected database systems that primarily operate independently but cooperate to a certain extent. Global integrity constraints can be very useful in federated databases, but the lack of global queries, global transaction mechanisms, and global concurrency control renders traditional constraint management techniques inapplicable. This paper presents a threefold contribution to integrity constraint checking in federated databases: (1) The problem of constraint checking in a federated database environment is clearly formulated. (2) A family of protocols for constraint checking is presented. (3) The differences across protocols in the family are analyzed with respect to system requirements, properties guaranteed by the protocols, and processing and communication costs. Thus, our work yields a suite of options from which a protocol can be chosen to suit the system capabilities and integrity requirements of a particular federated database environment.

CiteSeerX

University of Twente Research Information

The Homeostasis Protocol: Avoiding Transaction Coordination Through Program Analysis

Authors: Bender Gabriel; Ding Bailu; Foster Nate; Gehrke Johannes; Hojjat Hossein; Koch Christoph; Kot Lucja; Roy Sudip
Publication date: 19/01/2015

Datastores today rely on distribution and replication to achieve improved performance and fault-tolerance. But correctness of many applications depends on strong consistency properties, something that can impose substantial overheads, since it requires coordinating the behavior of multiple nodes. This paper describes a new approach to achieving strong consistency in distributed systems while minimizing communication between nodes. The key insight is to allow the state of the system to be inconsistent during execution, as long as this inconsistency is bounded and does not affect transaction correctness. In contrast to previous work, our approach uses program analysis to extract semantic information about permissible levels of inconsistency and is fully automated. We then employ a novel homeostasis protocol to allow sites to operate independently, without communicating, as long as any inconsistency is governed by appropriate treaties between the nodes. We discuss mechanisms for optimizing treaties based on workload characteristics to minimize communication, as well as a prototype implementation and experiments that demonstrate the benefits of our approach on common transactional benchmarks.
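
The idea of a treaty in the abstract above can be illustrated with a toy example. The following is a minimal sketch, not the paper's protocol: it assumes a single replicated stock counter, a hand-written treaty that gives each site a fixed local allowance, and an even split on renegotiation. The names `Site`, `try_sell` and `renegotiate` are hypothetical, and the paper derives treaties automatically through program analysis rather than hard-coding them.

```python
class Site:
    """One replica that sells items from a shared stock."""

    def __init__(self, name, allowance):
        self.name = name
        # Local treaty (assumed form): units this site may sell
        # without talking to any other site.
        self.allowance = allowance
        self.sold = 0

    def try_sell(self, qty):
        """Sell locally if the treaty still holds afterwards."""
        if self.sold + qty <= self.allowance:
            self.sold += qty
            return True
        # Selling would break the treaty, so coordination is required.
        return False


def renegotiate(sites, stock):
    """Synchronize all sites and hand out fresh allowances.

    Returns the remaining global stock after accounting for local sales.
    """
    remaining = stock - sum(site.sold for site in sites)
    share, extra = divmod(remaining, len(sites))
    for index, site in enumerate(sites):
        site.sold = 0
        # Even split is an arbitrary choice made for this sketch.
        site.allowance = share + (1 if index < extra else 0)
    return remaining
```

In this sketch the sites communicate only when `try_sell` returns `False`; every other sale is decided from local state, which is the behaviour the abstract describes as operating independently under a treaty.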

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

CiteSeerX

Advanced information processing system: Inter-computer communication services

Authors: Alger Linda S.; Burkhardt Laura; Masotto Tom; Sims J. Terry; Whittredge Roy

The purpose is to document the functional requirements and detailed specifications for the Inter-Computer Communications Services (ICCS) of the Advanced Information Processing System (AIPS). An introductory section is provided to outline the overall architecture and functional requirements of the AIPS and to present an overview of the ICCS. An overview of the AIPS architecture as well as a brief description of the AIPS software is given. The guarantees of the ICCS are provided, and the ICCS is described as a seven-layered International Standards Organization (ISO) Model. The ICCS functional requirements, functional design, and detailed specifications as well as each layer of the ICCS are also described. A summary of results and suggestions for future work are presented.

NASA Technical Reports Server

A Comparison of Cache Performance in Server-Based and Symmetric Database Architectures

Authors: Korz Frederick; Leff Avraham; Pu Calton
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/1990

We study the cache performance in a symmetric distributed main-memory database. The high-performance networks in many large distributed systems enable a machine to reach the main memory of other nodes more quickly than the time to access local disks. We therefore introduce remote memory as an additional layer in the memory hierarchy between local memory and disks. In order to appreciate the tradeoffs of memory and CPU in the symmetric architecture, we compare system performance in alternative architectures. Simulations show that, by exploiting remote memory (in each node's cache), performance improves over a wide range of cache sizes as compared to a distributed client/server architecture. We also compare the symmetric model to a centralized-server model and parameterize the performance tradeoffs.

Columbia University Academic Commons

A Covert Channel Using Named Resources

Authors: Davis Joshua; Frost Victor S.
Publication date: 20/08/2014

A network covert channel is created that uses resource names such as addresses to convey information, and that approximates typical user behavior in order to blend in with its environment. The channel correlates available resource names with a user defined code-space, and transmits its covert message by selectively accessing resources associated with the message codes. In this paper we focus on an implementation of the channel using the Hypertext Transfer Protocol (HTTP) with Uniform Resource Locators (URLs) as the message names, though the system can be used in conjunction with a variety of protocols. The covert channel does not modify expected protocol structure as might be detected by simple inspection, and our HTTP implementation emulates transaction level web user behavior in order to avoid detection by statistical or behavioral analysis. Comment: 9 page
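
The mapping between resource names and a code space described in the abstract above can be sketched in a few lines. This is an illustrative assumption, not the authors' implementation: it uses a fixed-width binary code, placeholder `example.com` URLs, and hypothetical helpers `build_codebook`, `encode` and `decode`. It performs no network access and makes no attempt to emulate user behaviour.

```python
from typing import Dict, List


def build_codebook(resource_names: List[str], bits_per_symbol: int) -> Dict[str, str]:
    """Assign one resource name to every bit pattern of the given width."""
    needed = 2 ** bits_per_symbol
    if len(resource_names) < needed:
        raise ValueError("not enough resource names for the code space")
    return {format(i, f"0{bits_per_symbol}b"): resource_names[i] for i in range(needed)}


def encode(message: bytes, codebook: Dict[str, str]) -> List[str]:
    """Turn the message into the ordered list of resources to access."""
    width = len(next(iter(codebook)))
    bits = "".join(format(byte, "08b") for byte in message)
    bits += "0" * (-len(bits) % width)  # pad to a whole number of symbols
    return [codebook[bits[i:i + width]] for i in range(0, len(bits), width)]


def decode(accessed: List[str], codebook: Dict[str, str], message_length: int) -> bytes:
    """Recover the message from the observed sequence of accessed resources."""
    reverse = {name: code for code, name in codebook.items()}
    bits = "".join(reverse[name] for name in accessed)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, message_length * 8, 8))


# Placeholder URLs; a real channel would use names available in its environment.
urls = [f"http://example.com/page{i}" for i in range(4)]
book = build_codebook(urls, bits_per_symbol=2)
accesses = encode(b"hi", book)
recovered = decode(accesses, book, message_length=2)
```

Here the receiver needs the same codebook and the message length to reverse the mapping. How the two ends agree on these is outside the scope of this sketch.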

arXiv.org e-Print Archive

CiteSeerX