Search CORE

1,505 research outputs found

The Family of MapReduce and Large Scale Data Processing Systems

Author: Anna Liu
Ayman G. Fayoumi
King Abdulaziz
See Profile
Sherif Sakr
Sherif Sakr
South Wales
South Wales
Publication venue
Publication date: 12/02/2013
Field of study

In the last two decades, the continuous increase of computational power has produced an overwhelming flow of data which has called for a paradigm shift in the computing architecture and large scale data processing mechanisms. MapReduce is a simple and powerful programming model that enables easy development of scalable parallel applications to process vast amounts of data on large clusters of commodity machines. It isolates the application from the details of running a distributed program such as issues on data distribution, scheduling and fault tolerance. However, the original implementation of the MapReduce framework had some limitations that have been tackled by many research efforts in several followup works after its introduction. This article provides a comprehensive survey for a family of approaches and mechanisms of large scale data processing mechanisms that have been implemented based on the original idea of the MapReduce framework and are currently gaining a lot of momentum in both research and industrial communities. We also cover a set of introduced systems that have been implemented to provide declarative programming interfaces on top of the MapReduce framework. In addition, we review several large scale data processing systems that resemble some of the ideas of the MapReduce framework for different purposes and application scenarios. Finally, we discuss some of the future research directions for implementing the next generation of MapReduce-like solutions.Comment: arXiv admin note: text overlap with arXiv:1105.4252 by other author

arXiv.org e-Print Archive

CiteSeerX

A database management capability for Ada

Author: Chan Arvola
Danberg SY
Fox Stephen
Landers Terry
Nori Anil
Smith John M.
Publication venue
Publication date
Field of study

The data requirements of mission critical defense systems have been increasing dramatically. Command and control, intelligence, logistics, and even weapons systems are being required to integrate, process, and share ever increasing volumes of information. To meet this need, systems are now being specified that incorporate data base management subsystems for handling storage and retrieval of information. It is expected that a large number of the next generation of mission critical systems will contain embedded data base management systems. Since the use of Ada has been mandated for most of these systems, it is important to address the issues of providing data base management capabilities that can be closely coupled with Ada. A comprehensive distributed data base management project has been investigated. The key deliverables of this project are three closely related prototype systems implemented in Ada. These three systems are discussed

NASA Technical Reports Server

Deductive Optimization of Relational Data Storage

Author: Feser John K.
Madden Samuel
Solar-Lezama Armando
Tang Nan
Publication venue
Publication date: 05/02/2020
Field of study

Optimizing the physical data storage and retrieval of data are two key database management problems. In this paper, we propose a language that can express a wide range of physical database layouts, going well beyond the row- and column-based methods that are widely used in database management systems. We use deductive synthesis to turn a high-level relational representation of a database query into a highly optimized low-level implementation which operates on a specialized layout of the dataset. We build a compiler for this language and conduct experiments using a popular database benchmark, which shows that the performance of these specialized queries is competitive with a state-of-the-art in memory compiled database system

arXiv.org e-Print Archive

DSpace@MIT

Flattening an object algebra to provide performance

Author: Boncz P.
Kersten M.L.
Wilschut A.N.
Publication venue: IEEE Computer Society
Publication date: 01/01/1998
Field of study

Algebraic transformation and optimization techniques have been the method of choice in relational query execution, but applying them in object-oriented (OO) DBMSs is difficult due to the complexity of OO query languages. This paper demonstrates that the problem can be simplified by mapping an OO data model to the binary relational model implemented by Monet, a state-of-the-art database kernel. We present a generic mapping scheme to flatten data models and study the case of straightforward OO model. We show how flattening enabled us to implement a query algebra, using only a very limited set of simple operations. The required primitives and query execution strategies are discussed, and their performance is evaluated on the 1-GByte TPC-D (Transaction-processing Performance Council's Benchmark D), showing that our divide-and-conquer approach yields excellent result

CiteSeerX

CWI's Institutional Repository

University of Twente Research Information

International Migration, Integration and Social Cohesion online publications