Search CORE

2 research outputs found

I/O-Efficient Data-Intensive Computing /

Author: Rasmussen Alexander Carlin
Publication venue: eScholarship, University of California
Publication date: 01/01/2013
Field of study

There is a growing need for scalable, data-intensive processing platforms to analyze and filter large volumes of data. The effectiveness of these systems is measured by the quantity and quality of data that they can process in a reasonable amount of time; thus, these systems have very high I/O and storage requirements. Existing systems are very effective at scaling to large cluster sizes. Unfortunately, there exists a significant gap between the performance these systems provide and the underlying capacity of the hardware infrastructure on which they are deployed. In this dissertation, we endeavor to bridge this performance gap by focusing on efficient I/O as a first- class architectural concern. In particular, we present two systems, TritonSort and Themis. TritonSort is a high- performance large-scale sorting system capable of sorting 100TB of data on a modestly-sized cluster at about 82% of that cluster's peak hardware performance. Themis is a successor system to TritonSort that supports the popular MapReduce programming paradigm and can run a wide spectrum of MapReduce jobs at nearly the speed at which TritonSort can sort. We conclude with the implementation of a fault tolerance scheme for Themis that provides proportional fault tolerance without imposing additional rounds of I/O during common-case operatio

Ezid

eScholarship - University of California

Recommended from our members

Dissertation: I/O-Efficient Data-Intensive Computing

Author: Rasmussen Alex C.
Publication venue: eScholarship, University of California
Publication date: 01/01/2013
Field of study

eScholarship - University of California