On the expressiveness and trade-offs of large scale tuple stores

A. Fox; A. Java; B. Krishnamurthy; C. Olston; E. Meijer; E.A. Brewer; G. DeCandia; M. Zhong; M.P. Herlihy; P. Nadkarni; P.A. Bernstein; R. Pike; R. Vilaça; S. Ghemawat

Publication date: 1 January 2010
Publisher: Springer Science and Business Media LLC

Abstract

Proceedings of On the Move to Meaningful Internet Systems (OTM)

Massive-scale distributed computing is a challenge at our doorstep. The current exponential growth of data calls for massive-scale storage and processing capabilities. Several major Internet players have acknowledged this by embracing the cloud computing model and offering first-generation distributed tuple stores. Having all started from similar requirements, these systems ended up providing a similar service: a simple tuple store interface that allows applications to insert, query, and remove individual elements. Furthermore, while availability is commonly assumed to be sustained by the massive scale itself, data consistency and freshness are usually severely hindered. As a result, these services focus on a specific, narrow trade-off between consistency, availability, performance, scale, and migration cost, one that is much less attractive to common business needs. In this paper we introduce DataDroplets, a novel tuple store that shifts the current trade-off towards the needs of common business users, providing additional consistency guarantees and higher-level data processing primitives that smooth the migration path for existing applications. We present a detailed comparison between DataDroplets and existing systems regarding their data model, architecture, and trade-offs. Preliminary results of the system's performance under a realistic workload are also presented.