PlinyCompute: A Platform for High-Performance, Distributed,
  Data-Intensive Tool Development

Barnett, R. Matthew; Jermaine, Chris; Lorido-Botran, Tania; Luo, Shangyu; Monroy, Carlos; Sikdar, Sourav; Teymourian, Kia; Yuan, Binhang; Zou, Jia



Authors: R. Matthew Barnett
Chris Jermaine
Tania Lorido-Botran
Shangyu Luo
Carlos Monroy
Sourav Sikdar
Kia Teymourian
Binhang Yuan
Jia Zou
Publication date: 1 January 2017

Abstract

This paper describes PlinyCompute, a system for developing high-performance, data-intensive, distributed computing tools and libraries. In the large, PlinyCompute presents the programmer with a very high-level, declarative interface and relies on automatic, relational-database-style optimization to decide how to stage distributed computations. In the small, however, PlinyCompute presents the capable systems programmer with a persistent object data model and API (the "PC object model") and an associated memory management system, both designed from the ground up for high-performance, distributed, data-intensive computing. This contrasts with most other Big Data systems, which are built on top of the Java Virtual Machine (JVM) and hence must at least partially cede performance-critical concerns, such as memory management (including layout, allocation, and deallocation) and virtual method/function dispatch, to the JVM. This hybrid approach (declarative in the large, trusting the programmer to use the PC object model efficiently in the small) results in a system well suited to developing reusable, data-intensive tools and libraries. Through extensive benchmarking, we show that implementing complex object manipulation and non-trivial, library-style computations on top of PlinyCompute can yield speedups of 2x to more than 50x over equivalent implementations on Spark.

Comment: 48 pages, including references and Appendi

Available Versions

Boston University Institutional Repository (OpenBU)

oai:open.bu.edu:2144/29261

Last time updated on 09/07/2019