
MOON: MapReduce On Opportunistic eNvironments

Author: Archuleta Jeremy
Feng Wu-chun
Gardner Mark
Lin Heshan
Ma Xiaosong
Zhang Zhe
Publication date: 01/01/2009

Abstract: MapReduce offers a flexible programming model for processing and generating large data sets on dedicated resources, where only a small fraction of such resources are ever unavailable at any given time. In contrast, when MapReduce is run on volunteer computing systems, which opportunistically harness idle desktop computers via frameworks like Condor, it performs poorly because of the volatility of the resources, in particular the high rate of node unavailability. Specifically, the data and task replication scheme adopted by existing MapReduce implementations is woefully inadequate for resources with high unavailability. To address this, we propose MOON, short for MapReduce On Opportunistic eNvironments. MOON extends Hadoop, an open-source implementation of MapReduce, with adaptive task and data scheduling algorithms in order to offer reliable MapReduce services on a hybrid resource architecture, where volunteer computing systems are supplemented by a small set of dedicated nodes. The adaptive task and data scheduling algorithms in MOON distinguish between (1) different types of MapReduce data and (2) different types of node outages in order to strategically place tasks and data on both volatile and dedicated nodes. Our tests demonstrate that MOON can deliver a 3-fold performance improvement to Hadoop in volatile, volunteer computing environments.

Computer Science Technical Reports @Virginia Tech

Experimental Performance Evaluation of Cloud-Based Analytics-as-a-Service

Author: Carra Damiano
Michiardi Pietro
Milanesio Marco
Pace Francesco
Venzano Daniele
Publication date: 01/01/2016

An increasing number of Analytics-as-a-Service solutions have recently emerged in the landscape of cloud-based services. These services allow flexible composition of compute and storage components that create powerful data ingestion and processing pipelines. This work is a first attempt at an experimental evaluation of the performance of analytic applications executed using a wide range of storage service configurations. We present an intuitive notion of data locality, which we use as a proxy to rank different service compositions in terms of expected performance. Through an empirical analysis, we dissect the performance achieved by analytic workloads and unveil problems due to the impedance mismatch that arises in some configurations. Our work paves the way to a better understanding of modern cloud-based analytic services and their performance, both for their end-users and their providers. Comment: Longer version of the paper in Submission at IEEE CLOUD'1

arXiv.org e-Print Archive

Crossref

Catalogo dei prodotti della ricerca

Scipedia