Reducing data movement costs using energy-efficient, active computation on SSD
ABSTRACT Modern scientific discovery often involves running complex application simulations on supercomputers, followed by a sequence of data analysis tasks on smaller clusters. This offline approach suffers from significant data movement costs, such as redundant I/O, storage bandwidth bottlenecks, and wasted CPU cycles, all of which contribute to increased energy consumption and delayed end-to-end performance. Technology projections for an exascale machine indicate that energy efficiency will become the primary design metric. It is estimated that the energy cost of data movement will soon rival the cost of computation. Consequently, we can no longer ignore data movement costs in data analysis. To address these challenges, we advocate executing data analysis tasks on emerging storage devices, such as SSDs. In extreme-scale systems, SSDs typically serve only as temporary storage for simulation output data. Our approach, Active Flash, instead conducts in-situ data analysis on the SSD controller without degrading the performance of the simulation job; migrating analysis tasks closer to where the data resides reduces the data movement cost. We present detailed energy and performance models for both the Active Flash and offline strategies, and study them using extreme-scale application simulations, commonly used data analytics kernels, and supercomputer system configurations. Our evaluation suggests that Active Flash is a promising approach to alleviating the storage bandwidth bottleneck, reducing the data movement cost, and improving overall energy efficiency.
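The abstract contrasts the energy of the offline strategy (ship data to an analysis cluster) with the Active Flash strategy (analyze in place on the SSD controller). A minimal back-of-the-envelope sketch of that comparison in Python, assuming purely illustrative power, bandwidth, and data-size figures (all numbers below are our assumptions, not the paper's measured model parameters):

```python
# Toy energy comparison of offline vs. on-SSD analysis.
# All parameter values are illustrative assumptions, not the paper's model.

def offline_energy_j(data_gb, host_cpu_w, link_gbps, io_w):
    """Energy to ship data to a separate analysis cluster and process it there:
    one write plus one read over the storage path, then host-CPU analysis."""
    transfer_s = 2 * data_gb / link_gbps       # write once, read back once
    analysis_s = data_gb / 1.0                 # assume 1 GB/s host analysis rate
    return transfer_s * io_w + analysis_s * host_cpu_w

def active_flash_energy_j(data_gb, ssd_ctrl_w, ssd_gbps):
    """Energy to analyze in place on the SSD controller: no extra round trip,
    but a slower, far lower-power embedded core."""
    analysis_s = data_gb / ssd_gbps
    return analysis_s * ssd_ctrl_w

if __name__ == "__main__":
    data = 100.0  # GB of simulation output (hypothetical)
    offline = offline_energy_j(data, host_cpu_w=95.0, link_gbps=2.0, io_w=30.0)
    active = active_flash_energy_j(data, ssd_ctrl_w=3.0, ssd_gbps=0.4)
    print(f"offline: {offline/1e3:.1f} kJ   active flash: {active/1e3:.1f} kJ")
```

Even with generous figures for the host path, the gap between a ~95 W host CPU plus storage traffic and a ~3 W embedded controller is what makes the in-situ strategy attractive despite its much slower compute rate.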
Timely offloading of result-data in HPC centers
High performance computing is facing exponential growth in job output dataset sizes. This implies a significant commitment of supercomputing center resources, most notably precious scratch space, to handling data staging and offloading. However, the scratch area is typically managed using simple “purge policies”, without the sophisticated “end-user data services” required to balance the center’s resource consumption against user serviceability. End-user data services such as offloading are performed using point-to-point transfers that cannot reconcile the center’s purge deadlines with users’ delivery deadlines, cannot adapt to changing dynamics in the end-to-end data path, and are not fault-tolerant. We propose a robust framework for the timely, decentralized offload of result data, addressing these significant gaps in extant direct-transfer-based offloading. Decentralized offload is achieved using an overlay of user-specified intermediate nodes and well-known landmark nodes. These nodes provide multiple data-flow paths, thereby maximizing bandwidth, as well as fail-over capabilities for the offload. We have implemented our techniques within a production job scheduler (PBS) and data transfer tool (BitTorrent), and our evaluation shows that offloading times can be significantly reduced (by 90.2% for a 2.1 GB file), while also meeting center-user deadlines.
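The overlay-based offload described above amounts to choosing, among several intermediate-node paths, one that is both alive and fast enough to beat the scratch purge deadline, failing over when a path goes down. A toy Python sketch of that path-selection step, assuming hypothetical node names, bandwidths, and deadline (the actual framework integrates with PBS and BitTorrent and performs the transfers themselves, not just the selection):

```python
# Toy sketch of deadline-aware path selection over an overlay of
# intermediate/landmark nodes. Node names, bandwidths, and the deadline
# are hypothetical illustrations.

from dataclasses import dataclass, field

@dataclass
class Path:
    hops: list            # intermediate/landmark nodes between center and user
    bandwidth_mbps: float # end-to-end bottleneck bandwidth of this path
    alive: bool = True    # current reachability (fail-over excludes dead paths)

def pick_path(paths, file_mb, deadline_s):
    """Choose the fastest live path that finishes before the purge deadline;
    return None if no path can (caller then falls back to a direct transfer
    or renegotiates the deadline with the center)."""
    feasible = [p for p in paths
                if p.alive and file_mb * 8 / p.bandwidth_mbps <= deadline_s]
    return max(feasible, key=lambda p: p.bandwidth_mbps, default=None)

paths = [
    Path(["landmark-a", "intermediate-campus"], bandwidth_mbps=400.0),
    Path(["landmark-b"], bandwidth_mbps=800.0, alive=False),  # failed hop
    Path(["intermediate-lab"], bandwidth_mbps=250.0),
]
chosen = pick_path(paths, file_mb=2100.0, deadline_s=120.0)
print("offload via:", chosen.hops if chosen else "direct-transfer fallback")
```

In this sketch the 2.1 GB file routes through the 400 Mbps path (about 42 s, within the 120 s deadline) because the faster path is down; real multi-path offload would additionally stripe the file across all feasible paths to aggregate bandwidth.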