Search CORE

12 research outputs found

Towards Loosely-Coupled Programming on Petascale Systems

Author: Beckman Pete
Clifford Ben
Foster Ian
Iskra Kamil
Raicu Ioan
Wilde Mike
Zhang Zhao
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 27/08/2008
Field of study

We have extended the Falkon lightweight task execution framework to make loosely coupled programming on petascale systems a practical and useful programming model. This work studies and measures the performance factors involved in applying this approach to enable the use of petascale systems by a broader user community, and with greater ease. Our work enables the execution of highly parallel computations composed of loosely coupled serial jobs with no modifications to the respective applications. This approach allows a new-and potentially far larger-class of applications to leverage petascale systems, such as the IBM Blue Gene/P supercomputer. We present the challenges of I/O performance encountered in making this model practical, and show results using both microbenchmarks and real applications from two domains: economic energy modeling and molecular dynamics. Our benchmarks show that we can scale up to 160K processor-cores with high efficiency, and can achieve sustained execution rates of thousands of tasks per second.Comment: IEEE/ACM International Conference for High Performance Computing, Networking, Storage and Analysis (SuperComputing/SC) 200

arXiv.org e-Print Archive

Tachyon: Reliable, Memory Speed Storage for Cluster Computing Frameworks

Author: Ananthanarayanan G.
Ananthanarayanan G.
Baker J.
Bent J.
Dean J.
Elnozahy E.
Gunda P. K.
Guo P. J.
Nightingale E. B.
Plank J. S.
Power R.
Weil S. A.
Yu Y.
Zaharia M.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2014
Field of study

Tachyon is a distributed file system enabling reliable data sharing at memory speed across cluster computing frameworks. While caching today improves read workloads, writes are either network or disk bound, as replication is used for fault-tolerance. Tachyon eliminates this bottleneck by pushing lineage, a well-known technique, into the storage layer. The key challenge in making a long-running lineage-based storage system is timely data recovery in case of failures. Tachyon addresses this issue by introducing a checkpointing algorithm that guarantees bounded recovery cost and resource allocation strategies for recomputation under commonly used resource schedulers. Our evaluation shows that Tachyon outperforms in-memory HDFS by 110x for writes. It also improves the end-to-end latency of a realistic workflow by 4x. Tachyon is open source and is deployed at multiple companies.National Science Foundation (U.S.) (CISE Expeditions Award CCF-1139158)Lawrence Berkeley National Laboratory (Award 7076018)United States. Defense Advanced Research Projects Agency (XData Award FA8750-12-2-0331

CiteSeerX

Comprehensive and Practical Policy Compliance in Data Retrieval Systems

Author: Elnikety Eslam
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 01/01/2019
Field of study

Data retrieval systems such as online search engines and online social networks process many data items coming from different sources, each subject to its own data use policy. Ensuring compliance with these policies in a large and fast-evolving system presents a significant technical challenge since bugs, misconfigurations, or operator errors can cause (accidental) policy violations. To prevent such violations, researchers and practitioners develop policy compliance systems. Existing policy compliance systems, however, are either not comprehensive or not practical. To be comprehensive, a compliance system must be able to enforce users' policies regarding their personal privacy preferences, the service provider's own policies regarding data use such as auditing and personalization, and regulatory policies such as data retention and censorship. To be practical, a compliance system needs to meet stringent requirements: (1) runtime overhead must be low; (2) existing applications must run with few modifications; and (3) bugs, misconfigurations, or actions by unprivileged operators must not cause policy violations. In this thesis, we present the design and implementation of two comprehensive and practical compliance systems: Thoth and Shai. Thoth relies on pure runtime monitoring: it tracks data flows by intercepting processes' I/O, and then it checks the associated policies to allow only policy-compliant flows at runtime. Shai, on the other hand, combines offline analysis and light-weight runtime monitoring: it pushes as many policy checks as possible to an offline (flow) analysis by predicting the policies that data-handling processes will be subject to at runtime, and then it compiles those policies into a set of fine-grained I/O capabilities that can be enforced directly by the underlying operating system