Automated, Reliable, and Efficient Continental-Scale Replication of 7.3 Petabytes of Climate Simulation Data: A Case Study

Authors: Ames Sasha; Chard Kyle; Dart Eli; Durack Paul; Foster Ian T.; Harr Cameron; Hoffman Forrest M.; Lacinski Lukasz; Liming Lee; Turoscy Steven
Publication date: 30/04/2024

We report on our experiences replicating 7.3 petabytes (PB) of Earth System Grid Federation (ESGF) climate simulation data from Lawrence Livermore National Laboratory (LLNL) in California to Argonne National Laboratory (ANL) in Illinois and Oak Ridge National Laboratory (ORNL) in Tennessee. This movement of some 29 million files, twice, undertaken in order to establish new ESGF nodes at ANL and ORNL, was performed largely automatically by a simple replication tool, a script that invoked Globus to transfer large bundles of files while tracking progress in a database. Under the covers, Globus organized transfers to make efficient use of the high-speed Energy Sciences Network (ESnet) and the data transfer nodes deployed at participating sites, and also addressed security, integrity checking, and recovery from a variety of transient failures. This success demonstrates the considerable benefits that can accrue from the adoption of performant data replication infrastructure.
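The abstract describes a script that transfers bundles of files while tracking progress in a database. The following is a minimal sketch of that general pattern only, not the authors' tool: the table layout, the `TransferError` exception, the `replicate` loop, and the `make_flaky_transfer` stand-in are all hypothetical, and no real Globus call is made. A production script would replace the stand-in with a call to an actual transfer service.

```python
import sqlite3


class TransferError(Exception):
    """Hypothetical exception standing in for a transient transfer failure."""


def init_db(conn, bundles):
    # One row per bundle; status is 'pending', 'done', or 'failed'.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS bundles ("
        "name TEXT PRIMARY KEY, "
        "status TEXT NOT NULL, "
        "attempts INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO bundles (name, status) VALUES (?, 'pending')",
        [(b,) for b in bundles],
    )
    conn.commit()


def replicate(conn, transfer_bundle, max_attempts=3):
    """Transfer every pending bundle, recording progress after each attempt."""
    while True:
        row = conn.execute(
            "SELECT name FROM bundles "
            "WHERE status = 'pending' AND attempts < ? "
            "ORDER BY name LIMIT 1",
            (max_attempts,),
        ).fetchone()
        if row is None:
            break
        name = row[0]
        try:
            transfer_bundle(name)  # assumed to raise TransferError on failure
            status = "done"
        except TransferError:
            status = "pending"  # leave it queued for another attempt
        conn.execute(
            "UPDATE bundles SET status = ?, attempts = attempts + 1 "
            "WHERE name = ?",
            (status, name),
        )
        conn.commit()
    # Anything still pending has exhausted its attempts.
    conn.execute("UPDATE bundles SET status = 'failed' WHERE status = 'pending'")
    conn.commit()


def make_flaky_transfer(fail_once):
    """Build a fake transfer function that fails once for the named bundles."""
    remaining = set(fail_once)

    def transfer_bundle(name):
        if name in remaining:
            remaining.discard(name)
            raise TransferError(f"transient failure for {name}")

    return transfer_bundle


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    init_db(conn, ["bundle-001", "bundle-002", "bundle-003"])
    replicate(conn, make_flaky_transfer({"bundle-002"}))
    for row in conn.execute("SELECT name, status, attempts FROM bundles ORDER BY name"):
        print(row)
```

Because progress is committed after every attempt, the loop can be interrupted and restarted against the same database file without repeating bundles already marked done.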

arXiv.org e-Print Archive

Homocysteinethiolactone and Paraoxonase: Novel markers of diabetic retinopathy

Authors: A. Pasupathi; Aydemir; C. Muralidharan; Clarke; Coral; Devasagayam; Gu; Harel; Izuta; Jakubowski; Joseph; Koracevic; Lacinski; Lembessis; M. Dhupper; M. Sivashanmugham; Mackness; Mendes; N. Angayarkanni; R. Pukraj; Reddy; S. Barathi; S. K. Natarajan; Scholl; Shiner; Sonoki; T. Velpandian; Thiersch
Publication venue: American Diabetes Association

Crossref

PubMed Central

Parsl

Authors: Babuji Yadu; Chard Kyle; Chard Ryan; Clifford Ben; Foster Ian; Katz Daniel S.; Kumar Rohan; Lacinski Lukasz; Li Zhuozhao; Wilde Michael; Woodard Anna; Wozniak Justin M.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 17/05/2019

High-level programming languages such as Python are increasingly used to provide intuitive interfaces to libraries written in lower-level languages and for assembling applications from various components. This migration towards orchestration rather than implementation, coupled with the growing need for parallel computing (e.g., due to big data and the end of Moore's law), necessitates rethinking how parallelism is expressed in programs. Here, we present Parsl, a parallel scripting library that augments Python with simple, scalable, and flexible constructs for encoding parallelism. These constructs allow Parsl to construct a dynamic dependency graph of components that it can then execute efficiently on one or many processors. Parsl is designed for scalability, with an extensible set of executors tailored to different use cases, such as low-latency, high-throughput, or extreme-scale execution. We show, via experiments on the Blue Waters supercomputer, that Parsl executors can allow Python scripts to execute components with as little as 5 ms of overhead, scale to more than 250 000 workers across more than 8000 nodes, and process upward of 1200 tasks per second. Other Parsl features simplify the construction and execution of composite programs by supporting elastic provisioning and scaling of infrastructure, fault-tolerant execution, and integrated wide-area data management. We show that these capabilities satisfy the needs of many-task, interactive, online, and machine learning applications in fields such as biology, cosmology, and materials science.
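The idea of building a dynamic dependency graph from ordinary function calls can be illustrated with the Python standard library alone. The sketch below is not Parsl and does not use Parsl's API: the `app` decorator, the shared `_pool`, and the example functions are hypothetical stand-ins built on `concurrent.futures`. Each decorated call returns a future immediately, and passing one call's future into another is what creates a dependency edge.

```python
import functools
from concurrent.futures import Future, ThreadPoolExecutor

# Shared worker pool; a stand-in for a configurable executor.
_pool = ThreadPoolExecutor(max_workers=4)


def app(func):
    """Hypothetical decorator: calling the function returns a Future at once."""

    @functools.wraps(func)
    def submit(*args, **kwargs):
        def run():
            # Wait for any inputs that are themselves futures. They were
            # submitted earlier, so they are already running or finished.
            resolved_args = [
                a.result() if isinstance(a, Future) else a for a in args
            ]
            resolved_kwargs = {
                k: (v.result() if isinstance(v, Future) else v)
                for k, v in kwargs.items()
            }
            return func(*resolved_args, **resolved_kwargs)

        return _pool.submit(run)

    return submit


@app
def square(x):
    return x * x


@app
def total(*values):
    return sum(values)


def sum_of_squares(n):
    # The n calls to square() are independent and may run in parallel;
    # total() depends on all of them through the futures it receives.
    squares = [square(i) for i in range(n)]
    return total(*squares)


if __name__ == "__main__":
    print(sum_of_squares(4).result())
```

The graph here is dynamic in the sense that it is discovered as the script runs: nothing is declared up front, and the shape of the computation depends on ordinary Python control flow such as the list comprehension in `sum_of_squares`.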

arXiv.org e-Print Archive

Crossref