Data Access for LIGO on the OSG
During 2015 and 2016, the Laser Interferometer Gravitational-Wave Observatory
(LIGO) conducted a three-month observing campaign. These observations delivered
the first direct detection of gravitational waves from binary black hole
mergers. To search for these signals, the LIGO Scientific Collaboration uses
the PyCBC search pipeline. To deliver science results in a timely manner, LIGO
collaborated with the Open Science Grid (OSG) to distribute the required
computation across a series of dedicated, opportunistic, and allocated
resources. To deliver the petabytes necessary for such a large-scale
computation, our team deployed a distributed data access infrastructure based
on the XRootD server suite and the CernVM File System (CVMFS). This data access
strategy grew from simply accessing remote storage to a POSIX-based interface
underpinned by distributed, secure caches across the OSG.
Comment: 6 pages, 3 figures, submitted to PEARC17
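To make the POSIX-based access pattern concrete, here is a minimal sketch of
how a job might read a file through a CVMFS mount. The repository and file
path are hypothetical, assuming CVMFS is mounted at /cvmfs on the worker node.

    import os

    # Hypothetical CVMFS path; the real repository and file names differ.
    frame_path = "/cvmfs/ligo.osgstorage.org/frames/O1/H1/H-H1_EXAMPLE.gwf"

    # CVMFS exposes remote, cached data through an ordinary POSIX mount,
    # so standard file operations work; the cache layer is invisible here.
    if os.path.exists(frame_path):
        with open(frame_path, "rb") as f:
            header = f.read(1024)  # read the first kilobyte of the file
        print(f"read {len(header)} bytes via the CVMFS POSIX interface")
    else:
        print("CVMFS repository not mounted on this node")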
Creating a content delivery network for general science on the internet backbone using XCaches
A general problem faced by computing on the grid for opportunistic users is
that delivering cycles is simpler than delivering data to those cycles. In this
project we show how we integrated XRootD caches placed on the internet backbone
to implement a content delivery network for general science workflows. We show
that for workflows in several science domains, including high energy physics
and gravitational waves, the combination of data reuse across workflows and
caching increases CPU efficiency while decreasing network bandwidth use.
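A sketch of the client-side pattern such a cache layer enables: jobs address a
nearby XCache and fall back to the data origin if the cache is unavailable.
The endpoints and logical file name below are hypothetical, and xrdcp (part of
the XRootD client suite) is assumed to be installed.

    import subprocess

    # Hypothetical endpoints: a backbone cache and the authoritative origin.
    CACHE = "root://xcache.example.net"
    ORIGIN = "root://origin.example.edu"
    PATH = "//store/science/dataset/file.root"  # illustrative logical name

    def fetch(lfn: str, dest: str) -> None:
        """Try the cache first; fall back to the origin on failure."""
        for endpoint in (CACHE, ORIGIN):
            # xrdcp ships with the XRootD client; -f overwrites the target.
            result = subprocess.run(["xrdcp", "-f", endpoint + lfn, dest])
            if result.returncode == 0:
                print(f"fetched {lfn} from {endpoint}")
                return
        raise RuntimeError(f"unable to fetch {lfn} from cache or origin")

    fetch(PATH, "/tmp/file.root")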
IceCube experience using XRootD-based Origins with GPU workflows in PNRP
The IceCube Neutrino Observatory is a cubic kilometer neutrino telescope
located at the geographic South Pole. Understanding detector systematic effects
is a continuous process. This requires the Monte Carlo simulation to be updated
periodically to quantify potential changes and improvements in science results
with more detailed modeling of the systematic effects. IceCube's largest
systematic effect comes from the optical properties of the ice the detector is
embedded in. Over the last few years there have been considerable improvements
in the understanding of the ice, which require a significant processing
campaign to update the simulation. IceCube normally stores the results in a
central storage system at the University of Wisconsin-Madison, but it ran out
of disk space in 2022. The Prototype National Research Platform (PNRP) project
thus offered to provide both GPU compute and storage capacity to IceCube in
support of this activity. The storage access was provided via XRootD-based OSDF
Origins, a first for IceCube computing. We report on the overall experience
using PNRP resources, with both successes and pain points.
Comment: 7 pages, 3 figures, 1 table, to be published in Proceedings of CHEP 2023
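As a sketch of how a workflow might stage data against an OSDF origin, the
snippet below shells out to stashcp, the OSDF command-line client; the
namespace path is a placeholder for illustration, not IceCube's actual layout.

    import subprocess

    # Hypothetical OSDF namespace path served by an XRootD-based origin.
    remote = "/icecube/sim/photon-tables/example.fits"
    local = "/tmp/example.fits"

    # stashcp resolves the namespace path to a nearby cache or the origin
    # and downloads the object to the local path.
    subprocess.run(["stashcp", remote, local], check=True)
    print(f"staged {remote} -> {local}")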
SciTokens: Capability-Based Secure Access to Remote Scientific Data
The management of security credentials (e.g., passwords, secret keys) for
computational science workflows is a burden for scientists and information
security officers. Problems with credentials (e.g., expiration, privilege
mismatch) cause workflows to fail to fetch needed input data or store valuable
scientific results, distracting scientists from their research by requiring
them to diagnose the problems, re-run their computations, and wait longer for
their results. In this paper, we introduce SciTokens, open source software to
help scientists manage their security credentials more reliably and securely.
We describe the SciTokens system architecture, design, and implementation
addressing use cases from the Laser Interferometer Gravitational-Wave
Observatory (LIGO) Scientific Collaboration and the Large Synoptic Survey
Telescope (LSST) projects. We also present our integration with widely-used
software that supports distributed scientific computing, including HTCondor,
CVMFS, and XrootD. SciTokens uses IETF-standard OAuth tokens for
capability-based secure access to remote scientific data. The access tokens
convey the specific authorizations needed by the workflows, rather than
general-purpose authentication impersonation credentials, to address the risks
of scientific workflows running on distributed infrastructure including NSF
resources (e.g., LIGO Data Grid, Open Science Grid, XSEDE) and public clouds
(e.g., Amazon Web Services, Google Cloud, Microsoft Azure). By improving the
interoperability and security of scientific workflows, SciTokens 1) enables use
of distributed computing for scientific domains that require greater data
protection and 2) enables use of more widely distributed computing resources by
reducing the risk of credential abuse on remote systems.
Comment: 8 pages, 6 figures, PEARC '18: Practice and Experience in Advanced
Research Computing, July 22-26, 2018, Pittsburgh, PA, USA
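To make the capability-based model concrete, the sketch below builds a JWT
whose scope claim grants only the read and write paths a workflow needs, in
the style of a SciToken. It uses the PyJWT library and a symmetric key for
brevity; production SciTokens are signed with the issuer's asymmetric key
(RS256/ES256), and all names here are illustrative.

    import time
    import jwt  # PyJWT

    # Symmetric demo key; a real SciTokens issuer signs with its private key.
    SECRET = "demo-signing-key"

    claims = {
        "iss": "https://issuer.example.org",   # hypothetical token issuer
        "sub": "ligo-pycbc-workflow",          # illustrative subject
        "exp": int(time.time()) + 3600,        # short-lived: one hour
        # Capability scopes: only the authorizations the workflow needs,
        # instead of a general-purpose impersonation credential.
        "scope": "read:/frames/O1 write:/results/pycbc",
    }

    token = jwt.encode(claims, SECRET, algorithm="HS256")

    # A storage endpoint would verify the signature and enforce the scopes.
    decoded = jwt.decode(token, SECRET, algorithms=["HS256"])
    assert "read:/frames/O1" in decoded["scope"].split()
    print("token authorizes:", decoded["scope"])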
BOSS-LDG: A Novel Computational Framework that Brings Together Blue Waters, Open Science Grid, Shifter and the LIGO Data Grid to Accelerate Gravitational Wave Discovery
We present a novel computational framework that connects Blue Waters, the
NSF-supported, leadership-class supercomputer operated by NCSA, to the Laser
Interferometer Gravitational-Wave Observatory (LIGO) Data Grid via Open Science
Grid technology. To enable this computational infrastructure, we configured,
for the first time, a LIGO Data Grid Tier-1 Center that can submit
heterogeneous LIGO workflows using Open Science Grid facilities. In order to
enable a seamless connection between the LIGO Data Grid and Blue Waters via
Open Science Grid, we utilize Shifter to containerize LIGO's workflow software.
This work represents the first time Open Science Grid, Shifter, and Blue Waters
are unified to tackle a scientific problem and, in particular, it is the first
time a framework of this nature is used in the context of large scale
gravitational wave data analysis. This new framework has been used in the last
several weeks of LIGO's second discovery campaign to run the most
computationally demanding gravitational wave search workflows on Blue Waters,
and accelerate discovery in the emergent field of gravitational wave
astrophysics. We discuss the implications of this novel framework for a wider
ecosystem of High Performance Computing users.
Comment: 10 pages, 10 figures. Accepted as a Full Research Paper to the 13th
IEEE International Conference on eScience
Container solutions for HPC Systems: A Case Study of Using Shifter on Blue Waters
Software container solutions have revolutionized application development
approaches by enabling lightweight platform abstractions within the so-called
"containers." Several solutions are being actively developed in attempts to
bring the benefits of containers to high-performance computing systems with
their stringent security demands on the one hand and fundamental resource
sharing requirements on the other.
In this paper, we discuss the benefits and shortcomings of such solutions when
deployed on real HPC systems and applied to production scientific
applications. We highlight use cases that are either enabled by or
significantly benefit from such solutions. We discuss the efforts by HPC
system administrators and support staff to support users of these types of
workloads on HPC systems not initially designed with them in mind, focusing on
NCSA's Blue Waters system.
Comment: 8 pages, 7 figures, in PEARC '18: Proceedings of Practice and
Experience in Advanced Research Computing, July 22-26, 2018, Pittsburgh, PA,
USA
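As a sketch of the usage pattern such container solutions enable, a job script
might pull an image and run an application inside it with Shifter; the image
name and command below are hypothetical.

    import subprocess

    # Illustrative image name; shifterimg pull fetches the image and
    # converts it into Shifter's flattened, shared-filesystem-friendly format.
    image = "docker:ubuntu:16.04"
    subprocess.run(["shifterimg", "pull", image], check=True)

    # shifter launches the command inside the container while preserving
    # the user's identity and the host's parallel filesystem mounts,
    # which is what makes it suitable for multi-tenant HPC systems.
    subprocess.run(["shifter", f"--image={image}", "python", "-c",
                    "print('hello from inside the container')"],
                   check=True)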
CRIU - Checkpoint Restore in Userspace for computational simulations and scientific applications
Creating new materials, discovering new drugs, and simulating systems are
essential processes for research and innovation and require substantial
computational power. While many applications can be split into many smaller
independent tasks, some cannot and may take hours or weeks to run to
completion. To better manage those longer-running jobs, it would be desirable
to stop them at any arbitrary point in time and later continue their
computation on another compute resource; this is usually referred to as
checkpointing. While some applications can manage checkpointing
programmatically, it would be preferable if the batch scheduling system could
do that independently. This paper evaluates the feasibility of using CRIU
(Checkpoint Restore in Userspace), an open-source tool for GNU/Linux
environments, with emphasis on the OSG's OSPool HTCondor setup. CRIU allows
checkpointing the process state into a disk image and can deal with both open
files and established network connections seamlessly. Furthermore, it can
checkpoint both traditional Linux processes and containerized workloads. The
functionality seems adequate for many scenarios supported in the OSPool.
However, some limitations prevent it from being usable in all circumstances.
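A minimal sketch of the dump/restore cycle CRIU provides, driven from Python.
The PID and image directory are placeholders; the --shell-job flag is needed
only for processes attached to a terminal, and criu itself typically requires
root privileges or appropriate capabilities.

    import os
    import subprocess
    import sys

    pid = int(sys.argv[1])        # PID of the running job to checkpoint
    images = "/tmp/criu-images"   # placeholder directory for image files
    os.makedirs(images, exist_ok=True)

    # Freeze the process tree and write its full state (memory, open files,
    # sockets) into the image directory, then stop the original process.
    subprocess.run(["criu", "dump", "-t", str(pid), "-D", images,
                    "--shell-job"], check=True)

    # Later, possibly on another node that can see the same image directory,
    # recreate the process exactly where it left off.
    subprocess.run(["criu", "restore", "-D", images, "--shell-job"],
                   check=True)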