High-Level Design of a Data Carousel for the Basic Fusion Files
Sometimes data is large enough that the resources needed merely to hold it can severely strain budgets. When resource constraints are severe, and the alternative is not having access to the data at all, one option is to 1) use a cheaper storage solution, 2) mitigate any problems that arise from the use of this type of storage, and 3) deal with the restrictions that are present in the solution. We present a white paper based on limited prototyping, reflecting our current thinking on the high-level design and operational model using the Data Carousel Access pattern, applied in the context of Amazon Web Services, for the 2.4 PB Basic Fusion Dataset.
Local vs. AWS provisioning: Experience fusing a month's data on AWS and local provisioning
The Terra ACCESS project provides enhanced access via fused data from all instruments on the NASA Terra Earth science satellite. The fused data set is 2.4 PB in size and covers the period 2000 - 2015. This document is a technical report from early 2019, comparing the benefits and costs of performing the data fusion on Amazon Web Services and the Illinois campus cluster. NASA Award NNX16AM07A
Survey form and methods for second CASC survey of academic research computing and data center usage
Full analysis paper is in the ACM PEARC'21 Proceedings: ACM ISBN 978-1-4503-8292-2/21/07.
https://doi.org/10.1145/3437359.3465589
Availability of cloud-based resource delivery modes is transforming many areas of computing. Many academic institutions that support research computing facilities are considering and changing their mix of on-premise and remote facilities (including, in particular, use of commercial cloud facilities). A working group of the Coalition for Academic Scientific Computation (an educational nonprofit 501(c)(3) organization) has now conducted an annual survey of higher education institutions for two years running and intends to continue. The survey asks academic institutions about their investments in research and data-oriented computing facilities, the extent of those facilities, and institutional activities. This technical report includes the full text of the survey instrument itself and describes the methods and survey population.
http://deepblue.lib.umich.edu/bitstream/2027.42/167731/1/CASC 2021 Survey Methods and Form.pdf
LSST: from Science Drivers to Reference Design and Anticipated Data Products
(Abridged) We describe here the most ambitious survey currently planned in
the optical, the Large Synoptic Survey Telescope (LSST). A vast array of
science will be enabled by a single wide-deep-fast sky survey, and LSST will
have unique survey capability in the faint time domain. The LSST design is
driven by four main science themes: probing dark energy and dark matter, taking
an inventory of the Solar System, exploring the transient optical sky, and
mapping the Milky Way. LSST will be a wide-field ground-based system sited at
Cerro Pachón in northern Chile. The telescope will have an 8.4 m (6.5 m
effective) primary mirror, a 9.6 deg² field of view, and a 3.2 Gigapixel
camera. The standard observing sequence will consist of pairs of 15-second
exposures in a given field, with two such visits in each pointing in a given
night. With these repeats, the LSST system is capable of imaging about 10,000
square degrees of sky in a single filter in three nights. The typical 5σ
point-source depth in a single visit in r will be ∼24.5 (AB). The
project is in the construction phase and will begin regular survey operations
by 2022. The survey area will be contained within 30,000 deg² with
δ < +34.5°, and will be imaged multiple times in six bands, ugrizy,
covering the wavelength range 320–1050 nm. About 90% of the observing time
will be devoted to a deep-wide-fast survey mode which will uniformly observe an
18,000 deg² region about 800 times (summed over all six bands) during the
anticipated 10 years of operations, and yield a coadded map to r ∼ 27.5. The
remaining 10% of the observing time will be allocated to projects such as a
Very Deep and Fast time domain survey. The goal is to make LSST data products,
including a relational database of about 32 trillion observations of 40 billion
objects, available to the public and scientists around the world. Comment: 57 pages, 32 color figures, version with high-resolution figures
available from https://www.lsst.org/overvie
Enabling real-time multi-messenger astrophysics discoveries with deep learning
Multi-messenger astrophysics is a fast-growing, interdisciplinary field that combines data, which vary in volume and speed of data processing, from many different instruments that probe the Universe using different cosmic messengers: electromagnetic waves, cosmic rays, gravitational waves and neutrinos. In this Expert Recommendation, we review the key challenges of real-time observations of gravitational wave sources and their electromagnetic and astroparticle counterparts, and make a number of recommendations to maximize their potential for scientific discovery. These recommendations refer to the design of scalable and computationally efficient machine learning algorithms; the cyber-infrastructure to numerically simulate astrophysical sources, and to process and interpret multi-messenger astrophysics data; the management of gravitational wave detections to trigger real-time alerts for electromagnetic and astroparticle follow-ups; a vision to harness future developments of machine learning and cyber-infrastructure resources to cope with the big-data requirements; and the need to build a community of experts to realize the goals of multi-messenger astrophysics.
The Sloan Digital Sky Survey: Technical Summary
The Sloan Digital Sky Survey (SDSS) will provide the data to support detailed
investigations of the distribution of luminous and non-luminous matter in the
Universe: a photometrically and astrometrically calibrated digital imaging
survey of pi steradians above about Galactic latitude 30 degrees in five broad
optical bands to a depth of g' about 23 magnitudes, and a spectroscopic survey
of the approximately one million brightest galaxies and 10^5 brightest quasars
found in the photometric object catalog produced by the imaging survey. This
paper summarizes the observational parameters and data products of the SDSS,
and serves as an introduction to extensive technical on-line documentation. Comment: 9 pages, 7 figures, AAS Latex. To appear in AJ, Sept 200
The Fermilab data storage infrastructure
Fermilab, in collaboration with the DESY laboratory in Hamburg, Germany, has created a petabyte-scale data storage infrastructure to meet the requirements of experiments to store and access large data sets. The Fermilab data storage infrastructure consists of the following major storage and data transfer components: the Enstore mass storage system, the DCache distributed data cache, and ftp and Grid ftp for primarily external data transfers. This infrastructure provides a data throughput sufficient for transferring data from experiments' data acquisition systems. It also allows access to data in the Grid framework.