The Role of Provenance Management in Accelerating the Rate of Astronomical Research
The availability of vast quantities of data through electronic archives has
transformed astronomical research. It has also enabled the creation of new
products, models and simulations, often from distributed input data and models,
that are themselves made electronically available. These products will only
provide maximal long-term value to astronomers when accompanied by records of
their provenance; that is, records of the data and processes used in the
creation of such products. We use the creation of image mosaics with the
Montage grid-enabled mosaic engine to emphasize the necessity of provenance
management and to understand the science requirements that higher-level
products impose on provenance management technologies. We describe experiments
with one technology, the "Provenance Aware Service Oriented Architecture"
(PASOA), that stores provenance information at each step in the computation of
a mosaic. The results inform the technical specifications of provenance
management systems, including the need for extensible systems built on common
standards. Finally, we describe examples of provenance management technology
emerging from the fields of geophysics and oceanography that have applicability
to astronomy applications.
Comment: 8 pages, 1 figure; Proceedings of Science, 201
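As an illustration of what a record of "the data and processes used in the creation of such products" might look like, the following is a minimal Python sketch that logs one provenance record per processing step and then traces the lineage of a final product. It is not the PASOA interface or the Montage API; the class names, the step names (`reproject`, `coadd`) and the file names are all hypothetical, and the sketch assumes that steps are recorded in the order they run.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ProvenanceRecord:
    """One processing step: what ran, on which inputs, producing which outputs."""
    step: str
    inputs: List[str]
    outputs: List[str]
    parameters: Dict[str, str] = field(default_factory=dict)


class ProvenanceStore:
    """Hypothetical in-memory store; records are kept in execution order."""

    def __init__(self) -> None:
        self.records: List[ProvenanceRecord] = []

    def record(self, step: str, inputs: List[str], outputs: List[str],
               **parameters: str) -> None:
        """Append a record for a step that has just completed."""
        self.records.append(
            ProvenanceRecord(step, list(inputs), list(outputs), dict(parameters))
        )

    def lineage(self, product: str) -> List[ProvenanceRecord]:
        """Return, in execution order, every step that contributed to `product`."""
        needed = {product}
        chain: List[ProvenanceRecord] = []
        # Walk backwards: a step is relevant if it produced something we need,
        # and its inputs then become needed as well.
        for rec in reversed(self.records):
            if needed & set(rec.outputs):
                chain.append(rec)
                needed |= set(rec.inputs)
        return list(reversed(chain))


store = ProvenanceStore()
store.record("reproject", ["raw_a.fits"], ["proj_a.fits"], projection="TAN")
store.record("reproject", ["raw_b.fits"], ["proj_b.fits"], projection="TAN")
store.record("coadd", ["proj_a.fits", "proj_b.fits"], ["mosaic.fits"])
for rec in store.lineage("mosaic.fits"):
    print(rec.step, rec.inputs, "->", rec.outputs)
```

Recording at the granularity of individual steps is what allows the lineage query to skip work that did not feed into the requested product.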
A Cost-Benefit Study of Doing Astrophysics On The Cloud: Production of Image Mosaics
Utility grids such as the Amazon EC2 and Amazon S3 clouds offer computational and storage resources that compute- and data-intensive applications can use on demand for a fee. The cost of running an application on such a cloud depends on the compute, storage and communication resources it provisions and consumes. Different execution plans for the same application may result in significantly different costs. We studied via simulation the cost-performance trade-offs of different execution and resource provisioning plans by creating, under the Amazon cloud fee structure, mosaics with the Montage image mosaic engine, a widely used data- and compute-intensive application. Specifically, we studied the cost of building mosaics of 2MASS data that have sizes of 1, 2 and 4 square degrees, and a 2MASS all-sky mosaic. These are examples of mosaics commonly generated by astronomers. We also studied these trade-offs in the context of the storage and communication fees of Amazon S3 when used for long-term archiving of application data. Our results show that, by provisioning the right amount of storage and compute resources, cost can be significantly reduced with no significant impact on application performance.
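The dependence of cost on compute, storage and communication resources can be sketched as a simple additive model. The Python below is an illustration only, not the simulator used in the study: the class and function names are hypothetical, and the fee values are placeholders chosen for easy arithmetic, not actual Amazon prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeeSchedule:
    """Hypothetical per-unit fees (placeholder values, not real prices)."""
    cpu_hour: float          # $ per CPU-hour
    storage_gb_month: float  # $ per GB-month stored
    transfer_in_gb: float    # $ per GB transferred in
    transfer_out_gb: float   # $ per GB transferred out


@dataclass(frozen=True)
class ExecutionPlan:
    """Resources that one execution plan provisions and consumes."""
    cpu_hours: float
    storage_gb_months: float
    gb_in: float
    gb_out: float


def plan_cost(plan: ExecutionPlan, fees: FeeSchedule) -> float:
    """Total cost as the sum of compute, storage and transfer charges."""
    compute = plan.cpu_hours * fees.cpu_hour
    storage = plan.storage_gb_months * fees.storage_gb_month
    transfer = (plan.gb_in * fees.transfer_in_gb
                + plan.gb_out * fees.transfer_out_gb)
    return compute + storage + transfer


fees = FeeSchedule(cpu_hour=0.5, storage_gb_month=0.25,
                   transfer_in_gb=0.125, transfer_out_gb=0.25)

# Two made-up plans for the same job: one re-fetches inputs on every run,
# the other keeps them in cloud storage and transfers less.
refetch = ExecutionPlan(cpu_hours=4, storage_gb_months=0, gb_in=8, gb_out=4)
keep_stored = ExecutionPlan(cpu_hours=4, storage_gb_months=2, gb_in=0, gb_out=4)

for name, plan in [("refetch", refetch), ("keep_stored", keep_stored)]:
    print(f"{name}: ${plan_cost(plan, fees):.2f}")
```

Which plan is cheaper depends entirely on the fee schedule and on how often the inputs are reused, which is why comparing plans under a concrete fee structure is informative.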
A Virtual Data Grid for LIGO
GriPhyN (Grid Physics Network) is a large US collaboration to
build grid services for large physics experiments, one of which is LIGO, a
gravitational-wave observatory. This paper explains the physics and computing
challenges of LIGO, and the tools that GriPhyN will build to address
them. A key component needed to implement the data pipeline is a virtual
data service: a system that dynamically creates data products requested during
the various stages. The requested data may already have been processed in a
certain way; it may be in a file on a storage system, it may be cached, or it
may need to be created through computation. The full elaboration of this system
will allow complex data pipelines to be set up as virtual data objects, with
existing data being transformed in diverse ways.
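The lookup order described above (cached, on a storage system, or created through computation) can be sketched in a few lines of Python. This is an illustrative sketch, not GriPhyN software: the class, its methods and the product names are hypothetical, and a plain dictionary stands in for the storage system.

```python
from typing import Callable, Dict, Optional


class VirtualDataService:
    """Return a requested product from cache, from storage, or by computing it."""

    def __init__(self, storage: Optional[Dict[str, bytes]] = None) -> None:
        # A dict stands in for a real storage system in this sketch.
        self.storage: Dict[str, bytes] = dict(storage or {})
        self.cache: Dict[str, bytes] = {}
        self.recipes: Dict[str, Callable[[], bytes]] = {}

    def register(self, name: str, recipe: Callable[[], bytes]) -> None:
        """Declare how a product can be derived if it does not exist yet."""
        self.recipes[name] = recipe

    def get(self, name: str) -> bytes:
        # 1. Already cached: return immediately.
        if name in self.cache:
            return self.cache[name]
        # 2. Already materialized on storage: read it.
        if name in self.storage:
            data = self.storage[name]
        # 3. Otherwise create it through computation and persist the result.
        elif name in self.recipes:
            data = self.recipes[name]()
            self.storage[name] = data
        else:
            raise KeyError(f"no data and no recipe for {name!r}")
        self.cache[name] = data
        return data


service = VirtualDataService(storage={"raw.dat": b"raw samples"})
service.register("filtered.dat", lambda: service.get("raw.dat").upper())
print(service.get("filtered.dat"))  # computed on first request
print(service.get("filtered.dat"))  # served from cache afterwards
```

Because a recipe can itself call `get`, a derived product can depend on other virtual products, which is one simple way to express a pipeline as a chain of virtual data objects.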
- …