AstroGrid-D: Grid Technology for Astronomical Science
We present status and results of AstroGrid-D, a joint effort of
astrophysicists and computer scientists to employ grid technology for
scientific applications. AstroGrid-D provides access to a network of
distributed machines with a set of commands as well as software interfaces. It
allows simple use of compute and storage facilities and lets users schedule and
monitor compute tasks and data management. It is based on the Globus Toolkit middleware
(GT4). Chapter 1 describes the context which led to the demand for advanced
software solutions in Astrophysics, and we state the goals of the project. We
then present characteristic astrophysical applications that have been
implemented on AstroGrid-D in chapter 2. We describe simulations of different
complexity, compute-intensive calculations running on multiple sites, and
advanced applications for specific scientific purposes, such as a connection to
robotic telescopes. These examples show how grid execution improves, for
example, the scientific workflow. Chapter 3 explains the software tools and
services that we adapted or newly developed. Section 3.1 is focused on the
administrative aspects of the infrastructure, to manage users and monitor
activity. Section 3.2 characterises the central components of our architecture:
the AstroGrid-D information service to collect and store metadata, a file
management system, the data management system, and a job manager for automatic
submission of compute tasks. We summarise the successfully established
infrastructure in chapter 4, concluding with our future plans to establish
AstroGrid-D as a platform of modern e-Astronomy.
Comment: 14 pages, 12 figures. Subjects: data analysis, image processing,
robotic telescopes, simulations, grid. Accepted for publication in New
Astronom
A Virtual Observatory Vision based on Publishing and Virtual Data
We propose a vision of the Virtual Observatory in which the "killer app" is
generalizing and extending the idea of "publication" beyond the narrow meaning of peer-reviewed
journals. Here, publication ranges from private temporary storage, to group access, to public
access, through to data that supports peer-reviewed Journal papers in perpetuity. The publication
model is further extended by the possibility of Virtual Data -- where only the method of
computation is stored, not necessarily the data itself. Furthermore, virtual data products may
depend on other virtual data products, creating an implicit network of on-demand computation.
This computation may take huge resources, or it may be all within a laptop.
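The Virtual Data idea described above (store the method of computation, compute on demand, and let products depend on other products) can be illustrated with a minimal sketch. The class name `VirtualProduct`, the method `materialize`, and the toy product names are hypothetical, chosen only for illustration; they are not taken from the paper or from any Virtual Observatory software.

```python
class VirtualProduct:
    """A data product defined by its recipe rather than by stored data (illustrative only)."""

    def __init__(self, name, compute, deps=()):
        self.name = name
        self.compute = compute      # the stored "method of computation"
        self.deps = tuple(deps)     # other virtual products this one depends on
        self._value = None
        self._done = False

    def materialize(self):
        """Compute the product on demand, materializing dependencies first."""
        if not self._done:
            inputs = [dep.materialize() for dep in self.deps]
            self._value = self.compute(*inputs)
            self._done = True       # cache so repeated requests do no extra work
        return self._value


calls = []  # records which recipes actually ran, and in what order


def traced(label, fn):
    """Wrap a recipe so that each execution is logged in `calls`."""
    def wrapper(*args):
        calls.append(label)
        return fn(*args)
    return wrapper


# A toy chain: raw counts -> calibrated values -> total flux.
raw = VirtualProduct("raw_counts", traced("raw_counts", lambda: [3, 5, 8]))
calibrated = VirtualProduct(
    "calibrated", traced("calibrated", lambda xs: [2 * x for x in xs]), deps=[raw]
)
total = VirtualProduct(
    "total_flux", traced("total_flux", lambda xs: sum(xs)), deps=[calibrated]
)
```

Nothing is computed when the products are defined; calling `total.materialize()` walks the dependency chain and runs each recipe once. A real system would also have to decide where computation runs and when cached results expire, which this sketch does not address.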
Grids and the Virtual Observatory
We consider several projects from astronomy that benefit from the Grid paradigm and
associated technology, many of which involve either massive datasets or the federation
of multiple datasets. We cover image computation (mosaicking, multi-wavelength
images, and synoptic surveys); database computation (representation through XML,
data mining, and visualization); and semantic interoperability (publishing, ontologies,
directories, and service descriptions).
Mining Knowledge in Astrophysical Massive Data Sets
Modern scientific data mainly consist of huge datasets gathered by a very
large number of techniques and stored in very diversified and often
incompatible data repositories. More generally, in the e-science environment,
it is considered a critical and urgent requirement to integrate services
across distributed, heterogeneous, dynamic "virtual organizations" formed by
different resources within a single enterprise. In the last decade, Astronomy
has become an immensely data rich field due to the evolution of detectors
(plates to digital to mosaics), telescopes and space instruments. The Virtual
Observatory approach consists of the federation, under common standards, of all
astronomical archives available worldwide, as well as data analysis, data
mining and data exploration applications. The main driver behind this effort
is that, once the infrastructure is complete, it will allow a new type
of multi-wavelength, multi-epoch science that can barely be imagined.
Data Mining, or Knowledge Discovery in Databases, while being the main
methodology to extract the scientific information contained in such MDS
(Massive Data Sets), poses crucial problems since it has to address the
challenges of transparent access to different computing environments,
scalability of algorithms, reusability of resources, etc. In the present paper
we summarize the present status of the MDS in the Virtual Observatory and what
is currently being done and planned to bring advanced Data Mining methodologies to bear, in
the case of the DAME (DAta Mining & Exploration) project.
Comment: Pages 845-849, 1st International Conference on Frontiers in
Diagnostics Technologie
Interactive 3D visualization for theoretical Virtual Observatories
Virtual Observatories (VOs) are online hubs of scientific knowledge. They
encompass a collection of platforms dedicated to the storage and dissemination
of astronomical data, from simple data archives to e-research platforms
offering advanced tools for data exploration and analysis. Whilst the more
mature platforms within VOs primarily serve the observational community, there
are also services fulfilling a similar role for theoretical data. Scientific
visualization can be an effective tool for analysis and exploration of datasets
made accessible through web platforms for theoretical data, which often contain
spatial dimensions and properties inherently suitable for visualization via
e.g. mock imaging in 2d or volume rendering in 3d. We analyze the current state
of 3d visualization for big theoretical astronomical datasets through
scientific web portals and virtual observatory services. We discuss some of the
challenges for interactive 3d visualization and how it can augment the workflow
of users in a virtual observatory context. Finally we showcase a lightweight
client-server visualization tool for particle-based datasets allowing
quantitative visualization via data filtering, highlighting two example use
cases within the Theoretical Astrophysical Observatory.
Comment: 10 Pages, 13 Figures, Accepted for Publication in Monthly Notices of
the Royal Astronomical Societ
The Virtual Astronomical Observatory: Re-engineering access to astronomical data
The US Virtual Astronomical Observatory was a software infrastructure and development project designed both to begin the establishment of an operational Virtual Observatory (VO) and to provide the US coordination with the international VO effort. The concept of the VO is to provide the means by which an astronomer is able to discover, access, and process data seamlessly, regardless of its physical location. This paper describes the origins of the VAO, including the predecessor efforts within the US National Virtual Observatory, and summarizes its main accomplishments. These accomplishments include the development of both scripting toolkits that allow scientists to incorporate VO data directly into their reduction and analysis environments and high-level science applications for data discovery, integration, analysis, and catalog cross-comparison. Working with the international community, and based on the experience from the software development, the VAO was a major contributor to international standards within the International Virtual Observatory Alliance. The VAO also demonstrated how an operational virtual observatory could be deployed, providing a robust operational environment in which VO services worldwide were routinely checked for aliveness and compliance with international standards. Finally, the VAO engaged in community outreach, developing a comprehensive web site with on-line tutorials, announcements, links to both US and internationally developed tools and services, and exhibits and hands-on training at annual meetings of the American Astronomical Society and through summer schools and community days. All digital products of the VAO Project, including software, documentation, and tutorials, are stored in a repository for community access. The enduring legacy of the VAO is an increasing expectation that new telescopes and facilities incorporate VO capabilities during the design of their data management systems.
Cyber-infrastructure to Support Science and Data Management for the Dark Energy Survey
The Dark Energy Survey (DES; operations 2009-2015) will address the nature of
dark energy using four independent and complementary techniques: (1) a galaxy
cluster survey over 4000 deg² in collaboration with the South Pole Telescope
Sunyaev-Zel'dovich effect mapping experiment, (2) a cosmic shear measurement
over 5000 deg², (3) a galaxy angular clustering measurement within redshift
shells to redshift=1.35, and (4) distance measurements to 1900 supernovae Ia.
The DES will produce 200 TB of raw data in four bands. These data will be
processed into science-ready images and catalogs and co-added into deeper,
higher quality images and catalogs. In total, the DES dataset will exceed 1 PB,
including a 100 TB catalog database that will serve as a key science analysis
tool for the astronomy/cosmology community. The data rate, volume, and duration
of the survey require a new type of data management (DM) system that (1) offers
a high degree of automation and robustness and (2) leverages the existing high
performance computing infrastructure to meet the project's DM targets. The DES
DM system consists of (1) a grid-enabled, flexible and scalable middleware
developed at NCSA for the broader scientific community, (2) astronomy modules
that build upon community software, and (3) a DES archive to support automated
processing and to serve DES catalogs and images to the collaboration and the
public. In the recent DES Data Challenge 1 we deployed and tested the first
version of the DES DM system, successfully reducing 700 GB of raw simulated
images into 5 TB of reduced data products and cataloguing 50 million objects
with calibrated astrometry and photometry.
Comment: 12 pages, 3 color figures, 1 table. Published in SPIE vol. 627
A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing
Data Grids have been adopted as the platform for scientific communities that
need to share, access, transport, process and manage large data collections
distributed worldwide. They combine high-end computing technologies with
high-performance networking and wide-area storage management techniques. In
this paper, we discuss the key concepts behind Data Grids and compare them with
other data sharing and distribution paradigms such as content delivery
networks, peer-to-peer networks and distributed databases. We then provide
comprehensive taxonomies that cover various aspects of architecture, data
transportation, data replication and resource allocation and scheduling.
Finally, we map the proposed taxonomy to various Data Grid systems not only to
validate the taxonomy but also to identify areas for future exploration.
Through this taxonomy, we aim to categorise existing systems to better
understand their goals and their methodology. This would help evaluate their
applicability for solving similar problems. This taxonomy also provides a "gap
analysis" of this area through which researchers can potentially identify new
issues for investigation. Finally, we hope that the proposed taxonomy and
mapping also help to provide an easy way for new practitioners to understand
this complex area of research.
Comment: 46 pages, 16 figures, Technical Repor