The design strategy of scientific data quality control software for Euclid mission
The most valuable asset of a space mission like Euclid is its data. Owing to
the huge data volume, automatic quality control becomes a crucial aspect over the
entire lifetime of the experiment. Here we focus on the design strategy for the Science
Ground Segment (SGS) Data Quality Common Tools (DQCT), whose main role is
to provide software solutions to gather, evaluate, and record quality information about
the raw and derived data products from a primarily scientific perspective. The stakeholders
for this system include Consortium scientists, users of the science data, and
the ground segment data management system itself. The SGS DQCT will provide a
quantitative basis for evaluating the application of reduction and calibration reference
data (flat-fields, linearity corrections, reference catalogs, etc.), as well as diagnostic tools
for quality parameters, flags, trend-analysis diagrams, and any other metadata parameter
produced by the pipeline, collected in incremental quality reports specific to each
data level and stored in the Euclid Archive during pipeline processing. In a large programme
like Euclid, it is prohibitively expensive to process large amounts of data at
the pixel level just for the purpose of quality evaluation. Thus, all measures of quality
at the pixel level are implemented in the individual pipeline stages and passed along
as metadata in production. In this sense, most of the tasks related to science data
quality are delegated to the pipeline stages, even though the responsibility for science
data quality is managed at a higher level. The DQCT subsystem of the SGS is currently
under development, but its path to full realization will likely differ from that of
other subsystems. This is primarily because, owing to the high level of parallelism
and the wide pipeline processing redundancy (for instance, the mechanism of a double
Science Data Center for each processing function), the data quality tools not only
have to be widely spread over all pipeline segments and data levels, but must also
minimize the diversity of solutions implemented for similar functions, ensuring
maximum coherency and standardization for quality evaluation and reporting in the SGS.
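The scheme described above, where each pipeline stage computes its own pixel-level quality measures and passes them on as metadata to be collected into incremental, per-data-level reports, can be sketched as follows. This is a minimal illustration of the idea only; the class and function names (StageReport, merge_reports) and the parameter names are hypothetical, not the actual Euclid DQCT API.

```python
# Hypothetical sketch: each pipeline stage emits its quality measures as
# metadata; a lightweight aggregator merges them into an incremental quality
# report keyed by data level, rather than re-reading pixels centrally.
from dataclasses import dataclass, field

@dataclass
class StageReport:
    """Quality metadata emitted by one pipeline stage."""
    stage: str
    data_level: str
    parameters: dict = field(default_factory=dict)  # e.g. {"saturated_pix_frac": 0.002}
    flags: list = field(default_factory=list)       # e.g. ["REFERENCE_CATALOG_SPARSE"]

def merge_reports(reports):
    """Aggregate per-stage reports into one incremental report per data level."""
    merged = {}
    for r in reports:
        level = merged.setdefault(r.data_level, {"stages": [], "flags": []})
        level["stages"].append({r.stage: r.parameters})
        level["flags"].extend(r.flags)
    return merged

# Two stages of the same (illustrative) "LE1" data level contribute metadata.
detrend = StageReport("detrending", "LE1", {"saturated_pix_frac": 0.002})
astrom = StageReport("astrometry", "LE1", {"rms_residual_arcsec": 0.03},
                     ["REFERENCE_CATALOG_SPARSE"])
report = merge_reports([detrend, astrom])
```

The point of the sketch is the division of responsibility: pixel-level work happens once, inside each stage, and only compact metadata travels onward for quality evaluation and reporting.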
Target and (Astro-)WISE technologies - Data federations and its applications
After its first implementation in 2003 the Astro-WISE technology has been
rolled out in several European countries and is used for the production of the
KiDS survey data. In the multi-disciplinary Target initiative this technology,
nicknamed WISE technology, has been further applied to a large number of
projects. Here, we highlight the data handling of other astronomical
applications, such as VLT-MUSE and LOFAR, together with some non-astronomical
applications such as the medical projects Lifelines and GLIMPS, the MONK
handwritten text recognition system, and business applications by, amongst
others, the Target Holding. We describe some of the most important lessons
learned and describe the application of the data-centric WISE type of approach
to the Science Ground Segment of the Euclid satellite.
Comment: 9 pages, 5 figures, Proceedings IAU Symposium No 325 Astroinformatics 201
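The data-centric ("target") processing idea behind the WISE approach can be illustrated with a small sketch: a result is requested by name together with its dependencies, and it is only recomputed when no stored result matches that dependency state. This is a toy illustration of the concept, assuming nothing about the real Astro-WISE API; all names here are invented.

```python
# Illustrative pull-based ("target") processing: results are looked up by
# (name, dependency state); a cached result is reused, a changed dependency
# (e.g. a new flat-field version) triggers recomputation.
_store = {}  # stands in for a persistent, federated object store

def get_target(name, deps, make):
    """Return a stored result if its dependencies are unchanged, else (re)compute."""
    key = (name, tuple(sorted(deps.items())))
    if key not in _store:
        _store[key] = make(deps)
    return _store[key]

calls = []
def make_coadd(deps):
    calls.append(deps)  # record actual recomputations
    return f"coadd({deps['exposures']}, flat={deps['flatfield']})"

a = get_target("coadd_field1", {"exposures": "exp1+exp2", "flatfield": "v1"}, make_coadd)
b = get_target("coadd_field1", {"exposures": "exp1+exp2", "flatfield": "v1"}, make_coadd)  # cached
c = get_target("coadd_field1", {"exposures": "exp1+exp2", "flatfield": "v2"}, make_coadd)  # flat changed
```

The design choice this illustrates is that provenance is the cache key: identical requests share one stored result across projects, which is what makes the same machinery reusable from astronomy to handwriting recognition.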
CosmoDM and its application to Pan-STARRS data
The Cosmology Data Management system (CosmoDM) is an automated and flexible
data management system for the processing and calibration of data from optical
photometric surveys. It is designed to run on supercomputers and to minimize
disk I/O to enable scaling to very high throughput during periods of
reprocessing. It serves as an early prototype for one element of the
ground-based processing required by the Euclid mission and will also be
employed in the preparation of ground based data needed in the eROSITA X-ray
all sky survey mission. CosmoDM consists of two main pipelines. The first is
the single-epoch or detrending pipeline, which is used to carry out the
photometric and astrometric calibration of raw exposures. The second is the
co-addition pipeline, which combines the data from individual exposures into
deeper coadd images and science-ready catalogs. A novel feature of CosmoDM is
that it uses a modified stack of Astromatic software which can read and write
tile-compressed images. Since 2011, CosmoDM has been used to process data from
the DECam, the CFHT MegaCam, and the Pan-STARRS cameras. In this paper we shall
describe how processed Pan-STARRS data from CosmoDM have been used to optically
confirm and measure photometric redshifts of Planck-based Sunyaev-Zeldovich
effect selected cluster candidates.
Comment: 11 pages, 4 figures. Proceedings of Precision Astronomy with Fully
Depleted CCDs Workshop (2014). Accepted for publication in JINS
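The co-addition step described in the abstract, combining aligned single-epoch exposures into a deeper image, commonly uses inverse-variance weighting so that noisier exposures contribute less. The sketch below is a generic illustration of that technique on already-aligned pixel lists, not CosmoDM code; real pipelines also handle resampling, masking, and astrometric alignment.

```python
# Inverse-variance weighted co-addition of aligned exposures, pixel by pixel.
def coadd(exposures, variances):
    """Weighted mean of exposures.

    exposures: list of equal-length pixel lists (already aligned).
    variances: one noise variance per exposure; weight = 1/variance.
    """
    weights = [1.0 / v for v in variances]
    wsum = sum(weights)
    npix = len(exposures[0])
    return [
        sum(w * exp[i] for w, exp in zip(weights, exposures)) / wsum
        for i in range(npix)
    ]

# Two exposures of the same 3-pixel patch; the second is twice as noisy,
# so it carries half the weight in the combined image.
stack = coadd([[10.0, 20.0, 30.0], [12.0, 18.0, 33.0]], [1.0, 2.0])
```

With weights 1.0 and 0.5, the third pixel combines to (30 + 0.5·33)/1.5 = 31.0, closer to the less noisy exposure, which is exactly the depth gain co-addition is after.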
Management of the science ground segment for the Euclid mission
Euclid is an ESA mission aimed at understanding the nature of dark energy and dark matter by simultaneously using two probes (weak lensing and baryon acoustic oscillations). The mission will observe galaxies and clusters of galaxies out to z~2, in a wide extra-galactic survey covering 15,000 deg², plus a deep survey covering an area of 40 deg². The payload is composed of two instruments, an imager in the visible domain (VIS) and an imager-spectrometer (NISP) covering the near-infrared. The launch is planned in Q4 of 2020. The elements of the Euclid Science Ground Segment (SGS) are the Science Operations Centre (SOC), operated by ESA, and nine Science Data Centres (SDCs) in charge of data processing, provided by the Euclid Consortium (EC), formed by over 110 institutes spread over 15 countries. The SOC and the EC started a tight collaboration several years ago in order to design and develop a single, cost-efficient and truly integrated SGS. The distributed nature, the size of the data set, and the required accuracy of the results are the main challenges expected in the design and implementation of the SGS. In particular, the huge volume of data (not only Euclid data but also ground-based data) to be processed in the SDCs will require distributed storage to avoid data migration across SDCs. This paper describes the management challenges that the Euclid SGS is facing while dealing with such complexity. The main aspect is related to the organisation of a geographically distributed software development team. In principle, algorithms and code are developed in a large number of institutes, while data are actually processed at fewer centres (the national SDCs) where the operational computational infrastructures are maintained. The software produced for data handling, processing and analysis is built within a common development environment defined by the SGS System Team, common to the SOC and the EC SGS, which has already been active for several years.
The code is built incrementally through different levels of maturity, going from prototypes (developed mainly by scientists) to production code (engineered and tested at the SDCs). A number of incremental challenges (infrastructure, data processing and integrated) have been included in the Euclid SGS test plan to verify the correctness and accuracy of the developed systems.