7 research outputs found

    CosmoHub: Interactive exploration and distribution of astronomical data on Hadoop

    We present CosmoHub (https://cosmohub.pic.es), a web application based on Hadoop to perform interactive exploration and distribution of massive cosmological datasets. Modern cosmology seeks to unveil the nature of both dark matter and dark energy by mapping the large-scale structure of the Universe through the analysis of massive amounts of astronomical data, which have grown steadily over recent decades, and will keep growing, with the digitization and automation of experimental techniques. CosmoHub, hosted and developed at the Port d'Informació Científica (PIC), provides support to a worldwide community of scientists, without requiring the end user to know any Structured Query Language (SQL). It serves data from several large international collaborations such as the Euclid space mission, the Dark Energy Survey (DES), the Physics of the Accelerating Universe Survey (PAUS) and the Marenostrum Institut de Ciències de l'Espai (MICE) numerical simulations. CosmoHub was originally developed as a web frontend to a PostgreSQL relational database; this work describes the current version, built on top of Apache Hive, which facilitates scalable reading, writing and managing of huge datasets. As CosmoHub's datasets are seldom modified, Hive is a better fit. Over 60 TiB of catalogued information and 50 × 10^9 astronomical objects can be interactively explored using an integrated visualization tool which includes 1D histogram and 2D heatmap plots. In our current implementation, online exploration of datasets of 10^9 objects can be done on a timescale of tens of seconds. Users can also download customized subsets of data in standard formats, generated in a few minutes. CosmoHub has been partially funded through projects of the Spanish national program “Programa Estatal de I + D + i” of the Spanish government. The support of the ERDF fund is gratefully acknowledged.


    ECSGS Management Plan

    Version 0.9 was reviewed by ESA at the Euclid SGS Preliminary Requirements Review (2013). Version 1.9 was reviewed by ESA at the Euclid SGS System Requirements Review (2015). The ECSGS Management Plan is focused on the following topics: ECSGS organisation, responsibilities, reporting; ECSGS costing, manpower, effort tracking; ECSGS logistics (when relevant); organisation of individual OUs and SDCs under ECSGS coordination. Sections 9 and 10 contain global and local organisation details, and the names of responsible staff. The management principles expressed in this document are a coherent extension of those described in the ECSGS Science Implementation Plan. The document is compliant with the ECSS standards, as tailored for the Euclid SGS.

    The PAU Survey: background light estimation with deep learning techniques

    In any imaging survey, accurately measuring the astronomical background light is crucial to obtain good photometry. This paper introduces BKGNET, a deep neural network to predict the background and its associated error. BKGNET has been developed for data from the Physics of the Accelerating Universe Survey (PAUS), an imaging survey using a 40 narrow-band filter camera (PAUCam). The images obtained with PAUCam are affected by scattered light: an optical effect in which multiply reflected light deposits energy in specific detector regions, affecting the science measurements. Fortunately, scattered light is not a random effect; it can be predicted and corrected for. We have found that BKGNET background predictions are very robust to distorting effects, while still being statistically accurate. On average, the use of BKGNET improves the photometric flux measurements by 7 per cent, and by up to 20 per cent at the bright end. BKGNET also removes a systematic trend in the background error estimation with magnitude in the i band that is present with the current PAU data management method. With BKGNET, we reduce the photometric redshift outlier rate by 35 per cent for the best 20 per cent of galaxies selected with a photometric quality parameter. Funding for PAUS has been provided by Durham University (via the ERC StG DEGAS-259586), ETH Zurich, Leiden University (via ERC StG ADULT-279396 and Netherlands Organisation for Scientific Research (NWO) Vici grant 639.043.512) and University College London. The PAUS participants from Spanish institutions are partially supported by MINECO under grants CSD2007-00060, AYA2015-71825, ESP2015-88861, FPA2015-68048, SEV-2016-0588, SEV-2016-0597, and MDM-2015-0509, some of which include ERDF funds from the European Union. IEEC and IFAE are partially funded by the CERCA program of the Generalitat de Catalunya.
    The PAU data center is hosted by the Port d’Informació Científica (PIC), maintained through a collaboration of CIEMAT and IFAE, with additional support from Universitat Autònoma de Barcelona and ERDF. CosmoHub has been developed by PIC and was partially funded by the ‘Plan Estatal de Investigación Científica y Técnica y de Innovación’ program of the Spanish government. We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan V GPU used for this research. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 776247. AA is supported by a Royal Society Wolfson Fellowship. MS has been supported by the National Science Centre (grant UMO2016/23/N/ST9/02963). Peer reviewed.