Data Deluge in Astrophysics: Photometric Redshifts as a Template Use Case
Astronomy has entered the big data era and Machine Learning based methods
have found widespread use in a large variety of astronomical applications. This
is demonstrated by the recent huge increase in the number of publications
making use of this new approach. The usage of machine learning methods, however,
is still far from trivial and many problems still need to be solved. Using the
evaluation of photometric redshifts as a case study, we outline the main
problems and some ongoing efforts to solve them.
Comment: 13 pages, 3 figures, Springer's Communications in Computer and
Information Science (CCIS), Vol. 82
Data Driven Discovery in Astrophysics
We review some aspects of the current state of data-intensive astronomy, its
methods, and some outstanding data analysis challenges. Astronomy is at the
forefront of "big data" science, with exponentially growing data volumes and
data rates, and an ever-increasing complexity, now entering the Petascale
regime. Telescopes and observatories from both ground and space, covering a
full range of wavelengths, feed the data via processing pipelines into
dedicated archives, where they can be accessed for scientific analysis. Most of
the large archives are connected through the Virtual Observatory framework,
that provides interoperability standards and services, and effectively
constitutes a global data grid of astronomy. Making discoveries in this
overabundance of data requires applications of novel, machine learning tools.
We describe some of the recent examples of such applications.
Comment: Keynote talk in the proceedings of ESA-ESRIN Conference: Big Data
from Space 2014, Frascati, Italy, November 12-14, 2014, 8 pages, 2 figures
Photometric redshifts for Quasars in multi band Surveys
MLPQNA stands for Multi Layer Perceptron with Quasi Newton Algorithm and it
is a machine learning method which can be used to cope with regression and
classification problems on complex and massive data sets. In this paper we give
the formal description of the method and present the results of its application
to the evaluation of photometric redshifts for quasars. The data set used for
the experiment was obtained by merging four different surveys (SDSS, GALEX,
UKIDSS and WISE), thus covering a wide range of wavelengths from the UV to the
mid-infrared. The method is able i) to achieve a very high accuracy; ii) to
drastically reduce the number of outliers and catastrophic objects; iii) to
discriminate among parameters (or features) on the basis of their significance,
so that the number of features used for training and analysis can be optimized
in order to reduce both the computational demands and the effects of
degeneracy. The best experiment, which makes use of a selected combination of
parameters drawn from the four surveys, leads, in terms of DeltaZnorm (i.e.
(zspec-zphot)/(1+zspec)), to an average of DeltaZnorm = 0.004, a standard
deviation sigma = 0.069 and a Median Absolute Deviation MAD = 0.02 over the
whole redshift range (i.e. zspec <= 3.6), defined by the 4-survey cross-matched
spectroscopic sample. The fraction of catastrophic outliers, i.e. of objects
with photo-z deviating more than 2sigma from the spectroscopic value, is < 3%,
leading to a sigma = 0.035 after their removal, over the same redshift range.
The method is made available to the community through the DAMEWARE web
application.
Comment: 38 pages, Submitted to ApJ in February 2013; Accepted by ApJ in May
201
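The statistics quoted in this abstract can be illustrated with a minimal sketch. It is not the authors' code. The function name `photoz_metrics` and the `clip` parameter are hypothetical, the 2-sigma cut is assumed to apply to the normalised residual DeltaZnorm, and MAD is computed with the standard definition (median of absolute deviations from the median), which may differ from the paper's convention.

```python
import statistics


def photoz_metrics(z_spec, z_phot, clip=2.0):
    """Summarise photo-z quality from paired spectroscopic and photometric redshifts.

    Hypothetical helper: names and conventions are assumptions, not the paper's code.
    Raises statistics.StatisticsError if the input is empty.
    """
    # Normalised residual, as defined in the abstract:
    # DeltaZnorm = (zspec - zphot) / (1 + zspec)
    dz = [(zs - zp) / (1.0 + zs) for zs, zp in zip(z_spec, z_phot)]

    bias = statistics.fmean(dz)    # average of DeltaZnorm
    sigma = statistics.pstdev(dz)  # population standard deviation

    # Assumed convention: MAD = median(|dz - median(dz)|)
    centre = statistics.median(dz)
    mad = statistics.median([abs(d - centre) for d in dz])

    # Assumed convention: an object is a catastrophic outlier if |dz| > clip * sigma
    kept = [d for d in dz if abs(d) <= clip * sigma]
    outlier_fraction = 1.0 - len(kept) / len(dz)

    # Standard deviation recomputed after removing the outliers
    sigma_clean = statistics.pstdev(kept) if kept else float("nan")

    return {
        "bias": bias,
        "sigma": sigma,
        "mad": mad,
        "outlier_fraction": outlier_fraction,
        "sigma_clean": sigma_clean,
    }


# Toy data (not from the paper): nine perfect estimates and one badly wrong one.
metrics = photoz_metrics([1.0] * 10, [1.0] * 9 + [3.0])
```

In this toy case the single bad estimate dominates both the bias and the standard deviation, while the MAD and the post-clipping sigma stay at zero. That is why the abstract reports all four quantities rather than sigma alone.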
Star Formation Rates for photometric samples of galaxies using machine learning methods
Star Formation Rates or SFRs are crucial to constrain theories of galaxy
formation and evolution. SFRs are usually estimated via spectroscopic
observations requiring large amounts of telescope time. We explore an
alternative approach based on the photometric estimation of global SFRs for
large samples of galaxies, by using methods such as automatic parameter space
optimisation, and supervised Machine Learning models. We demonstrate that, with
such an approach, accurate multi-band photometry allows reliable SFRs to be
estimated. We also investigate how the use of photometric rather than
spectroscopic redshifts affects the accuracy of the derived global SFRs. Finally,
we provide a publicly available catalogue of SFRs for more than 27 million
galaxies extracted from the Sloan Digital Sky Survey Data Release 7. The
catalogue is available through the VizieR facility at the following link:
ftp://cdsarc.u-strasbg.fr/pub/cats/J/MNRAS/486/1377
AIDA, a Modular Web Application for Astronomical Data Analysis and Instrument Monitoring Services
In the last decade, Astronomy has been the scene of the realization of panchromatic surveys, with sophisticated instruments acquiring a huge quantity of exceptional quality data. This creates the need to integrate advanced data-driven science methodologies for the automatic exploration of huge data archives, and the need for efficient short- and long-term monitoring and diagnostics systems. The goal is to keep the quality of the observations under control and to detect and circumscribe anomalies and malfunctions, facilitating rapid and effective corrections and ensuring correct maintenance of all components and the good health of scientific data over time. This requirement is particularly crucial for space-borne observation systems, in both logistical and economic terms. AIDA (Advanced Infrastructure for Data Analysis) is a portable and modular web application, designed to provide an efficient and intuitive software infrastructure to support monitoring of data acquisition systems over time, diagnostics, and both scientific and engineering data quality analysis, and is particularly suited to astronomical instruments. Given its modular design, its functionality can be extended by integrating and customizing monitoring and diagnostics systems, as well as scientific data analysis solutions, including machine/deep learning and data mining techniques and methods. A specialized version of AIDA has recently been designated as the focal plane instrument operation diagnostics, analytics and monitoring service within the Science Ground Segment of the Euclid space mission.
Astrophysics in S.Co.P.E.
S.Co.P.E. is one of the four projects funded by the Italian Government in
order to provide Southern Italy with a distributed computing infrastructure for
fundamental science. Besides being aimed at building the infrastructure,
S.Co.P.E. is also actively pursuing research in several areas, among which are
astrophysics and observational cosmology. We briefly summarize the most
significant results obtained in the first two years of the project and related
to the development of middleware and Data Mining tools for the Virtual
Observatory.
SDSS-DR9 photometric redshifts
Accurate photometric redshifts for large samples of galaxies are among the main products of modern multiband digital surveys. Over the last decade, the Sloan Digital Sky Survey (SDSS) has become a sort of benchmark against which to test the various methods. We present an application of a new method to the estimation of photometric redshifts for the galaxies in the SDSS Data Release 9 (SDSS-DR9). Photometric redshifts for more than 143 million galaxies were produced. The MLPQNA (Multi Layer Perceptron with Quasi Newton Algorithm) model provided within the framework of the DAMEWARE (DAta Mining and Exploration Web Application REsource) is an interpolative method derived from machine learning models. The obtained redshifts have an overall uncertainty of sigma=0.023 with a very small average bias of about 3x10^-5, and a fraction of catastrophic outliers of about 5%. This result is slightly better than what was already available in the literature, particularly in terms of the smaller fraction of catastrophic outliers.
A catalogue of photometric redshifts for the SDSS-DR9 galaxies
Accurate photometric redshifts for large samples of galaxies are among the main products of modern multiband digital surveys. Over the last decade, the Sloan Digital Sky Survey (SDSS) has become a sort of benchmark against which to test the various methods. We present an application of a new method to the estimation of photometric redshifts for the galaxies in the SDSS Data Release 9 (SDSS-DR9). Photometric redshifts for more than 143 million galaxies were produced and made available at the URL: http://dame.dsf.unina.it/catalog/DR9PHOTOZ/.
The MLPQNA (Multi Layer Perceptron with Quasi Newton Algorithm) model provided within the framework of the DAMEWARE (DAta Mining and Exploration Web Application REsource) is an interpolative method derived from machine learning models.
The obtained redshifts have an overall uncertainty of sigma=0.023 with a very small average bias of about 3x10^-5, and a fraction of catastrophic outliers of about 5%.
This result is slightly better than what was already available in the literature, also in terms of the smaller fraction of catastrophic outliers.