Photometric redshifts for Quasars in multi band Surveys
MLPQNA stands for Multi Layer Perceptron with Quasi Newton Algorithm and it
is a machine learning method which can be used to cope with regression and
classification problems on complex and massive data sets. In this paper we give
the formal description of the method and present the results of its application
to the evaluation of photometric redshifts for quasars. The data set used for
the experiment was obtained by merging four different surveys (SDSS, GALEX,
UKIDSS and WISE), thus covering a wide range of wavelengths from the UV to the
mid-infrared. The method is able i) to achieve a very high accuracy; ii) to
drastically reduce the number of outliers and catastrophic objects; iii) to
discriminate among parameters (or features) on the basis of their significance,
so that the number of features used for training and analysis can be optimized
in order to reduce both the computational demands and the effects of
degeneracy. The best experiment, which makes use of a selected combination of
parameters drawn from the four surveys, leads, in terms of DeltaZnorm (i.e.
(zspec-zphot)/(1+zspec)), to an average of DeltaZnorm = 0.004, a standard
deviation sigma = 0.069 and a Median Absolute Deviation MAD = 0.02 over the
whole redshift range (i.e. zspec <= 3.6), defined by the 4-survey cross-matched
spectroscopic sample. The fraction of catastrophic outliers, i.e. of objects
with photo-z deviating by more than 2sigma from the spectroscopic value, is < 3%,
leading to a sigma = 0.035 after their removal, over the same redshift range.
The method is made available to the community through the DAMEWARE web
application.
Comment: 38 pages, Submitted to ApJ in February 2013; Accepted by ApJ in May
201
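The statistics quoted in the abstract above can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `photoz_stats`, the `clip` parameter, the choice of MAD as median(|DeltaZnorm|) and the outlier rule |DeltaZnorm| > clip * sigma are all assumptions made here for the example.

```python
import statistics


def photoz_stats(z_spec, z_phot, clip=2.0):
    """Summarize normalized residuals dz = (zspec - zphot) / (1 + zspec)."""
    dz = [(zs - zp) / (1.0 + zs) for zs, zp in zip(z_spec, z_phot)]
    mean = statistics.fmean(dz)
    sigma = statistics.pstdev(dz)
    # Assumption: MAD is median(|dz|); some authors subtract the median first.
    mad = statistics.median([abs(d) for d in dz])
    # Assumption: an outlier is any object with |dz| > clip * sigma.
    kept = [d for d in dz if abs(d) <= clip * sigma]
    outlier_fraction = 1.0 - len(kept) / len(dz)
    # Standard deviation recomputed after removing the outliers.
    sigma_clean = statistics.pstdev(kept) if kept else float("nan")
    return {
        "mean": mean,
        "sigma": sigma,
        "mad": mad,
        "outlier_fraction": outlier_fraction,
        "sigma_clean": sigma_clean,
    }


if __name__ == "__main__":
    # Toy values, invented purely for illustration.
    z_spec = [0.4, 1.1, 1.8, 2.5, 3.2]
    z_phot = [0.42, 1.05, 1.9, 2.45, 1.0]
    print(photoz_stats(z_spec, z_phot))
```

Other conventions for the outlier threshold exist (for example a fixed cut on |DeltaZnorm|), so the numbers such a sketch produces are comparable with published values only when the same definitions are used.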
Astrophysics in S.Co.P.E.
S.Co.P.E. is one of the four projects funded by the Italian Government in
order to provide Southern Italy with a distributed computing infrastructure for
fundamental science. Besides building the infrastructure,
S.Co.P.E. is also actively pursuing research in several areas, among them
astrophysics and observational cosmology. We briefly summarize the most
significant results, obtained in the first two years of the project, related
to the development of middleware and Data Mining tools for the Virtual
Observatory.
Catalog of quasars from the Kilo-Degree Survey Data Release 3
We present a catalog of quasars selected from broad-band photometric ugri
data of the Kilo-Degree Survey Data Release 3 (KiDS DR3). The QSOs are
identified by the random forest (RF) supervised machine learning model, trained
on SDSS DR14 spectroscopic data. We first removed from the input KiDS data all
entries with excessively noisy, missing or otherwise problematic measurements.
Applying a feature importance analysis, we then tune the algorithm and identify
in the KiDS multiband catalog the 17 most useful features for the
classification, namely magnitudes, colors, magnitude ratios, and the stellarity
index. We used the t-SNE algorithm to map the multi-dimensional photometric
data onto 2D planes and compare the coverage of the training and inference
sets. We limited the inference set to r<22 to avoid extrapolation beyond the
feature space covered by training, as the SDSS spectroscopic sample is
considerably shallower than KiDS. This gives 3.4 million objects in the final
inference sample, from which the random forest identified 190,000 quasar
candidates. Accuracy of 97%, purity of 91%, and completeness of 87%, as derived
from a test set extracted from SDSS and not used in the training, are confirmed
by comparison with external spectroscopic and photometric QSO catalogs
overlapping with the KiDS footprint. The robustness of our results is
strengthened by number counts of the quasar candidates in the r band, as well
as by their mid-infrared colors available from WISE. An analysis of parallaxes
and proper motions of our QSO candidates found also in Gaia DR2 suggests that a
probability cut of p(QSO)>0.8 is optimal for purity, whereas p(QSO)>0.7 is
preferable for better completeness. Our study presents the first comprehensive
quasar selection from deep high-quality KiDS data and will serve as the basis
for versatile studies of the QSO population detected by this survey.
Comment: Data available from the KiDS website at
http://kids.strw.leidenuniv.nl/DR3/quasarcatalog.php and the source code from
https://github.com/snakoneczny/kids-quasar
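The trade-off between the two probability cuts mentioned in the abstract above can be illustrated with a small sketch. It assumes a labelled test set with known classes, which is a simplification: the abstract bases its recommendation on Gaia DR2 parallaxes and proper motions. The function name `purity_completeness` and the toy numbers are hypothetical.

```python
def purity_completeness(p_qso, is_qso, threshold):
    """Purity and completeness of the selection p(QSO) > threshold.

    p_qso  : classifier probabilities, one per object
    is_qso : true labels (True for a real quasar), same order as p_qso
    """
    selected = [truth for p, truth in zip(p_qso, is_qso) if p > threshold]
    true_positives = sum(selected)
    n_true = sum(is_qso)
    # Purity: fraction of selected objects that are real quasars.
    purity = true_positives / len(selected) if selected else float("nan")
    # Completeness: fraction of real quasars that were selected.
    completeness = true_positives / n_true if n_true else float("nan")
    return purity, completeness


if __name__ == "__main__":
    # Toy probabilities and labels, invented purely for illustration.
    p_qso = [0.95, 0.85, 0.75, 0.72, 0.40, 0.20]
    is_qso = [True, True, True, False, True, False]
    for cut in (0.7, 0.8):
        print(cut, purity_completeness(p_qso, is_qso, cut))
```

Raising the threshold can only shrink the selected sample, so completeness never increases with a stricter cut, while purity usually improves.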
Steps towards a map of the nearby universe
We present a new analysis of the Sloan Digital Sky Survey data aimed at
producing a detailed map of the nearby (z < 0.5) universe. Using neural
networks trained on the available spectroscopic base of knowledge we derived
distance estimates for about 30 million galaxies distributed over ca. 8,000 sq.
deg. We also used unsupervised clustering tools developed in the framework of
the VO-Tech project to investigate the possibility of understanding the nature of
each object present in the field and, in particular, to produce a list of
candidate AGNs and QSOs.
Comment: 3 pages, 1 figure. To appear in Nucl Phys. B, in the proceedings of
the NOW-2006 (Neutrino Oscillation Workshop - 2006), R. Fogli et al. ed
AGN automatic photometric classification
In this paper, we discuss an application of machine-learning-based methods to the identification of candidate active galactic nuclei (AGNs) from optical survey data and to the automatic classification of AGNs in broad classes. We applied four different machine-learning algorithms, namely the Multi Layer Perceptron, trained, respectively, with the Conjugate Gradient, the Scaled Conjugate Gradient and the Quasi Newton learning rules, and the Support Vector Machines, to tackle the problem of the classification of emission line galaxies in different classes, mainly AGNs versus non-AGNs, obtained using optical photometry in place of the diagnostics based on line intensity ratios which are classically used in the literature. Using the same photometric features, we also discuss the behaviour of the classifiers on finer AGN classification tasks, namely Seyfert I versus Seyfert II, and Seyfert versus LINER. Furthermore, we describe the algorithms employed, the samples of spectroscopically classified galaxies used to train the algorithms, the procedure followed to select the photometric parameters and the performance of our methods in terms of multiple statistical indicators. The results of the experiments show that the application of self-adaptive data mining algorithms trained on spectroscopic data sets and applied to carefully chosen photometric parameters represents a viable alternative to the classical methods that employ time-consuming spectroscopic observations.
Statistical analysis of probability density functions for photometric redshifts through the KiDS-ESO-DR3 galaxies
Despite the high accuracy of photometric redshifts (zphot) derived using
Machine Learning (ML) methods, the quantification of errors through reliable
and accurate Probability Density Functions (PDFs) is still an open problem.
First, because it is difficult to accurately assess the contribution from
different sources of errors, namely internal to the method itself and from the
photometric features defining the available parameter space. Second, because
the problem of defining a robust statistical method, always able to quantify
and qualify the PDF estimation validity, is still an open issue. We present a
comparison among PDFs obtained using three different methods on the same data
set: two ML techniques, METAPHOR (Machine-learning Estimation Tool for Accurate
PHOtometric Redshifts) and ANNz2, plus the spectral energy distribution
template fitting method, BPZ. The photometric data were extracted from the KiDS
(Kilo Degree Survey) ESO Data Release 3, while the spectroscopy was obtained
from the GAMA (Galaxy and Mass Assembly) Data Release 2. The statistical
evaluation of both individual and stacked PDFs was done through quantitative
and qualitative estimators, including a dummy PDF, useful to verify whether
different statistical estimators can correctly assess PDF quality. We conclude
that, in order to quantify the reliability and accuracy of any zphot PDF
method, a combined set of statistical estimators is required.
Comment: Accepted for publication by MNRAS, 20 pages, 14 figures
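The abstract above does not name its statistical estimators. As one widely used example of a quantitative check on an individual PDF, the sketch below computes the probability integral transform (PIT), i.e. the cumulative probability of a gridded PDF at the spectroscopic redshift. It is an illustration only: the function name `pit_value` and the gridded representation of the PDF are assumptions, and nothing here is taken from METAPHOR, ANNz2 or BPZ.

```python
def pit_value(z_grid, pdf, z_spec):
    """Cumulative probability of a gridded PDF at z_spec (trapezoidal rule).

    z_grid : strictly increasing redshift grid
    pdf    : non-negative PDF values on z_grid (need not be normalized,
             but must not integrate to zero)
    """
    total = 0.0
    below = 0.0
    for i in range(len(z_grid) - 1):
        z0, z1 = z_grid[i], z_grid[i + 1]
        p0, p1 = pdf[i], pdf[i + 1]
        area = 0.5 * (p0 + p1) * (z1 - z0)
        total += area
        if z_spec >= z1:
            # Whole bin lies below the spectroscopic redshift.
            below += area
        elif z_spec > z0:
            # Partial bin: interpolate the PDF linearly at z_spec.
            p_s = p0 + (p1 - p0) * (z_spec - z0) / (z1 - z0)
            below += 0.5 * (p0 + p_s) * (z_spec - z0)
    return below / total


if __name__ == "__main__":
    # Toy flat PDF on [0, 1], invented purely for illustration.
    print(pit_value([0.0, 0.5, 1.0], [1.0, 1.0, 1.0], 0.25))
```

For well-calibrated PDFs the PIT values collected over a whole sample are expected to be uniformly distributed between 0 and 1, which is why a histogram of them is often inspected alongside other estimators.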