39 research outputs found

Estimating Spectroscopic Redshifts by Using k Nearest Neighbors Regression I. Description of Method and Analysis

Author: Hoecker M.
Kügler S. D.
Polsterer K.
Publication venue: 'EDP Sciences'
Publication date: 06/03/2015

Context: In astronomy, new approaches to processing and analyzing the exponentially increasing amount of data are indispensable. While classical approaches (e.g. template fitting) work well for objects of well-known classes, alternative techniques have to be developed to identify those that do not fit. A classification scheme should therefore be based on individual properties rather than on a fit to a global model, which loses valuable information. An important issue when dealing with large data sets is outlier detection, which at the moment is often handled in a problem-specific way. Aims: In this paper we present a method to statistically estimate the redshift z based on a similarity approach. This allows us to determine redshifts in spectra in emission as well as in absorption without using any predefined model. Additionally, we show how the redshift can be estimated from single features. As a consequence we are, for example, able to filter objects which show multiple redshift components. We propose to apply this general method to all similar problems in order to identify objects where traditional approaches fail. Methods: The redshift estimation is performed by comparing predefined regions in the spectra and applying a k nearest neighbor regression model to every predefined emission and absorption region individually. Results: We estimated a redshift for more than 50% of the 16,000 analyzed spectra of our reference and test sample. For every individually tested feature, the redshift estimate yields a precision comparable with the overall precision of the SDSS redshifts. In 14 spectra we find a significant shift between emission and absorption lines or between different emission lines. The results already show the immense power of this simple machine learning approach for investigating huge databases such as the SDSS. Comment: accepted for publication in A&
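
The k nearest neighbor regression step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `knn_regress`, the Euclidean distance, the unweighted mean and the toy data are all assumptions made for illustration.

```python
import math

def knn_regress(train_features, train_redshifts, query, k=3):
    """Predict a redshift as the mean redshift of the k most similar training spectra."""
    if k < 1 or k > len(train_features):
        raise ValueError("k must be between 1 and the number of training samples")
    # Euclidean distance between the query region and every training region.
    distances = [
        (math.dist(features, query), z)
        for features, z in zip(train_features, train_redshifts)
    ]
    distances.sort(key=lambda pair: pair[0])
    nearest = distances[:k]
    return sum(z for _, z in nearest) / k

# Toy data (hypothetical): each row stands in for flux values in one predefined region.
train_features = [[0.0, 1.0], [0.1, 0.9], [1.0, 0.0], [0.9, 0.1]]
train_redshifts = [0.10, 0.12, 0.50, 0.54]

estimate = knn_regress(train_features, train_redshifts, [0.05, 0.95], k=2)
print(round(estimate, 3))
```

Running one such model per emission or absorption region, as the abstract describes, gives one estimate per feature; a large disagreement between those estimates would flag a candidate with multiple redshift components.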

arXiv.org e-Print Archive

CiteSeerX

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

Radio Galaxy Zoo: Knowledge Transfer Using Rotationally Invariant Self-Organising Maps

Author: Alger M. J.
Galvin T. J.
Hopkins E.
Huynh M.
Norris R. P.
Polsterer K. L.
Rudnick L.
Shabala S.
Wang X. R.
Wong O. I.
Publication venue: 'IOP Publishing'
Publication date: 01/01/2019

With the advent of large-scale surveys, the manual analysis and classification of individual radio source morphologies is rendered impossible, as existing approaches do not scale. The analysis of complex morphological features in the spatial domain is a particularly important task. Here we discuss the challenges of transferring crowdsourced labels obtained from the Radio Galaxy Zoo project and introduce a proper transfer mechanism via quantile random forest regression. By using parallelized rotation and flipping invariant Kohonen-maps, image cubes of Radio Galaxy Zoo selected galaxies formed from the FIRST radio continuum and WISE infrared all-sky surveys are first projected down to a two-dimensional embedding in an unsupervised way. This embedding can be seen as a discretised space of shapes with the coordinates reflecting morphological features as expressed by the automatically derived prototypes. We find that these prototypes have reconstructed physically meaningful processes across two-channel images at radio and infrared wavelengths in an unsupervised manner. In the second step, images are compared with those prototypes to create a heat-map, which is the morphological fingerprint of each object and the basis for transferring the user-generated labels. These heat-maps have reduced the feature space by a factor of 248 and can be used as the basis for subsequent ML methods. Using an ensemble of decision trees, we achieve upwards of 85.7% and 80.7% accuracy when predicting the number of components and peaks in an image, respectively, using these heat-maps. We also question the currently used discrete classification schema and introduce a continuous scale that better reflects the uncertainty in the transition between two classes, caused by sensitivity and resolution limits.
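
The heat-map step, in which an image is compared with every prototype, can be sketched as follows. This is an illustration under stated assumptions, not the code used in the paper: it handles only the 8 orientations reachable by 90-degree rotations and mirroring, it uses a flat list of prototypes instead of a two-dimensional map, and all function names and the 2x2 toy images are hypothetical.

```python
import math

def rotate90(image):
    """Rotate a square image (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def flip(image):
    """Mirror an image left to right."""
    return [row[::-1] for row in image]

def orientations(image):
    """Yield the 8 rotated and flipped copies of a square image."""
    current = [list(row) for row in image]
    for _ in range(4):
        yield current
        yield flip(current)
        current = rotate90(current)

def distance(a, b):
    """Euclidean distance between two images of the same shape."""
    return math.sqrt(sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)))

def heat_map(image, prototypes):
    """Smallest distance from the image to each prototype over all orientations."""
    return [min(distance(view, proto) for view in orientations(image)) for proto in prototypes]

# Hypothetical prototypes: a single bright corner and a bright edge.
prototypes = [
    [[1, 0], [0, 0]],
    [[1, 1], [0, 0]],
]
image = [[0, 0], [0, 1]]  # the corner prototype, rotated by 180 degrees
fingerprint = heat_map(image, prototypes)
print(fingerprint)
```

Because the minimum is taken over orientations, the rotated corner matches the corner prototype exactly, so the fingerprint does not depend on how the source happens to be oriented on the sky.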

arXiv.org e-Print Archive

University of Tasmania Open Access Repository

Western Sydney ResearchDirect

Bigger Buffer k-d Trees on Multi-Many-Core Systems

Author: AA Mahabal
J Bentley
J Friedman
K Polsterer
M Blum
M Scarpino
N Nakasato
PN Tan
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2019

A buffer k-d tree is a k-d tree variant for massively parallel nearest neighbor search. While providing valuable speed-ups on modern many-core devices when both a large number of reference points and a large number of query points are given, buffer k-d trees are limited by the number of points that can fit on a single device. In this work, we show how to modify the original data structure and the associated workflow to make the overall approach capable of dealing with massive data sets. We further provide a simple yet efficient way of using multiple devices available in a single workstation. The applicability of the modified framework is demonstrated in the context of astronomy, a field that is faced with huge amounts of data.
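
The idea of searching a reference set that does not fit into device memory can be illustrated by processing it in chunks and merging the partial results. The sketch below is an assumption-laden stand-in, not the buffer k-d tree itself: each chunk is scanned by brute force rather than with a tree, everything runs on the CPU, and `knn_in_chunks` is a hypothetical name.

```python
import heapq
import math

def knn_in_chunks(reference, queries, k, chunk_size):
    """Exact k-nearest-neighbor search that only looks at one reference chunk at a time."""
    # best[i] holds the k smallest (distance, reference_index) pairs seen so far for query i.
    best = [[] for _ in queries]
    for start in range(0, len(reference), chunk_size):
        chunk = reference[start:start + chunk_size]
        for qi, q in enumerate(queries):
            candidates = best[qi] + [
                (math.dist(q, p), start + offset) for offset, p in enumerate(chunk)
            ]
            # Merge the running result with this chunk and keep only the k closest.
            best[qi] = heapq.nsmallest(k, candidates)
    return best

# Toy data (hypothetical): five reference points, one query, chunks of two points.
reference = [[0.0, 0.0], [2.0, 0.0], [5.0, 5.0], [0.0, 1.0], [9.0, 9.0]]
queries = [[0.1, 0.1]]
chunked = knn_in_chunks(reference, queries, k=2, chunk_size=2)
neighbor_ids = [index for _, index in chunked[0]]
print(neighbor_ids)
```

Since only the k best candidates per query are carried from one chunk to the next, the result is the same as a search over the full reference set, while the memory needed at any moment is bounded by the chunk size.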

Crossref

Copenhagen University Research Information System

Caltech Authors

Radboud Repository (Radboud Univ.)

Cataloging the radio-sky with unsupervised machine learning: a new approach for the SKA era

Author: Galvin T. J.
Heald G. H.
Hopkins E.
Huynh M.
Norris R. P.
O'Brien A. N.
Polsterer K.
Ralph N. O.
Wang X. R.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2020

We develop a new analysis approach towards identifying related radio components and their corresponding infrared host galaxy based on unsupervised machine learning methods. By exploiting PINK, a self-organising map algorithm, we are able to associate radio and infrared sources without the a priori requirement of training labels. We present an example of this method using 894,415 images from the FIRST and WISE surveys centred towards positions described by the FIRST catalogue. We produce a set of catalogues that complement FIRST and describe 802,646 objects, including their radio components and their corresponding AllWISE infrared host galaxy. Using these data products we (i) demonstrate the ability to identify objects with rare and unique radio morphologies (e.g. 'X'-shaped galaxies, hybrid FR-I/FR-II morphologies), (ii) identify the potentially resolved radio components that are associated with a single infrared host, (iii) introduce a "curliness" statistic to search for bent and disturbed radio morphologies, and (iv) extract a set of 17 giant radio galaxies between 700-1100 kpc. As we require no training labels, our method can be applied to any radio-continuum survey, provided a sufficiently representative SOM can be trained.

arXiv.org e-Print Archive

Western Sydney ResearchDirect

A Comparison of Photometric Redshift Techniques for Large Radio Surveys

Author: Brescia M.
Budavari T.
Carliles S.
Cavuoti S.
Farrah D.
Geach J.
Longo G.
Luken K.
Musaeva A.
Norris Ray P.
Polsterer K.
Riccio G.
Salvato M.
Seymour N.
Smolčić V.
Vaccari M.
Zinn P.
Publication venue: 'Astronomical Society of the Pacific Conference Series'
Publication date: 01/01/2019

Future radio surveys will generate catalogs of tens of millions of radio sources, for which redshift estimates will be essential to achieve many of the science goals. However, spectroscopic data will be available for only a small fraction of these sources, and in most cases even the optical and infrared photometry will be of limited quality. Furthermore, radio sources tend to be at higher redshift than most optical sources (most radio surveys have a median redshift greater than 1), and so a significant fraction of radio source hosts differ from those for which most photometric redshift templates are designed. We therefore need to develop new techniques for estimating the redshifts of radio sources. As a starting point in this process, we evaluate a number of machine-learning techniques for estimating redshift, together with a conventional template-fitting technique. We pay special attention to how the performance is affected by the incompleteness of the training sample, by sparseness of the parameter space, and by limited availability of ancillary multiwavelength data. As expected, we find that the quality of the photometric redshifts degrades as the quality of the photometry decreases, but that even with the limited quality of photometry available for all-sky surveys, useful redshift information is available for the majority of sources, particularly at low redshift. We find that a template-fitting technique performs best in the presence of high-quality and almost complete multi-band photometry, especially if radio sources that are also X-ray emitting are treated separately, using specific templates and priors. When we reduced the quality of photometry to match that available for the EMU all-sky radio survey, the quality of the template-fitting degraded and became comparable to some of the machine-learning methods. Machine learning techniques currently perform better at low redshift than at high redshift, because of incompleteness of the currently available training data at high redshifts.

arXiv.org e-Print Archive

OA@INAF - Istituto Nazionale di Astrofisica

Caltech Authors

Western Sydney ResearchDirect

MPG.PuRe

Unveiling the rarest morphologies of the LOFAR Two-metre Sky Survey radio source population with self-organised maps

Author: Best P. N.
Brienza M.
Bruggen M.
Duncan K. J.
Hardcastle M. J.
Jurlin N.
Mingo B.
Morganti R.
Mostert R. I. J.
Polsterer K. L.
Rottgering H. J. A.
Shimwell T.
Smith D.
Williams W. L.
Publication venue: 'EDP Sciences'
Publication date: 01/01/2021

Context. The Low Frequency Array (LOFAR) Two-metre Sky Survey (LoTSS) is a low-frequency radio continuum survey of the Northern sky at an unparalleled resolution and sensitivity. Aims. In order to fully exploit this huge dataset and those produced by the Square Kilometre Array in the next decade, automated methods in machine learning and data-mining will be increasingly essential both for morphological classifications and for identifying optical counterparts to the radio sources. Methods. Using self-organising maps (SOMs), a form of unsupervised machine learning, we created a dimensionality reduction of the radio morphologies for the ∼25k extended radio continuum sources in the LoTSS first data release, which is only ∼2 percent of the final LoTSS survey. We made use of PINK, a code which extends the SOM algorithm with rotation and flipping invariance, increasing its suitability and effectiveness for training on astronomical sources. Results. After training, the SOMs can be used for a wide range of science exploitation, and we present an illustration of their potential by finding an arbitrary number of morphologically rare sources in our training data (424 square degrees) and subsequently in an area of the sky (∼5300 square degrees) outside the training data. Objects found in this way span a wide range of morphological and physical categories: extended jets of radio active galactic nuclei, diffuse cluster haloes and relics, and nearby spiral galaxies. Finally, to enable accessible, interactive, and intuitive data exploration, we showcase the LOFAR-PyBDSF Visualisation Tool, which allows users to explore the LoTSS dataset through the trained SOMs.

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

A comparison of photometric redshift techniques for large radio surveys

Author: Brescia M.
Budavari T.
Carliles S.
Cavuoti S.
Farrah D.
Geach J.
Longo G.
Luken K.
Musaeva A.
Norris R. P.
Polsterer K.
Riccio G.
Salvato M.
Seymour N.
Smolcic V.
Vaccari M.
Zinn P.
Publication venue: 'IOP Publishing'
Publication date: 01/01/2019

Archivio della ricerca - Università degli studi di Napoli Federico II

Black Hole Mass Estimates Based on CIV are Consistent with Those Based on the Balmer Lines

Author: A. Germeroth
A. Pasquali
A. Quirrenbach
Abazajian
Ageorges
Assef
B. M. Peterson
B. Shappee
Bentz
Bentz
Botti
C. Feiz
C. J. Grier
C. S. Kochanek
C. Storz
Cappellari
D. Stern
D. Zhang
Denney
Dietrich
Dietrich
Dietrich
Dietrich
E. Falco
Fadely
Falco
Falco
Ferrarese
Gallagher
Gebhardt
Granato
Greene
Greene
Gültekin
H. Gemperlein
H. Mandel
Hopkins
Hopkins
Hopkins
J. Van Saders
Jahnke
Jiang
K. D. Denney
K. K. Madsen
K. Mogren
K. Polsterer
Kaspi
Kaspi
Kaspi
Keeton
Kelly
Kochanek
Kochanek
Koopmans
Lehár
Leighly
M. Dietrich
M. Juette
M. Kilic
M. Lehmitz
MacLeod
Marconi
Marconi
McGill
Mediavilla
Merritt
Morgan
Morgan
N. Ageorges
Netzer
Onken
P. Buschkamp
P. Martini
P. Mueller
P. Weiser
Peng
Peng
Peng
Peterson
Peterson
R. Hofmann
R. J. Assef
R. Khan
R. Lederer
R. Lenzen
R. S. Barrows
R. W. Pogge
Richards
Ross
Rusin
S. Kozłowski
S. Mathur
Schlegel
Shankar
Shemmer
Shen
Sulentic
Tonry
Tremaine
U. Mall
V. Knierim
V. Naranjo
Vanden Berk
Vestergaard
Vestergaard
W. Laun
W. Seifert
Wilkes
Wills
Wisotzki
Woo
Yip
Zu
Publication venue: 'IOP Publishing'
Publication date: 01/01/2011

Using a sample of high-redshift lensed quasars from the CASTLES project with observed-frame ultraviolet or optical and near-infrared spectra, we have searched for possible biases between supermassive black hole (BH) mass estimates based on the CIV, Halpha and Hbeta broad emission lines. Our sample is based upon that of Greene, Peng & Ludwig, expanded with new near-IR spectroscopic observations, consistently analyzed high S/N optical spectra, and consistent continuum luminosity estimates at 5100 Å. We find that BH mass estimates based on the FWHM of CIV show a systematic offset with respect to those obtained from the line dispersion, sigma_l, of the same emission line, but not with respect to those obtained from the FWHM of Halpha and Hbeta. The magnitude of the offset depends on the treatment of the HeII and FeII emission blended with CIV, but there is little scatter for any fixed measurement prescription. While we otherwise find no systematic offsets between CIV and Balmer line mass estimates, we do find that the residuals between them are strongly correlated with the ratio of the UV and optical continuum luminosities. Removing this dependency reduces the scatter between the UV- and optical-based BH mass estimates by a factor of approximately 2, from roughly 0.35 to 0.18 dex. The dispersion is smallest when comparing the CIV sigma_l mass estimate, after removing the offset from the FWHM estimates, with either Balmer line mass estimate. The correlation with the continuum slope is likely due to a combination of reddening, host contamination and object-dependent SED shapes. When we add additional heterogeneous measurements from the literature, the results are unchanged. Comment: Accepted for publication in The Astrophysical Journal. 37 text pages + 8 tables + 23 figures. Updated with comments by the referee and with an expanded discussion on literature data including new observation

arXiv.org e-Print Archive

Crossref

MPG.PuRe