Radio Galaxy Zoo: Knowledge Transfer Using Rotationally Invariant Self-Organising Maps
With the advent of large-scale surveys, the manual analysis and classification
of individual radio source morphologies is rendered impossible as existing
approaches do not scale. The analysis of complex morphological features in the
spatial domain is a particularly important task. Here we discuss the challenges
of transferring crowdsourced labels obtained from the Radio Galaxy Zoo project
and introduce a proper transfer mechanism via quantile random forest
regression. By using parallelized rotation and flipping invariant Kohonen-maps,
image cubes of Radio Galaxy Zoo selected galaxies formed from the FIRST radio
continuum and WISE infrared all sky surveys are first projected down to a
two-dimensional embedding in an unsupervised way. This embedding can be seen as
a discretised space of shapes with the coordinates reflecting morphological
features as expressed by the automatically derived prototypes. We find that
these prototypes have reconstructed physically meaningful processes across two
channel images at radio and infrared wavelengths in an unsupervised manner. In
the second step, images are compared with those prototypes to create a
heat-map, which is the morphological fingerprint of each object and the basis
for transferring the user generated labels. These heat-maps reduce the
feature space by a factor of 248 and can serve as the basis for
subsequent ML methods. Using an ensemble of decision trees we achieve upwards
of 85.7% and 80.7% accuracy when predicting the number of components and peaks
in an image, respectively, using these heat-maps. We also question the
currently used discrete classification schema and introduce a continuous scale
that better reflects the uncertainty in transition between two classes, caused
by sensitivity and resolution limits.
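The heat-map step described in this abstract can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the authors' implementation: it assumes Euclidean distance as the similarity measure and only considers 90-degree rotations plus a mirror flip, whereas a rotationally invariant map would normally sample many more angles. The function names and the toy 2x2 images are hypothetical.

```python
import math

def rotate90(img):
    """Rotate an image (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def flip(img):
    """Mirror an image left-to-right."""
    return [row[::-1] for row in img]

def orientations(img):
    """Yield the 8 orientations reachable by 90-degree rotations and a flip."""
    for start in (img, flip(img)):
        current = start
        for _ in range(4):
            yield current
            current = rotate90(current)

def distance(a, b):
    """Euclidean distance between two images of equal shape (an assumption)."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))

def heat_map(img, prototypes):
    """Distance from img to each prototype, minimised over orientations.

    prototypes is a 2-D grid (list of rows) of images; the result has the
    same grid shape and holds one distance per prototype.
    """
    views = list(orientations(img))
    return [[min(distance(view, proto) for view in views) for proto in row]
            for row in prototypes]

# Toy 2x2 "images" and a 1x2 grid of prototypes (illustrative values only).
bar = [[1.0, 1.0], [0.0, 0.0]]   # horizontal bar
dot = [[1.0, 0.0], [0.0, 0.0]]   # single bright pixel
prototypes = [[bar, dot]]

# A rotated bar still matches the bar prototype exactly.
fingerprint = heat_map(rotate90(bar), prototypes)
```

Because the minimum is taken over all orientations, a rotated or mirrored copy of an image yields the same fingerprint, which is the property that lets the heat-map act as an orientation-independent feature vector for a downstream learner.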
Galaxy types in the Sloan Digital Sky Survey using supervised artificial neural networks
Supervised artificial neural networks are used to predict useful properties of galaxies in the Sloan Digital Sky Survey, in this instance morphological classifications, spectral types and redshifts. By giving the trained networks unseen data, it is found that correlations between predicted and actual properties are around 0.9 with rms errors of order ten per cent. Thus, given a representative training set, these properties may be reliably estimated for galaxies in the survey for which there are no spectra and without human intervention.
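The two figures of merit quoted in this abstract, the correlation between predicted and actual values and the rms error, can be computed as in the sketch below. The formulas are the standard Pearson coefficient and root-mean-square difference; whether the paper uses exactly these definitions is an assumption, and the sample values are invented for illustration.

```python
import math

def pearson_r(predicted, actual):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(actual)
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    return cov / math.sqrt(var_p * var_a)

def rms_error(predicted, actual):
    """Root-mean-square difference between predictions and true values."""
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))

# Hypothetical held-out values, e.g. redshifts (not data from the paper).
actual = [0.05, 0.10, 0.15, 0.20, 0.25]
predicted = [0.06, 0.09, 0.16, 0.19, 0.26]

r = pearson_r(predicted, actual)
rms = rms_error(predicted, actual)
```

Both quantities should be evaluated on data the network has not seen during training, as the abstract describes, so that they measure generalisation and not memorisation.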
Data Mining and Machine Learning in Astronomy
We review the current state of data mining and machine learning in astronomy.
'Data Mining' can have a somewhat mixed connotation from the point of view of a
researcher in this field. If used correctly, it can be a powerful approach,
holding the potential to fully exploit the exponentially increasing amount of
available data, promising great scientific advance. However, if misused, it can
be little more than the black-box application of complex computing algorithms
that may give little physical insight, and provide questionable results. Here,
we give an overview of the entire data mining process, from data collection
through to the interpretation of results. We cover common machine learning
algorithms, such as artificial neural networks and support vector machines,
applications from a broad range of astronomy, emphasizing those where data
mining techniques directly resulted in improved science, and important current
and future directions, including probability density functions, parallel
algorithms, petascale computing, and the time domain. We conclude that, so long
as one carefully selects an appropriate algorithm, and is guided by the
astronomical problem at hand, data mining can be very much the powerful tool,
and not the questionable black box.
Comment: Published in IJMPD. 61 pages, uses ws-ijmpd.cls. Several extra
figures, some minor additions to the tex
Morphology Classification and Photometric Redshift Measurement of Galaxies
Based on the Sloan Digital Sky Survey Data Release 5 Galaxy Sample, we
explore photometric morphology classification and redshift estimation of
galaxies using photometric data and known spectroscopic redshifts. An
unsupervised method, the k-means algorithm, is used to separate the whole galaxy
sample into early- and late-type galaxies. Then we investigate the photometric
redshift measurement with different input patterns by means of artificial
neural networks (ANNs) for the total sample and the two subsamples. The
experimental results indicate that ANNs perform better when more
parameters are used in the training set, and that the mixed accuracy of
photometric redshift estimation for the two subsets is superior to that for the
overall sample alone. For the optimal result, the rms deviation of photometric
redshifts for the mixed sample is 0.0192 and that for the overall sample
is 0.0196, while those for early- and late-type galaxies are 0.0164
and 0.0217, respectively.
Comment: The paper contains 8 figures and 2 tables. Accepted by MNRA
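The unsupervised split into two galaxy classes mentioned in this abstract can be illustrated with a small k-means sketch for k = 2. This is a plain Lloyd iteration written without external libraries; the abstract does not say which photometric parameters were clustered, so the two-feature points below are invented, and the deterministic initialisation is a convenience for reproducibility, not the paper's choice.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of feature tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def two_means(points, max_iter=100):
    """Split points into two clusters with Lloyd's k-means iteration (k = 2)."""
    # Deterministic start: the first point, plus the point farthest from it.
    first = points[0]
    farthest = max(points, key=lambda p: squared_distance(p, first))
    centres = [first, farthest]
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centre.
        new_labels = [min((0, 1), key=lambda k: squared_distance(p, centres[k]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so the iteration converged
        labels = new_labels
        # Update step: move each centre to the mean of its members.
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centres[k] = mean_point(members)
    return labels, centres

# Hypothetical two-feature measurements for six galaxies (illustrative only).
points = [(2.6, 3.1), (2.8, 3.0), (2.7, 2.9),
          (1.5, 2.2), (1.4, 2.1), (1.6, 2.3)]

labels, centres = two_means(points)
```

Once the labels are available, a separate redshift estimator can be trained on each subset, which is the setup the abstract compares against a single estimator trained on the whole sample.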