Machine Learning Classification of SDSS Transient Survey Images
We show that multiple machine learning algorithms can match human performance
in classifying transient imaging data from the Sloan Digital Sky Survey (SDSS)
supernova survey into real objects and artefacts. This is a first step in any
transient science pipeline and is currently still done by humans, but future
surveys such as the Large Synoptic Survey Telescope (LSST) will necessitate
fully machine-enabled solutions. Using features trained from eigenimage
analysis (principal component analysis, PCA) of single-epoch g, r and
i-difference images, we can reach a completeness (recall) of 96 per cent, while
only incorrectly classifying at most 18 per cent of artefacts as real objects,
corresponding to a precision (purity) of 84 per cent. In general, random
forests performed best, followed by the k-nearest neighbour and the SkyNet
artificial neural net algorithms, compared to other methods such as naïve
Bayes and kernel support vector machine. Our results show that PCA-based
machine learning can match human success levels and can naturally be extended
by including multiple epochs of data, transient colours and host galaxy
information which should allow for significant further improvements, especially
at low signal-to-noise.
Comment: 14 pages, 8 figures. In this version extremely minor adjustments to
the paper were made, e.g. Figure 5 is now easier to view in greyscale
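The quoted rates can be checked against each other with a short calculation. The sketch below is illustrative only: the function name `purity` and the equal class sizes are assumptions, not values taken from the paper. It shows that a completeness of 96 per cent and an artefact misclassification rate of 18 per cent give a purity of about 84 per cent when the test set holds roughly as many real objects as artefacts.

```python
def purity(completeness, false_positive_rate, n_real, n_artefact):
    """Precision (purity) of the 'real' class from per-class rates and class sizes."""
    true_pos = completeness * n_real             # real objects correctly kept
    false_pos = false_positive_rate * n_artefact  # artefacts mistaken for real objects
    return true_pos / (true_pos + false_pos)

# Rates quoted in the abstract; the equal class sizes (1000 each) are an assumption.
p = purity(completeness=0.96, false_positive_rate=0.18, n_real=1000, n_artefact=1000)
print(f"purity = {p:.2f}")  # prints "purity = 0.84"
```

With a more artefact-heavy sample the same per-class rates would give a lower purity, which is why purity figures are only comparable between samples with similar class balance.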
Data Driven Discovery in Astrophysics
We review some aspects of the current state of data-intensive astronomy, its
methods, and some outstanding data analysis challenges. Astronomy is at the
forefront of "big data" science, with exponentially growing data volumes and
data rates, and an ever-increasing complexity, now entering the Petascale
regime. Telescopes and observatories from both ground and space, covering a
full range of wavelengths, feed the data via processing pipelines into
dedicated archives, where they can be accessed for scientific analysis. Most of
the large archives are connected through the Virtual Observatory framework,
which provides interoperability standards and services, and effectively
constitutes a global data grid of astronomy. Making discoveries in this
overabundance of data requires applications of novel machine learning tools.
We describe some recent examples of such applications.
Comment: Keynote talk in the proceedings of ESA-ESRIN Conference: Big Data
from Space 2014, Frascati, Italy, November 12-14, 2014, 8 pages, 2 figures
How to Find More Supernovae with Less Work: Object Classification Techniques for Difference Imaging
We present the results of applying new object classification techniques to
difference images in the context of the Nearby Supernova Factory supernova
search. Most current supernova searches subtract reference images from new
images, identify objects in these difference images, and apply simple threshold
cuts on parameters such as statistical significance, shape, and motion to
reject objects such as cosmic rays, asteroids, and subtraction artifacts.
Although most static objects subtract cleanly, even a very low false positive
detection rate can lead to hundreds of non-supernova candidates which must be
vetted by human inspection before triggering additional followup. In comparison
to simple threshold cuts, more sophisticated methods such as Boosted Decision
Trees, Random Forests, and Support Vector Machines provide dramatically better
object discrimination. At the Nearby Supernova Factory, we reduced the number
of non-supernova candidates by a factor of 10 while increasing our supernova
identification efficiency. Methods such as these will be crucial for
maintaining a reasonable false positive rate in the automated transient alert
pipelines of upcoming projects such as PanSTARRS and LSST.
Comment: 25 pages; 6 figures; submitted to Ap
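As a rough illustration of the "simple threshold cuts" baseline described in this abstract, the sketch below filters candidates on three parameters. Everything in it is hypothetical: the `Candidate` fields, the `passes_cuts` helper, the cut values and the sample candidates are invented for illustration and are not the Nearby Supernova Factory's actual parameters or data.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    snr: float         # detection significance (hypothetical field)
    elongation: float  # major/minor axis ratio; close to 1 for a point source
    motion: float      # shift between exposures, in pixels

def passes_cuts(c, min_snr=5.0, max_elongation=1.5, max_motion=1.0):
    """Keep a candidate only if it clears every independent threshold."""
    return (c.snr >= min_snr
            and c.elongation <= max_elongation
            and c.motion <= max_motion)

# Invented examples of the object types the abstract mentions.
candidates = [
    Candidate(snr=12.0, elongation=1.1, motion=0.2),  # point-like and stationary
    Candidate(snr=30.0, elongation=3.0, motion=0.1),  # elongated, cosmic-ray-like
    Candidate(snr=8.0, elongation=1.2, motion=4.0),   # moving, asteroid-like
    Candidate(snr=3.0, elongation=1.0, motion=0.0),   # below the significance cut
]
survivors = [c for c in candidates if passes_cuts(c)]
print(len(survivors))  # prints 1
```

Each cut here acts on one parameter at a time, whereas classifiers such as boosted decision trees or random forests, as named in the abstract, learn decision boundaries that combine parameters.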
Intermediate Palomar Transient Factory: Realtime Image Subtraction Pipeline
A fast-turnaround pipeline for realtime data reduction plays an essential
role in discovering and permitting follow-up observations to young supernovae
and fast-evolving transients in modern time-domain surveys. In this paper, we
present the realtime image subtraction pipeline in the intermediate Palomar
Transient Factory. By using high-performance computing, efficient databases, and
machine learning algorithms, this pipeline manages to reliably deliver
transient candidates within ten minutes of images being taken. Our experience
in using high-performance computing resources to process big data in astronomy
serves as a trailblazer for dealing with data from large-scale time-domain
facilities in the near future.
Comment: 18 pages, 6 figures, accepted for publication in PAS
Machine learning in astronomy
The search to find answers to the deepest questions we have about the Universe has fueled the collection of data for ever larger volumes of our cosmos. The field of supernova cosmology, for example, is seeing continuous development with upcoming surveys set to produce a vast amount of data that will require new statistical inference and machine learning techniques for processing and analysis. Distinguishing between real objects and artefacts is one of the first steps in any transient science pipeline and, currently, is still carried out by humans - often leading to hand scanners having to sort hundreds or thousands of images per night. This is a time-consuming activity introducing human biases that are extremely hard to characterise. To succeed in the objectives of future transient surveys, the successful substitution of human hand scanners with machine learning techniques for the purpose of this artefact-transient classification therefore represents a vital frontier. In this thesis we test various machine learning algorithms and show that many of them can match the human hand scanner performance in classifying transient difference g, r and i-band imaging data from the SDSS-II SN Survey into real objects and artefacts. Using principal component analysis and linear discriminant analysis, we construct a grand total of 56 feature sets with which to train, optimise and test a Minimum Error Classifier (MEC), a naive Bayes classifier, a k-Nearest Neighbours (kNN) algorithm, a Support Vector Machine (SVM) and the SkyNet artificial neural network
Asteroid lightcurves from the Palomar Transient Factory survey: Rotation periods and phase functions from sparse photometry
We fit 54,296 sparsely-sampled asteroid lightcurves in the Palomar Transient
Factory to a combined rotation plus phase-function model. Each lightcurve
consists of 20+ observations acquired in a single opposition. Using 805
asteroids in our sample that have reference periods in the literature, we find
the reliability of our fitted periods is a complicated function of the period,
amplitude, apparent magnitude and other attributes. Using the 805-asteroid
ground-truth sample, we train an automated classifier to estimate (along with
manual inspection) the validity of the remaining 53,000 fitted periods. By this
method we find 9,033 of our lightcurves (of 8,300 unique asteroids) have
reliable periods. Subsequent consideration of asteroids with multiple
lightcurve fits indicates 4% contamination in these reliable periods. For 3,902
lightcurves with sufficient phase-angle coverage and either a reliably-fit
period or low amplitude, we examine the distribution of several phase-function
parameters, none of which are bimodal though all correlate with the bond albedo
and with visible-band colors. Comparing the theoretical maximal spin rate of a
fluid body with our amplitude versus spin-rate distribution suggests that, if
held together only by self-gravity, most asteroids are in general less dense
than 2 g/cm³, while C types have a lower limit of between 1 and 2 g/cm³,
in agreement with previous density estimates. For 5-20km diameters, S types
rotate faster and have lower amplitudes than C types. If both populations share
the same angular momentum, this may indicate the two types' differing ability
to deform under rotational stress. Lastly, we compare our absolute magnitudes
and apparent-magnitude residuals to those of the Minor Planet Center's nominal,
rotation-neglecting model; our phase-function plus Fourier-series
fitting reduces asteroid photometric RMS scatter by a factor of 3.
Comment: 35 pages, 29 figures. Accepted 15-Apr-2015 to The Astronomical
Journal (AJ). Supplementary material including ASCII data tables will be
available through the publishing journal's website
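The link between spin rate and density in this abstract can be illustrated with the standard spin-barrier estimate for a strengthless sphere: material at the equator stays bound only if the rotation period exceeds P = sqrt(3π / (Gρ)). The sketch below assumes that spherical case only; the paper's comparison also involves lightcurve amplitude (elongation), which is not modelled here, and the function name is hypothetical.

```python
import math

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2

def critical_period_hours(density_g_cm3):
    """Shortest rotation period of a self-gravitating, strengthless sphere.

    Follows from equating centrifugal and gravitational acceleration at the
    equator: omega^2 = 4*pi*G*rho/3, hence P = sqrt(3*pi / (G*rho)).
    """
    rho = density_g_cm3 * 1000.0  # convert g/cm^3 to kg/m^3
    return math.sqrt(3.0 * math.pi / (G * rho)) / 3600.0

for rho in (1.0, 2.0):
    print(f"{rho} g/cm^3 -> {critical_period_hours(rho):.2f} h")
```

For 2 g/cm³ this gives a limit of roughly 2.3 hours; a body held together only by self-gravity and spinning faster than that would need a higher density.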