877 research outputs found

Machine Learning Classification of SDSS Transient Survey Images

Author: Bassett B. A.
Buisson L. du
Sivanandam N.
Smith M.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 20/11/2015

We show that multiple machine learning algorithms can match human performance in classifying transient imaging data from the Sloan Digital Sky Survey (SDSS) supernova survey into real objects and artefacts. This is a first step in any transient science pipeline and is currently still done by humans, but future surveys such as the Large Synoptic Survey Telescope (LSST) will necessitate fully machine-enabled solutions. Using features trained from eigenimage analysis (principal component analysis, PCA) of single-epoch g, r and i-difference images, we can reach a completeness (recall) of 96 per cent, while only incorrectly classifying at most 18 per cent of artefacts as real objects, corresponding to a precision (purity) of 84 per cent. In general, random forests performed best, followed by the k-nearest neighbour and the SkyNet artificial neural net algorithms, compared to other methods such as naïve Bayes and kernel support vector machine. Our results show that PCA-based machine learning can match human success levels and can naturally be extended by including multiple epochs of data, transient colours and host galaxy information which should allow for significant further improvements, especially at low signal-to-noise. Comment: 14 pages, 8 figures. In this version extremely minor adjustments to the paper were made - e.g. Figure 5 is now easier to view in greyscal
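The completeness, purity and artefact misclassification figures quoted in this abstract are related through the confusion counts of the classifier. The sketch below is only an illustration of that arithmetic: the counts (1000 real objects and 1000 artefacts) are hypothetical and are not taken from the paper, and the function name `classification_metrics` is an assumption of this example.

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute recall, precision and false-positive rate from confusion counts.

    tp: real objects classified as real
    fn: real objects classified as artefacts
    fp: artefacts classified as real
    tn: artefacts classified as artefacts
    """
    return {
        "recall": tp / (tp + fn),               # completeness
        "precision": tp / (tp + fp),            # purity
        "false_positive_rate": fp / (fp + tn),  # artefacts accepted as real
    }


# Hypothetical balanced sample: 1000 real objects and 1000 artefacts.
metrics = classification_metrics(tp=960, fn=40, fp=180, tn=820)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these assumed counts, a recall of 0.96 and a false-positive rate of 0.18 give a precision of about 0.84. Precision depends on the ratio of real objects to artefacts, so a different class balance would give a different purity for the same recall and false-positive rate.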

arXiv.org e-Print Archive

CiteSeerX

Data Driven Discovery in Astrophysics

Author: Brescia M.
Cavuoti S.
Djorgovski S. G.
Donalek C.
Longo G.
Publication date: 01/01/2014

We review some aspects of the current state of data-intensive astronomy, its methods, and some outstanding data analysis challenges. Astronomy is at the forefront of "big data" science, with exponentially growing data volumes and data rates, and an ever-increasing complexity, now entering the Petascale regime. Telescopes and observatories from both ground and space, covering a full range of wavelengths, feed the data via processing pipelines into dedicated archives, where they can be accessed for scientific analysis. Most of the large archives are connected through the Virtual Observatory framework, which provides interoperability standards and services, and effectively constitutes a global data grid of astronomy. Making discoveries in this overabundance of data requires applications of novel machine learning tools. We describe some of the recent examples of such applications. Comment: Keynote talk in the proceedings of ESA-ESRIN Conference: Big Data from Space 2014, Frascati, Italy, November 12-14, 2014, 8 pages, 2 figure

arXiv.org e-Print Archive

Archivio della ricerca - Università degli studi di Napoli Federico II

How to Find More Supernovae with Less Work: Object Classification Techniques for Difference Imaging

Author: B. A. Weaver
Becker A. C.
C. Aragon
D. Wong
Fisher R. A.
Freund Y.
R. C. Thomas
R. Romano
S. Bailey
Zahn C. T.
Publication venue: 'University of Chicago Press'
Publication date: 02/05/2007

We present the results of applying new object classification techniques to difference images in the context of the Nearby Supernova Factory supernova search. Most current supernova searches subtract reference images from new images, identify objects in these difference images, and apply simple threshold cuts on parameters such as statistical significance, shape, and motion to reject objects such as cosmic rays, asteroids, and subtraction artifacts. Although most static objects subtract cleanly, even a very low false positive detection rate can lead to hundreds of non-supernova candidates which must be vetted by human inspection before triggering additional followup. In comparison to simple threshold cuts, more sophisticated methods such as Boosted Decision Trees, Random Forests, and Support Vector Machines provide dramatically better object discrimination. At the Nearby Supernova Factory, we reduced the number of non-supernova candidates by a factor of 10 while increasing our supernova identification efficiency. Methods such as these will be crucial for maintaining a reasonable false positive rate in the automated transient alert pipelines of upcoming projects such as PanSTARRS and LSST. Comment: 25 pages; 6 figures; submitted to Ap

arXiv.org e-Print Archive

Crossref

eScholarship - University of California

UNT Digital Library

Intermediate Palomar Transient Factory: Realtime Image Subtraction Pipeline

Author: Cao Yi
Kasliwal Mansi M
Nugent Peter E
Publication venue: 'IOP Publishing'
Publication date: 02/08/2016

A fast-turnaround pipeline for realtime data reduction plays an essential role in discovering young supernovae and fast-evolving transients, and in permitting follow-up observations of them, in modern time-domain surveys. In this paper, we present the realtime image subtraction pipeline in the intermediate Palomar Transient Factory. By using high-performance computing, an efficient database, and machine learning algorithms, this pipeline manages to reliably deliver transient candidates within ten minutes of images being taken. Our experience in using high-performance computing resources to process big data in astronomy serves as a trailblazer for dealing with data from large-scale time-domain facilities in the near future. Comment: 18 pages, 6 figures, accepted for publication in PAS

arXiv.org e-Print Archive

eScholarship - University of California

Caltech Authors

Machine learning in astronomy

Author: Du Buisson Lise
Publication venue: Department of Mathematics and Applied Mathematics
Publication date: 01/01/2015

The search to find answers to the deepest questions we have about the Universe has fueled the collection of data for ever larger volumes of our cosmos. The field of supernova cosmology, for example, is seeing continuous development with upcoming surveys set to produce a vast amount of data that will require new statistical inference and machine learning techniques for processing and analysis. Distinguishing between real objects and artefacts is one of the first steps in any transient science pipeline and, currently, is still carried out by humans - often leading to hand scanners having to sort hundreds or thousands of images per night. This is a time-consuming activity introducing human biases that are extremely hard to characterise. To succeed in the objectives of future transient surveys, the successful substitution of human hand scanners with machine learning techniques for the purpose of this artefact-transient classification therefore represents a vital frontier. In this thesis we test various machine learning algorithms and show that many of them can match the human hand scanner performance in classifying transient difference g, r and i-band imaging data from the SDSS-II SN Survey into real objects and artefacts. Using principal component analysis and linear discriminant analysis, we construct a grand total of 56 feature sets with which to train, optimise and test a Minimum Error Classifier (MEC), a naive Bayes classifier, a k-Nearest Neighbours (kNN) algorithm, a Support Vector Machine (SVM) and the SkyNet artificial neural network
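The combination of PCA-derived features and a k-Nearest Neighbours classifier mentioned in this abstract can be illustrated with a small sketch. Everything here is an assumption made for illustration: the synthetic 11x11 stamps (a Gaussian blob for "real" objects, a single hot pixel for "artefacts"), the noise level, the choice of 5 components and k = 3, and the helper names `fit_pca`, `project`, `knn_predict` and `make_stamps`. It does not reproduce the thesis's data, feature sets or tuning.

```python
import numpy as np


def fit_pca(images, n_components):
    """Return the mean image and the leading eigenimages (principal components)."""
    mean = images.mean(axis=0)
    # Rows of vt are orthonormal directions ordered by decreasing variance.
    _, _, vt = np.linalg.svd(images - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(images, mean, components):
    """Project flattened images onto the eigenimages to obtain features."""
    return (images - mean) @ components.T


def knn_predict(train_x, train_y, test_x, k=3):
    """Majority vote among the k nearest training points (binary labels 0/1)."""
    dists = np.linalg.norm(test_x[:, None, :] - train_x[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return (train_y[nearest].mean(axis=1) > 0.5).astype(int)


rng = np.random.default_rng(0)
size = 11
yy, xx = np.mgrid[:size, :size]
blob = np.exp(-((xx - 5) ** 2 + (yy - 5) ** 2) / 4.0)  # toy point source


def make_stamps(n, real):
    """Synthetic stamps: noise plus a blob (real) or one hot pixel (artefact)."""
    stamps = rng.normal(0.0, 0.2, size=(n, size, size))
    if real:
        stamps += blob
    else:
        rows = rng.integers(0, size, n)
        cols = rng.integers(0, size, n)
        stamps[np.arange(n), rows, cols] += 3.0
    return stamps.reshape(n, -1)


train_x = np.vstack([make_stamps(200, True), make_stamps(200, False)])
train_y = np.array([1] * 200 + [0] * 200)
test_x = np.vstack([make_stamps(50, True), make_stamps(50, False)])
test_y = np.array([1] * 50 + [0] * 50)

mean, components = fit_pca(train_x, n_components=5)
pred = knn_predict(project(train_x, mean, components), train_y,
                   project(test_x, mean, components), k=3)
accuracy = float((pred == test_y).mean())
print(f"toy accuracy: {accuracy:.2f}")
```

The eigenimages are fitted on the training stamps only and then reused to project the test stamps, so no information from the test set enters the feature construction.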

Cape Town University OpenUCT

Asteroid lightcurves from the Palomar Transient Factory survey: Rotation periods and phase functions from sparse photometry

Author: Chang Chan-Kao
Cheng Yu-Chi
Helou George
Ip Wing-Huen
Kinoshita Daisuke
Kulkarni Shrinivas
Laher Russ
Levitan David
Masci Frank
Ofek Eran O.
Prince Thomas A.
Surace Jason
Waszczak Adam
Publication venue: 'IOP Publishing'
Publication date: 15/04/2015

We fit 54,296 sparsely-sampled asteroid lightcurves in the Palomar Transient Factory to a combined rotation plus phase-function model. Each lightcurve consists of 20+ observations acquired in a single opposition. Using 805 asteroids in our sample that have reference periods in the literature, we find the reliability of our fitted periods is a complicated function of the period, amplitude, apparent magnitude and other attributes. Using the 805-asteroid ground-truth sample, we train an automated classifier to estimate (along with manual inspection) the validity of the remaining 53,000 fitted periods. By this method we find 9,033 of our lightcurves (of 8,300 unique asteroids) have reliable periods. Subsequent consideration of asteroids with multiple lightcurve fits indicate 4% contamination in these reliable periods. For 3,902 lightcurves with sufficient phase-angle coverage and either a reliably-fit period or low amplitude, we examine the distribution of several phase-function parameters, none of which are bimodal though all correlate with the bond albedo and with visible-band colors. Comparing the theoretical maximal spin rate of a fluid body with our amplitude versus spin-rate distribution suggests that, if held together only by self-gravity, most asteroids are in general less dense than 2 g/cm^3, while C types have a lower limit of between 1 and 2 g/cm^3, in agreement with previous density estimates. For 5-20km diameters, S types rotate faster and have lower amplitudes than C types. If both populations share the same angular momentum, this may indicate the two types' differing ability to deform under rotational stress. Lastly, we compare our absolute magnitudes and apparent-magnitude residuals to those of the Minor Planet Center's nominal G=0.15, rotation-neglecting model; our phase-function plus Fourier-series fitting reduces asteroid photometric RMS scatter by a factor of 3. Comment: 35 pages, 29 figures. Accepted 15-Apr-2015 to The Astronomical Journal (AJ). Supplementary material including ASCII data tables will be available through the publishing journal's websit
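The link between a maximal spin rate and bulk density in this abstract can be illustrated with the standard limit for a sphere held together only by self-gravity, where the critical rotation period is P = sqrt(3*pi / (G*rho)). This is a simplified sketch: the paper compares against an amplitude-dependent distribution, which is not reproduced here, and the function name `critical_spin_period_hours` is an assumption of this example.

```python
import math

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2


def critical_spin_period_hours(density_g_cm3):
    """Shortest rotation period for a strengthless sphere of the given density.

    Faster rotation would exceed self-gravity at the equator. Assumes a
    spherical body with no cohesion (a simplification).
    """
    density_kg_m3 = density_g_cm3 * 1000.0
    period_s = math.sqrt(3.0 * math.pi / (G * density_kg_m3))
    return period_s / 3600.0


for rho in (1.0, 2.0):
    print(f"rho = {rho} g/cm^3 -> P_crit = {critical_spin_period_hours(rho):.2f} h")
```

Because the critical period scales as the inverse square root of density, an observed lower envelope in rotation period translates into an upper bound on the density of bodies that have no internal strength.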

arXiv.org e-Print Archive

Caltech Authors