
    Machine Learning Classification of SDSS Transient Survey Images

    We show that multiple machine learning algorithms can match human performance in classifying transient imaging data from the Sloan Digital Sky Survey (SDSS) supernova survey into real objects and artefacts. This is a first step in any transient science pipeline and is currently still done by humans, but future surveys such as the Large Synoptic Survey Telescope (LSST) will necessitate fully machine-enabled solutions. Using features trained from eigenimage analysis (principal component analysis, PCA) of single-epoch g, r and i-difference images, we can reach a completeness (recall) of 96 per cent, while only incorrectly classifying at most 18 per cent of artefacts as real objects, corresponding to a precision (purity) of 84 per cent. In general, random forests performed best, followed by the k-nearest neighbour and the SkyNet artificial neural net algorithms, compared to other methods such as naïve Bayes and kernel support vector machine. Our results show that PCA-based machine learning can match human success levels and can naturally be extended by including multiple epochs of data, transient colours and host galaxy information, which should allow for significant further improvements, especially at low signal-to-noise. Comment: 14 pages, 8 figures. In this version extremely minor adjustments to the paper were made, e.g. Figure 5 is now easier to view in greyscale.
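
As a rough check on how the figures quoted above relate to one another, the sketch below computes recall, precision and false-positive rate from confusion-matrix counts. The counts are hypothetical and assume equal numbers of real objects and artefacts (1000 each), which the abstract does not state. Under that assumption, 96 per cent recall and an 18 per cent false-positive rate give a precision of about 84 per cent.

```python
def classification_metrics(tp, fn, fp, tn):
    """Return (recall, precision, false_positive_rate) from confusion counts.

    tp: real objects classified as real
    fn: real objects classified as artefacts
    fp: artefacts classified as real
    tn: artefacts classified as artefacts
    """
    recall = tp / (tp + fn)               # completeness
    precision = tp / (tp + fp)            # purity
    false_positive_rate = fp / (fp + tn)  # fraction of artefacts passed as real
    return recall, precision, false_positive_rate


# Hypothetical, balanced sample: 1000 real objects and 1000 artefacts.
recall, precision, fpr = classification_metrics(tp=960, fn=40, fp=180, tn=820)
print(f"recall={recall:.2f} precision={precision:.2f} fpr={fpr:.2f}")
# prints: recall=0.96 precision=0.84 fpr=0.18
```

With a different ratio of artefacts to real objects, the same recall and false-positive rate would yield a different precision, so the balanced sample here is only an illustration.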

    Data Driven Discovery in Astrophysics

    We review some aspects of the current state of data-intensive astronomy, its methods, and some outstanding data analysis challenges. Astronomy is at the forefront of "big data" science, with exponentially growing data volumes and data rates, and an ever-increasing complexity, now entering the Petascale regime. Telescopes and observatories from both ground and space, covering a full range of wavelengths, feed the data via processing pipelines into dedicated archives, where they can be accessed for scientific analysis. Most of the large archives are connected through the Virtual Observatory framework, which provides interoperability standards and services, and effectively constitutes a global data grid of astronomy. Making discoveries in this overabundance of data requires applications of novel machine learning tools. We describe some of the recent examples of such applications. Comment: Keynote talk in the proceedings of ESA-ESRIN Conference: Big Data from Space 2014, Frascati, Italy, November 12-14, 2014, 8 pages, 2 figures.

    How to Find More Supernovae with Less Work: Object Classification Techniques for Difference Imaging

    We present the results of applying new object classification techniques to difference images in the context of the Nearby Supernova Factory supernova search. Most current supernova searches subtract reference images from new images, identify objects in these difference images, and apply simple threshold cuts on parameters such as statistical significance, shape, and motion to reject objects such as cosmic rays, asteroids, and subtraction artifacts. Although most static objects subtract cleanly, even a very low false positive detection rate can lead to hundreds of non-supernova candidates which must be vetted by human inspection before triggering additional follow-up. In comparison to simple threshold cuts, more sophisticated methods such as Boosted Decision Trees, Random Forests, and Support Vector Machines provide dramatically better object discrimination. At the Nearby Supernova Factory, we reduced the number of non-supernova candidates by a factor of 10 while increasing our supernova identification efficiency. Methods such as these will be crucial for maintaining a reasonable false positive rate in the automated transient alert pipelines of upcoming projects such as PanSTARRS and LSST. Comment: 25 pages; 6 figures; submitted to ApJ.

    Intermediate Palomar Transient Factory: Realtime Image Subtraction Pipeline

    A fast-turnaround pipeline for realtime data reduction plays an essential role in discovering young supernovae and fast-evolving transients, and in permitting follow-up observations of them, in modern time-domain surveys. In this paper, we present the realtime image subtraction pipeline in the intermediate Palomar Transient Factory. By using high-performance computing, efficient databases, and machine learning algorithms, this pipeline manages to reliably deliver transient candidates within ten minutes of images being taken. Our experience in using high-performance computing resources to process big data in astronomy serves as a trailblazer for dealing with data from large-scale time-domain facilities in the near future. Comment: 18 pages, 6 figures, accepted for publication in PASP.

    Machine learning in astronomy

    The search to find answers to the deepest questions we have about the Universe has fueled the collection of data for ever larger volumes of our cosmos. The field of supernova cosmology, for example, is seeing continuous development with upcoming surveys set to produce a vast amount of data that will require new statistical inference and machine learning techniques for processing and analysis. Distinguishing between real objects and artefacts is one of the first steps in any transient science pipeline and, currently, is still carried out by humans, often leading to hand scanners having to sort hundreds or thousands of images per night. This is a time-consuming activity introducing human biases that are extremely hard to characterise. To succeed in the objectives of future transient surveys, the successful substitution of human hand scanners with machine learning techniques for the purpose of this artefact-transient classification therefore represents a vital frontier. In this thesis we test various machine learning algorithms and show that many of them can match the human hand scanner performance in classifying transient difference g, r and i-band imaging data from the SDSS-II SN Survey into real objects and artefacts. Using principal component analysis and linear discriminant analysis, we construct a grand total of 56 feature sets with which to train, optimise and test a Minimum Error Classifier (MEC), a naive Bayes classifier, a k-Nearest Neighbours (kNN) algorithm, a Support Vector Machine (SVM) and the SkyNet artificial neural network.

    Asteroid lightcurves from the Palomar Transient Factory survey: Rotation periods and phase functions from sparse photometry

    We fit 54,296 sparsely-sampled asteroid lightcurves in the Palomar Transient Factory to a combined rotation plus phase-function model. Each lightcurve consists of 20+ observations acquired in a single opposition. Using 805 asteroids in our sample that have reference periods in the literature, we find the reliability of our fitted periods is a complicated function of the period, amplitude, apparent magnitude and other attributes. Using the 805-asteroid ground-truth sample, we train an automated classifier to estimate (along with manual inspection) the validity of the remaining 53,000 fitted periods. By this method we find 9,033 of our lightcurves (of 8,300 unique asteroids) have reliable periods. Subsequent consideration of asteroids with multiple lightcurve fits indicates 4% contamination in these reliable periods. For 3,902 lightcurves with sufficient phase-angle coverage and either a reliably-fit period or low amplitude, we examine the distribution of several phase-function parameters, none of which are bimodal though all correlate with the Bond albedo and with visible-band colors. Comparing the theoretical maximal spin rate of a fluid body with our amplitude versus spin-rate distribution suggests that, if held together only by self-gravity, most asteroids are in general less dense than 2 g/cm$^3$, while C types have a lower limit of between 1 and 2 g/cm$^3$, in agreement with previous density estimates. For 5-20 km diameters, S types rotate faster and have lower amplitudes than C types. If both populations share the same angular momentum, this may indicate the two types' differing ability to deform under rotational stress. Lastly, we compare our absolute magnitudes and apparent-magnitude residuals to those of the Minor Planet Center's nominal $G=0.15$, rotation-neglecting model; our phase-function plus Fourier-series fitting reduces asteroid photometric RMS scatter by a factor of 3. Comment: 35 pages, 29 figures. Accepted 15-Apr-2015 to The Astronomical Journal (AJ). Supplementary material including ASCII data tables will be available through the publishing journal's website.
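
The link between spin rate and density mentioned above can be illustrated with the simplest textbook case: a strengthless sphere held together only by self-gravity, for which the fastest allowed rotation period is $P = \sqrt{3\pi/(G\rho)}$. The sketch below is only that spherical simplification, not the paper's method, which also uses lightcurve amplitude (elongation). The function names are illustrative.

```python
import math

G = 6.6743e-11  # gravitational constant, m^3 kg^-1 s^-2


def critical_period_hours(density_kg_m3):
    """Shortest spin period (hours) of a strengthless sphere of given density."""
    period_s = math.sqrt(3.0 * math.pi / (G * density_kg_m3))
    return period_s / 3600.0


def min_density_kg_m3(period_hours):
    """Lowest density (kg/m^3) a strengthless sphere needs to spin at this period."""
    period_s = period_hours * 3600.0
    return 3.0 * math.pi / (G * period_s ** 2)


# 2 g/cm^3 = 2000 kg/m^3 gives a limiting period of roughly 2.3 hours.
print(f"{critical_period_hours(2000.0):.2f} h")
```

Read the other way, an observed period sets a lower bound on density under the same assumption: `min_density_kg_m3(critical_period_hours(2000.0))` returns approximately 2000.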