25 research outputs found

    Bigger Buffer k-d Trees on Multi-Many-Core Systems

    A buffer k-d tree is a k-d tree variant for massively-parallel nearest neighbor search. While providing valuable speed-ups on modern many-core devices when large numbers of both reference and query points are given, buffer k-d trees are limited by the number of points that can fit on a single device. In this work, we show how to modify the original data structure and the associated workflow to make the overall approach capable of dealing with massive data sets. We further provide a simple yet efficient way of using multiple devices given in a single workstation. The applicability of the modified framework is demonstrated in the context of astronomy, a field that is faced with huge amounts of data.

    Data Mining and Machine Learning in Astronomy

    We review the current state of data mining and machine learning in astronomy. 'Data Mining' can have a somewhat mixed connotation from the point of view of a researcher in this field. If used correctly, it can be a powerful approach, holding the potential to fully exploit the exponentially increasing amount of available data, promising great scientific advance. However, if misused, it can be little more than the black-box application of complex computing algorithms that may give little physical insight, and provide questionable results. Here, we give an overview of the entire data mining process, from data collection through to the interpretation of results. We cover common machine learning algorithms, such as artificial neural networks and support vector machines, applications from a broad range of astronomy, emphasizing those where data mining techniques directly resulted in improved science, and important current and future directions, including probability density functions, parallel algorithms, petascale computing, and the time domain. We conclude that, so long as one carefully selects an appropriate algorithm, and is guided by the astronomical problem at hand, data mining can be very much the powerful tool, and not the questionable black box. Comment: Published in IJMPD. 61 pages, uses ws-ijmpd.cls. Several extra figures, some minor additions to the tex

    A First Catalog of Variable Stars Measured by the Asteroid Terrestrial-impact Last Alert System (ATLAS)

    The Asteroid Terrestrial-impact Last Alert System (ATLAS) carries out its primary planetary defense mission by surveying about 13000 deg^2 at least four times per night. The resulting data set is useful for the discovery of variable stars to a magnitude limit fainter than r~18, with amplitudes down to 0.01 mag for bright objects. Here we present a Data Release One catalog of variable stars based on analyzing 142 million stars measured at least 100 times in the first two years of ATLAS operations. Using a Lomb-Scargle periodogram and other variability metrics, we identify 4.7 million candidate variables which we analyze in detail. Through Space Telescope Science Institute, we publicly release lightcurves for all of them, together with a vector of 169 classification features for each star. We do this at the level of unconfirmed candidate variables in order to provide the community with a large set of homogeneously analyzed photometry and avoid pre-judging which types of objects others may find most interesting. We use machine learning to classify the candidates into fifteen different broad categories based on lightcurve morphology. About 10% (430,000 stars) pass extensive tests designed to screen out spurious variability detections: we label these as 'probable' variables. Of these, 230,000 receive specific classifications as eclipsing binaries, pulsating, Mira-type, or sinusoidal variables: these are the 'classified' variables. New discoveries among the probable variables number more than 300,000, while 150,000 of the classified variables are new, including about 10,000 pulsating variables, 2,000 Mira stars, and 70,000 eclipsing binaries. Comment: Accepted by AJ; gives instructions for querying ATLAS variable star database; this new version has nicer lightcurve figure

    The Evryscope Fast Transient Engine: Real-time Discovery of Rapidly Evolving Transients with Evryscope and the Argus Optical Array

    Modern synoptic sky surveys are typically designed to detect supernovae-like transients, using a tiling strategy to identify objects that evolve on day-to-month timescales. Astrophysical phenomena with sub-hour durations, ranging from galactic stellar flares to optical flashes accompanying gamma-ray bursts, have largely escaped scrutiny. Due to their low intrinsic rates and short durations, surveys for fast transients must simultaneously cover significant fractions of the sky at sub-hour cadences, often by combining multiple telescopes. The Evryscopes represent an extreme of this approach, combining 43 small telescopes to image 38% of the entire sky every two minutes. To investigate bright and fast transients with the Evryscopes, I developed the Evryscope Fast Transient Engine (EFTE), a real-time transient detection and photometric analysis pipeline. EFTE uses a unique direct image subtraction routine suited to continuously monitoring the transient sky at minute cadence. Candidates are produced within two minutes for 98.5% of images, and are internally filtered using VetNet, a machine learning algorithm trained to sort real astrophysical events from false positives, both instrumental and astronomical, including millisecond-timescale reflections, or “glints,” from satellites and debris in Earth orbit. Glints are a dominating foreground for astronomical surveys in the extreme time domain. I present the first measurements of the glint rate, noting that it exceeds the combined rate of public alerts from all active all-sky, fast-timescale transient searches, including neutrino, gravitational-wave, gamma-ray, and radio observatories. I further report spectroscopic followup of two stellar flares identified in real-time from the EFTE alert stream using glint-mitigation and science-driven selection metrics. These are the closest spectra relative to peak ever observed for flare stars outside of dedicated staring campaigns on known active stars, and provide unique constraints on the evolution of the flare continuum and temperature. Finally, EFTE is the software test bed for the pipelines of the Argus Optical Array, an upcoming all-sky survey based on the Evryscope concept scaled to the depths of the deepest operating sky surveys and a terabit per second data rate. This work concludes with a description of the Argus prototype series and pipelines, and an overview of fast transient science with the Array. Doctor of Philosoph