Learning from the machine: interpreting machine learning algorithms for point- and extended-source classification
We investigate star-galaxy classification for astronomical surveys in the
context of four methods enabling the interpretation of black-box machine
learning systems. The first is outputting and exploring the decision boundaries
as given by decision tree based methods, which enables the visualization of the
classification categories. Secondly, we investigate how the Mutual Information
based Transductive Feature Selection (MINT) algorithm can be used to perform
feature pre-selection. If one would like to provide only a small number of
input features to a machine learning classification algorithm, feature
pre-selection provides a method to determine which of the many possible input
properties should be selected. Third is the use of the tree-interpreter package
to enable popular decision tree based ensemble methods to be opened,
visualized, and understood. This is done by additional analysis of the tree
based model, determining not only which features are important to the model,
but how important a feature is for a particular classification given its value.
Lastly, we use decision boundaries from the model to revise an already existing
method of classification, essentially asking the tree based method where
decision boundaries are best placed and defining a new classification method.
We showcase these techniques by applying them to the problem of star-galaxy
separation using data from the Sloan Digital Sky Survey (hereafter SDSS). We
use the output of MINT and the ensemble methods to demonstrate how more complex
decision boundaries improve star-galaxy classification accuracy over the
standard SDSS frames approach (reducing misclassifications by up to
). We then show how tree-interpreter can be used to explore how
relevant each photometric feature is when making a classification on an object
by object basis.
Comment: 12 pages, 8 figures, 8 tables
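The idea of asking a tree-based method where a decision boundary is best placed can be illustrated with a one-level decision stump. The sketch below is not the authors' pipeline: the function name `best_threshold`, the single concentration-like feature, and the toy values are all hypothetical, chosen only to show how a cut is selected by minimizing misclassifications.

```python
from typing import List, Tuple


def best_threshold(values: List[float], labels: List[int]) -> Tuple[float, int]:
    """Return (threshold, errors) for the rule 'galaxy if value > threshold'.

    Candidate cuts are the midpoints between adjacent distinct feature values,
    which is how a decision tree typically enumerates splits on one feature.
    """
    distinct = sorted(set(values))
    if len(distinct) < 2:
        raise ValueError("need at least two distinct feature values")
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]

    best_cut, best_errors = candidates[0], len(values) + 1
    for cut in candidates:
        # Count objects whose predicted class disagrees with the label.
        errors = sum((v > cut) != bool(y) for v, y in zip(values, labels))
        if errors < best_errors:
            best_cut, best_errors = cut, errors
    return best_cut, best_errors


# Hypothetical toy data: one feature per object; label 0 = star, 1 = galaxy.
feature = [0.01, 0.03, 0.05, 0.10, 0.12, 0.20, 0.35, 0.50]
is_galaxy = [0, 0, 0, 0, 1, 1, 1, 1]

threshold, errors = best_threshold(feature, is_galaxy)
print(f"cut at {threshold:.3f} with {errors} misclassifications")
```

A full tree or ensemble repeats this search recursively over many features, which is what produces the more complex boundaries the abstract describes.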
Low-latency gravitational wave alert products and their performance in anticipation of the fourth LIGO-Virgo-KAGRA observing run
Multi-messenger searches for binary neutron star (BNS) and neutron star-black
hole (NSBH) mergers are currently one of the most exciting areas of astronomy.
The search for joint electromagnetic and neutrino counterparts to gravitational
waves (GWs) has resumed with the fourth observing run (O4) of Advanced LIGO
(aLIGO), Advanced Virgo (AdVirgo) and KAGRA. To support this effort, public
semi-automated data products are sent in near real-time and include
localization and source properties to guide complementary observations.
Subsequent refinements, as and when available, are also relayed as updates. In
preparation for O4, we have conducted a study using a simulated population of
compact binaries and a Mock Data Challenge (MDC) in the form of a real-time
replay to optimize and profile the software infrastructure and scientific
deliverables. End-to-end performance was tested, including data ingestion,
running online search pipelines, performing annotations, and issuing alerts to
the astrophysics community. In this paper, we present an overview of the
low-latency infrastructure as well as an overview of the performance of the
data products to be released during O4, based on an MDC. We report on expected
median latencies for the preliminary alert of full bandwidth searches (29.5 s)
and for the creation of early warning triggers (-3.1 s), and show consistency
and accuracy of released data products using the MDC. This paper provides a
performance overview for LVK low-latency alert structure and data products
using the MDC in anticipation of O4.
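The sign convention behind the quoted latencies can be made concrete with a short sketch. It assumes latency is measured as alert time minus merger time, so that an early warning issued before the merger has a negative latency. The function name `alert_latencies` and all timestamps are hypothetical toy values, not MDC results.

```python
from statistics import median
from typing import List


def alert_latencies(merger_times: List[float], alert_times: List[float]) -> List[float]:
    """Latency per event in seconds: alert time minus merger time.

    A negative value means the alert went out before the merger.
    """
    return [alert - merger for merger, alert in zip(merger_times, alert_times)]


# Hypothetical merger times (seconds since the start of a replay).
merger = [100.0, 250.0, 400.0]

# Hypothetical alert times: preliminary alerts follow the merger,
# early-warning triggers precede it.
preliminary = [125.0, 280.0, 441.0]
early_warning = [98.0, 246.0, 390.0]

median_preliminary = median(alert_latencies(merger, preliminary))
median_early = median(alert_latencies(merger, early_warning))
print(median_preliminary, median_early)
```

The median is a natural summary here because a few slow events would otherwise dominate a mean latency.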
Candidate Massive Galaxies at z~4 in the Dark Energy Survey
Using stellar population models, we predicted that the Dark Energy Survey (DES) - due to its special combination of area (5000 deg. sq.) and depth () - would be in the position to detect massive ( M) galaxies at . We confront those theoretical calculations with the first deg. sq. of DES data reaching nominal depth. From a catalogue containing million sources, were found to have observed-frame vs colours within the locus predicted for massive galaxies. We further removed contamination by stars and artefacts, obtaining 606 galaxies lining up by the model selection box. We obtained their photometric redshifts and physical properties by fitting model templates spanning a wide range of star formation histories, reddening and redshift. Key to constrain the models is the addition, to the optical DES bands , , , , and , of near-IR , , data from the Vista Hemisphere Survey. We further applied several quality cuts to the fitting results, including goodness of fit and a unimodal redshift probability distribution. We finally select 233 candidates whose photometric redshift probability distribution function peaks around , have high stellar masses (M/M for a Salpeter IMF) and ages around 0.1 Gyr, i.e. formation redshift around 5. These properties match those of the progenitors of the most massive galaxies in the local universe. This is an ideal sample for spectroscopic follow-up to select the fraction of galaxies which is truly at high redshift. These initial results and those at the survey completion, which we shall push to higher redshifts, will set unprecedented constraints on galaxy formation, evolution, and the re-ionisation epoch
The Dark Energy Survey: more than dark energy - an overview
This overview paper describes the legacy prospect and discovery potential of the Dark Energy Survey (DES) beyond cosmological studies, illustrating it with examples from the DES early data. DES is using a wide-field camera (DECam) on the 4 m Blanco Telescope in Chile to image 5000 sq deg of the sky in five filters (grizY). By its completion, the survey is expected to have generated a catalogue of 300 million galaxies with photometric redshifts and 100 million stars. In addition, a time-domain survey search over 27 sq deg is expected to yield a sample of thousands of Type Ia supernovae and other transients. The main goals of DES are to characterize dark energy and dark matter, and to test alternative models of gravity; these goals will be pursued by studying large-scale structure, cluster counts, weak gravitational lensing and Type Ia supernovae. However, DES also provides a rich data set which allows us to study many other aspects of astrophysics. In this paper, we focus on additional science with DES, emphasizing areas where the survey makes a difference with respect to other current surveys. The paper illustrates, using early data (from "Science Verification", and from the first, second and third seasons of observations), what DES can tell us about the Solar system, the Milky Way, galaxy evolution, quasars and other topics. In addition, we show that if the cosmological model is assumed to be Λ + cold dark matter, then important astrophysics can be deduced from the primary DES probes. Highlights from DES early data include the discovery of 34 trans-Neptunian objects, 17 dwarf satellites of the Milky Way, one published z > 6 quasar (and more confirmed) and two published superluminous supernovae (and more confirmed).
The Eleventh and Twelfth Data Releases of the Sloan Digital Sky Survey: Final Data from SDSS-III
The third generation of the Sloan Digital Sky Survey (SDSS-III) took data from 2008 to 2014 using the original SDSS wide-field imager, the original and an upgraded multi-object fiber-fed optical spectrograph, a new near-infrared high-resolution spectrograph, and a novel optical interferometer. All of the data from SDSS-III are now made public. In particular, this paper describes Data Release 11 (DR11) including all data acquired through 2013 July, and Data Release 12 (DR12) adding data acquired through 2014 July (including all data included in previous data releases), marking the end of SDSS-III observing. Relative to our previous public release (DR10), DR12 adds one million new spectra of galaxies and quasars from the Baryon Oscillation Spectroscopic Survey (BOSS) over an additional 3000 deg² of sky, more than triples the number of H-band spectra of stars as part of the Apache Point Observatory (APO) Galactic Evolution Experiment (APOGEE), and includes repeated accurate radial velocity measurements of 5500 stars from the Multi-object APO Radial Velocity Exoplanet Large-area Survey (MARVELS). The APOGEE outputs now include the measured abundances of 15 different elements for each star. In total, SDSS-III added 5200 deg² of ugriz imaging; 155,520 spectra of 138,099 stars as part of the Sloan Exploration of Galactic Understanding and Evolution 2 (SEGUE-2) survey; 2,497,484 BOSS spectra of 1,372,737 galaxies, 294,512 quasars, and 247,216 stars over 9376 deg²; 618,080 APOGEE spectra of 156,593 stars; and 197,040 MARVELS spectra of 5513 stars. Since its first light in 1998, SDSS has imaged over 1/3 of the celestial sphere in five bands and obtained over five million astronomical spectra. © 2015. The American Astronomical Society
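The counts quoted above allow a quick arithmetic check of how heavily each survey revisits its targets. The sketch below uses only numbers from the abstract; the variable names are illustrative. Reading the leftover BOSS spectra as repeats or other targets is an assumption, since the abstract does not say what they are.

```python
# (spectra, stars) per stellar survey, copied from the abstract.
surveys = {
    "SEGUE-2": (155_520, 138_099),
    "APOGEE": (618_080, 156_593),
    "MARVELS": (197_040, 5_513),
}

# Mean number of spectra per star; values well above 1 indicate repeat visits.
spectra_per_star = {name: n_spec / n_star for name, (n_spec, n_star) in surveys.items()}

# BOSS: total spectra versus the three object counts listed in the abstract.
boss_spectra = 2_497_484
boss_objects = 1_372_737 + 294_512 + 247_216  # galaxies + quasars + stars
# Spectra not accounted for by the listed counts (assumed repeats or other targets).
boss_unaccounted = boss_spectra - boss_objects

for name, ratio in spectra_per_star.items():
    print(f"{name}: {ratio:.2f} spectra per star")
print(f"BOSS: {boss_objects} listed objects, {boss_unaccounted} further spectra")
```

The high MARVELS ratio is consistent with the abstract's description of repeated radial velocity measurements of the same stars.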