Supernova search with active learning in ZTF DR3
We provide the first results from the complete SNAD adaptive learning
pipeline in the context of a broad scope of data from large-scale astronomical
surveys. The main goal of this work is to explore the potential of adaptive
learning techniques in application to big data sets. Our SNAD team used Active
Anomaly Discovery (AAD) as a tool to search for new supernova (SN) candidates
in the photometric data from the first 9.4 months of the Zwicky Transient
Facility (ZTF) survey, namely, between March 17 and December 31, 2018 (58194 <
MJD < 58483). We analysed 70 ZTF fields at high galactic latitude and
visually inspected 2100 outliers. This resulted in 104 SN-like objects being
found, 57 of which were reported to the Transient Name Server for the first
time, while the remaining 47 had previously been mentioned in other catalogues,
either as SNe with known types or as SN candidates. We visually inspected the
multi-colour light curves of the non-catalogued transients and fitted them
with different supernova models to assign each to a probable photometric
class: Ia, Ib/c, IIP, IIL, or IIn. Moreover, we also identified unreported
slow-evolving transients that are good superluminous SN candidates, along with
a few other non-catalogued objects, such as red dwarf flares and active
galactic nuclei. Beyond confirming the effectiveness of human-machine
integration underlying the AAD strategy, our results shed light on potential
leaks in currently available pipelines. These findings can help avoid similar
losses in future large-scale astronomical surveys. Furthermore, the algorithm
enables direct searches of any type of data, based on any definition of an
anomaly set by the expert.
Comment: 22 pages with appendix, 12 figures, 2 tables, accepted for
publication in Astronomy and Astrophysics
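The expert-in-the-loop idea behind Active Anomaly Discovery can be illustrated with a small sketch. The following is a toy re-ranking loop, not the SNAD pipeline: an isolation forest proposes outliers, a human labels each candidate, and objects near expert-rejected ("nominal") examples are down-ranked. The feature matrix stands in for light-curve features, and is_interesting is a placeholder for visual inspection; both are assumptions.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))          # placeholder light-curve features
X[:5] += 6.0                             # a few injected outliers

def is_interesting(i):
    """Stand-in for visual inspection of candidate i."""
    return i < 5

forest = IsolationForest(n_estimators=300, random_state=0).fit(X)
scores = -forest.score_samples(X)        # higher = more anomalous

labeled_nominal = []
budget = 20                              # number of expert inspections
for _ in range(budget):
    order = np.argsort(scores)[::-1]
    # Skip candidates close to expert-rejected examples; AAD proper
    # instead re-weights the tree ensemble from the feedback.
    for i in order:
        if labeled_nominal:
            d = np.linalg.norm(X[labeled_nominal] - X[i], axis=1).min()
            if d < 1.0:
                continue
        break
    if is_interesting(i):
        print(f"object {i}: anomaly confirmed by expert")
    else:
        labeled_nominal.append(i)
    scores[i] = -np.inf                  # remove from the queue either way
```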
LSST: from Science Drivers to Reference Design and Anticipated Data Products
(Abridged) We describe here the most ambitious survey currently planned in
the optical, the Large Synoptic Survey Telescope (LSST). A vast array of
science will be enabled by a single wide-deep-fast sky survey, and LSST will
have unique survey capability in the faint time domain. The LSST design is
driven by four main science themes: probing dark energy and dark matter, taking
an inventory of the Solar System, exploring the transient optical sky, and
mapping the Milky Way. LSST will be a wide-field ground-based system sited at
Cerro Pachón in northern Chile. The telescope will have an 8.4 m (6.5 m
effective) primary mirror, a 9.6 deg² field of view, and a 3.2 Gigapixel
camera. The standard observing sequence will consist of pairs of 15-second
exposures in a given field, with two such visits in each pointing in a given
night. With these repeats, the LSST system is capable of imaging about 10,000
square degrees of sky in a single filter in three nights. The typical 5σ
point-source depth in a single visit in r will be ~24.5 (AB). The
project is in the construction phase and will begin regular survey operations
by 2022. The survey area will be contained within 30,000 deg² with
δ < +34.5°, and will be imaged multiple times in six bands, ugrizy,
covering the wavelength range 320-1050 nm. About 90% of the observing time
will be devoted to a deep-wide-fast survey mode, which will uniformly observe
an 18,000 deg² region about 800 times (summed over all six bands) during the
anticipated 10 years of operations and yield a coadded map to r ~ 27.5. The
remaining 10% of the observing time will be allocated to projects such as a
Very Deep and Fast time domain survey. The goal is to make LSST data products,
including a relational database of about 32 trillion observations of 40 billion
objects, available to the public and scientists around the world.
Comment: 57 pages, 32 color figures, version with high-resolution figures
available from https://www.lsst.org/overview
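The quoted single-visit and coadded depths are consistent with the standard sqrt(N) stacking gain for background-limited point sources. A quick sanity check follows; the r-band visit count is an assumption for illustration, not a number stated in the abstract.

```python
import math

# Coadding N visits deepens the 5-sigma point-source limit by
# 2.5*log10(sqrt(N)) mag, i.e. m_coadd ~= m_single + 1.25*log10(N).
m_single_r = 24.5      # single-visit r-band depth quoted above
n_visits_r = 184       # assumed r-band visit count (illustrative)

m_coadd_r = m_single_r + 1.25 * math.log10(n_visits_r)
print(f"coadded r-band depth ~ {m_coadd_r:.1f}")   # ~27.3, near the quoted r ~ 27.5
```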
The European perspective for LSST
LSST is a next generation telescope that will produce an unprecedented data flow. The project goal is to deliver data products such as images and catalogs, thus enabling scientific analysis for a wide community of users. As a large-scale survey, LSST data will be complementary with other facilities in a wide range of scientific domains, including data from ESA or ESO. European countries have invested in LSST since 2007, in the construction of the camera as well as in the computing effort. The latter will be instrumental in designing the next step: how to distribute LSST data to Europe. Astroinformatics challenges for LSST indeed include not only the analysis of LSST big data, but also the practical efficiency of the data access.
Symmetric and Asymmetric Aggregation Functions in Massively Parallel Computing
Aggregation for information summarization is of great significance in many fields. In the big data era, processing aggregate functions in parallel is drawing researchers' attention. The aim of our work is to propose a generic framework enabling an arbitrary aggregation to be mapped into a generic algorithm, and to identify when it can be efficiently executed on modern large-scale data-processing systems. We describe our preliminary results regarding classes of symmetric and asymmetric aggregations that can be mapped, in a systematic way, into efficient MapReduce-style algorithms.
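The kind of mapping the abstract alludes to can be sketched for the simplest symmetric case: an aggregation is MapReduce-friendly when it factors into a per-element map step, an associative and commutative combine step, and a finalize step. A minimal sketch using the mean; the function names are illustrative, not the paper's framework.

```python
from functools import reduce

# The mean decomposes into (sum, count) partial states that can be
# merged in any order, so each partition reduces independently.

def map_one(x):
    return (x, 1)

def combine(a, b):
    # Associative and commutative merge of partial states.
    return (a[0] + b[0], a[1] + b[1])

def finalize(state):
    s, n = state
    return s / n

partitions = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
partials = [reduce(combine, map(map_one, p)) for p in partitions]
print(finalize(reduce(combine, partials)))   # 3.5
```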
Explainable classification of astronomical uncertain time series
Exploring the expansion history of the universe, understanding its evolutionary stages, and predicting its future evolution are important goals in astrophysics. Today, machine learning tools are used to help achieve these goals by analyzing transient sources, which are modeled as uncertain time series. Although black-box methods achieve appreciable performance, existing interpretable time series methods have failed to obtain acceptable performance for this type of data. Furthermore, data uncertainty is rarely taken into account in these methods. In this work, we propose an uncertainty-aware subsequence-based model which achieves classification performance comparable to that of state-of-the-art methods. Unlike conformal learning, which estimates model uncertainty on predictions, our method takes data uncertainty as additional input. Moreover, our approach is explainable-by-design, giving domain experts the ability to inspect the model and explain its predictions. The explainability of the proposed method also has the potential to inspire new developments in theoretical astrophysics modeling by suggesting important subsequences which depict details of light curve shapes. The dataset, the source code of our experiment, and the results are made available in a public repository.
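What "taking data uncertainty as additional input" can look like for a subsequence-based model is sketched below: a chi-squared-style distance that down-weights points with large measurement errors when matching a candidate subsequence against a light curve. This is an illustrative toy, not the authors' model; all data and names are made up.

```python
import numpy as np

def uncertain_distance(window, window_err, shapelet):
    # Inverse-variance weights suppress poorly measured points.
    w = 1.0 / (window_err**2 + 1e-12)
    return np.sum(w * (window - shapelet)**2) / np.sum(w)

def best_match(flux, flux_err, shapelet):
    # Slide the subsequence along the light curve; the best-matching
    # location is what makes the decision inspectable by an expert.
    m = len(shapelet)
    dists = [
        uncertain_distance(flux[i:i + m], flux_err[i:i + m], shapelet)
        for i in range(len(flux) - m + 1)
    ]
    i = int(np.argmin(dists))
    return i, dists[i]

rng = np.random.default_rng(1)
flux = rng.normal(0.0, 1.0, 200)
flux[80:90] += np.linspace(0, 5, 10)    # injected rising transient
flux_err = rng.uniform(0.1, 1.0, 200)
shapelet = np.linspace(0, 5, 10)        # rising-edge template

print(best_match(flux, flux_err, shapelet))
```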
Discovering Injective Mapping Between Relations in Astrophysics Databases
Data in astrophysics are very often structured with the relational data model. One particularity is that every value is a real number and comes with an associated error measure, leading to a numerical interval [value − error, value + error]. Such astrophysics databases can be seen as interval-based numerical databases. Classical data mining approaches, specifically those related to integrity constraints, are likely to produce useless results on such databases, as strict equality is very unlikely to be meaningful. In this paper, we revisit a well-known problem, based on unary inclusion dependency discovery, to match the particularities of astrophysics databases. We propose to discover injective mappings between attributes of a source relation and a target relation. First, we define two notions of inclusion between intervals. Then, we adapt a condensed representation proposed in [15], allowing a mapping function between the source and the target to be found. The proposal has been implemented, and several experiments have been conducted on both real-life and synthetic databases.
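Two natural notions of inclusion between error intervals, and the brute-force version of the unary inclusion-dependency test built on them, can be sketched as follows. The predicate names are illustrative, and the paper's condensed representation is not reproduced here.

```python
# Intervals are built as [value - error, value + error]; a unary
# inclusion dependency holds when every interval of the source
# attribute matches some interval of the target attribute.

def contained(a, b):
    """Interval a = (lo, hi) lies entirely inside b."""
    return b[0] <= a[0] and a[1] <= b[1]

def overlaps(a, b):
    """Intervals a and b intersect."""
    return a[0] <= b[1] and b[0] <= a[1]

def interval(value, error):
    return (value - error, value + error)

def unary_ind(source, target, incl=contained):
    """Naive O(|source| * |target|) inclusion-dependency check."""
    return all(any(incl(s, t) for t in target) for s in source)

# Source and target attributes as (value, error) pairs, e.g. magnitudes.
src = [interval(v, e) for v, e in [(12.1, 0.1), (13.4, 0.2)]]
tgt = [interval(v, e) for v, e in [(12.1, 0.3), (13.4, 0.2), (15.0, 0.1)]]

print(unary_ind(src, tgt, contained))   # True
print(unary_ind(src, tgt, overlaps))    # True
```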