Deep-HiTS: Rotation Invariant Convolutional Neural Network for Transient Detection
We introduce Deep-HiTS, a rotation invariant convolutional neural network
(CNN) model for classifying images of transient candidates into artifacts or
real sources for the High cadence Transient Survey (HiTS). CNNs have the
advantage of learning the features automatically from the data while achieving
high performance. We compare our CNN model against a feature engineering
approach using random forests (RF). We show that our CNN significantly
outperforms the RF model, reducing the error by almost half. Furthermore, for a
fixed number of approximately 2,000 allowed false transient candidates per
night, we are able to reduce the misclassified real transients by
approximately 1/5. To the best of our knowledge, this is the first time CNNs
have been used to detect astronomical transient events. Our approach will be
very useful when processing images from next generation instruments such as the
Large Synoptic Survey Telescope (LSST). We have made all our code and data
available to the community to allow further development and
comparisons at https://github.com/guille-c/Deep-HiTS
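The rotation-invariance idea above can be sketched by averaging a classifier's score over the four 90-degree rotations of an input stamp. This is a minimal illustration of the principle, not the Deep-HiTS architecture itself; the `classify` function is a hypothetical stand-in for a trained CNN:

```python
import numpy as np

def classify(img):
    # Hypothetical stand-in scorer: any CNN returning an
    # artifact-vs-real score could be plugged in here. This one is
    # deliberately NOT rotation invariant (sums the top half only).
    return float(img[:img.shape[0] // 2].sum())

def rotation_averaged_score(img, classifier):
    # Average the classifier output over the four 90-degree rotations
    # of the stamp, making the score invariant to those rotations.
    return np.mean([classifier(np.rot90(img, k)) for k in range(4)])

img = np.arange(16, dtype=float).reshape(4, 4)
s1 = rotation_averaged_score(img, classify)
s2 = rotation_averaged_score(np.rot90(img), classify)
assert np.isclose(s1, s2)  # the same score for any rotated input
```

Because the four rotations of a rotated stamp are the same set of images, the average is unchanged, regardless of how the underlying scorer behaves.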
ConDA: Simplifying Semi-supervised Domain Adaptation by Learning Consistent and Contrastive Feature Representations
In this work, we present ConDA, a simple framework that extends recent
advances in semi-supervised learning to the semi-supervised domain adaptation
(SSDA) problem. Our framework generates pairs of associated samples by
applying stochastic data transformations to a given input. Associated data
pairs are mapped to a feature representation space using a feature extractor.
We use different loss functions to enforce consistency between the feature
representations of associated data pairs. We show that these learned
representations are useful to deal with differences in data distributions in
the domain adaptation problem. We performed experiments to study the main
components of our model and we show that (i) learning consistent and
contrastive feature representations is crucial for extracting good discriminative
features across different domains, and (ii) our model benefits from the use of
strong augmentation policies. With these findings, our method achieves
state-of-the-art performance on three benchmark datasets for SSDA.
Comment: 11 pages, 3 figures, 4 tables
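The consistency idea described above can be illustrated with a simple loss that penalizes distance between the normalized features of two augmented views of the same inputs. This is a generic sketch under assumed details (L2-normalized features, squared-distance penalty), not ConDA's exact objective:

```python
import numpy as np

def l2_normalize(z):
    # Project each feature vector onto the unit sphere.
    return z / np.linalg.norm(z, axis=1, keepdims=True)

def consistency_loss(z1, z2):
    # Mean squared distance between L2-normalized features of the two
    # augmented views; zero when the views map to the same point.
    z1, z2 = l2_normalize(z1), l2_normalize(z2)
    return float(np.mean(np.sum((z1 - z2) ** 2, axis=1)))

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))                 # features of one set of views
noise = 0.01 * rng.normal(size=z.shape)
loss_close = consistency_loss(z, z + noise)  # nearly aligned pairs
loss_far = consistency_loss(z, rng.normal(size=z.shape))  # unrelated pairs
assert loss_close < loss_far
```

Minimizing such a term pulls the two views of each sample together in feature space, which is the mechanism the abstract credits for robustness to distribution shift.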
Mitigating Bias in Deep Learning: Training Unbiased Models on Biased Data for the Morphological Classification of Galaxies
Galaxy morphologies and their relation with physical properties have been a
relevant subject of study in the past. Most galaxy morphology catalogs have
been labelled by human annotators or by machine learning models trained on
human labelled data. Human generated labels have been shown to contain biases
in terms of the observational properties of the data, such as image resolution.
These biases are independent of the annotators, that is, they are present even in
catalogs labelled by experts. In this work, we demonstrate that training deep
learning models on biased galaxy data produces biased models, meaning that the
biases in the training data are transferred to the predictions of the new
models. We also propose a method to train deep learning models that considers
this inherent labelling bias, to obtain a de-biased model even when training on
biased data. We show that models trained using our deep de-biasing method are
capable of reducing the bias of human-labelled datasets.
Domain Adaptation via Minimax Entropy for Real/Bogus Classification of Astronomical Alerts
Time domain astronomy is advancing towards the analysis of multiple massive
datasets in real time, prompting the development of multi-stream machine
learning models. In this work, we study Domain Adaptation (DA) for real/bogus
classification of astronomical alerts using four different datasets: HiTS, DES,
ATLAS, and ZTF. We study the domain shift between these datasets, and improve a
naive deep learning classification model using a fine-tuning approach and
semi-supervised deep DA via Minimax Entropy (MME). We compare the balanced
accuracy of these models for different source-target scenarios. We find that
both the fine-tuning and MME models significantly improve the base model with
as few as one labeled item per class from the target dataset, but that
the MME model does not compromise its performance on the source dataset.
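The quantity at the heart of MME is the entropy of the model's class posteriors on unlabeled target samples: one part of the network is trained to maximize it and another to minimize it. The adversarial training loop is omitted here; this minimal sketch only shows the entropy term itself:

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class axis.
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def prediction_entropy(logits):
    # Mean Shannon entropy of the class posteriors. In MME this is
    # computed on unlabeled target samples; the classifier and feature
    # extractor are updated with opposite signs on this term.
    p = softmax(logits)
    return float(-np.mean(np.sum(p * np.log(p + 1e-12), axis=1)))

confident = np.array([[10.0, 0.0], [0.0, 10.0]])  # near one-hot posteriors
uncertain = np.array([[0.0, 0.0], [0.0, 0.0]])    # uniform posteriors
assert prediction_entropy(confident) < prediction_entropy(uncertain)
```

Low entropy means the target features fall close to a class prototype, which is the intuition behind using this term for real/bogus adaptation across surveys.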
Enhanced Rotational Invariant Convolutional Neural Network for Supernovae Detection
In this paper, we propose an enhanced CNN model for detecting supernovae
(SNe). This is done by applying a new method for obtaining rotational
invariance that exploits cyclic symmetry. In addition, we use a visualization
approach, the layer-wise relevance propagation (LRP) method, which allows
finding the relevant pixels in each image that contribute to discriminating
between SN candidates and artifacts. We introduce a measure to quantitatively
assess the effect of the rotation-invariant methods on the LRP relevance
heatmaps. This allows us to compare the proposed method, CAP, with the
original Deep-HiTS model. The results show that the enhanced method has a
greater capacity for achieving rotational invariance than the
original model. An ensemble of CAP models obtained the best results so far on
the HiTS dataset, reaching an average accuracy of 99.53%. The improvement over
Deep-HiTS is significant both statistically and in practice.
Comment: 8 pages, 5 figures. Accepted for publication in proceedings of the
IEEE World Congress on Computational Intelligence (IEEE WCCI), Rio de
Janeiro, Brazil, 8-13 July, 201
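One common way to exploit cyclic symmetry, sketched below, is to pool (average) the feature vectors extracted from the four 90-degree rotations of a stamp before the dense layers, so the pooled representation is invariant to those rotations. The abstract does not specify CAP's exact architecture; the `features` extractor here is a hypothetical stand-in that is deliberately not invariant on its own:

```python
import numpy as np

def features(img):
    # Hypothetical stand-in extractor; deliberately NOT rotation
    # invariant by itself (first-row sum, corner pixel, last column).
    return np.array([img[0].sum(), img[0, 0], img[:, -1].mean()])

def cyclic_average_pool(img, extractor):
    # Average the extractor's output over the four 90-degree rotations,
    # so downstream dense layers see a rotation-invariant vector.
    return np.mean([extractor(np.rot90(img, k)) for k in range(4)], axis=0)

img = np.arange(25, dtype=float).reshape(5, 5)
assert not np.allclose(features(img), features(np.rot90(img)))  # raw: varies
f1 = cyclic_average_pool(img, features)
f2 = cyclic_average_pool(np.rot90(img), features)
assert np.allclose(f1, f2)  # pooled: invariant to 90-degree rotations
```

Pooling features rather than final scores lets the later layers operate on an invariant representation instead of merely averaging decisions.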
Positional Encodings for Light Curve Transformers: Playing with Positions and Attention
We conducted empirical experiments to assess the transferability of a light
curve transformer to datasets with different cadences and magnitude
distributions using various positional encodings (PEs). We proposed a new
approach to incorporate the temporal information directly into the output of the
last attention layer. Our results indicated that using trainable PEs leads to
significant improvements in transformer performance and training time.
Our proposed PE on attention can be trained faster than the traditional
non-trainable PE transformer while achieving competitive results when
transferred to other datasets.
Comment: In Proceedings of the 40th International Conference on Machine
Learning (ICML), Workshop on Machine Learning for Astrophysics, PMLR 202,
2023, Honolulu, Hawaii, US
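As a concrete baseline for the non-trainable case, the classic sinusoidal encoding can be evaluated at the actual observation times, so irregular light-curve cadences are handled directly; a trainable PE would replace the fixed sin/cos features with learned parameters. This is the standard formulation, not the paper's exact proposal:

```python
import numpy as np

def sinusoidal_pe(times, d_model=8, max_scale=10000.0):
    # Fixed (non-trainable) encoding evaluated at the observation
    # times themselves, so irregular sampling needs no resampling.
    times = np.asarray(times, dtype=float)[:, None]
    freqs = max_scale ** (-np.arange(0, d_model, 2) / d_model)
    angles = times * freqs[None, :]
    pe = np.zeros((len(times), d_model))
    pe[:, 0::2] = np.sin(angles)  # even dimensions: sine
    pe[:, 1::2] = np.cos(angles)  # odd dimensions: cosine
    return pe

mjd = [58000.0, 58000.5, 58003.2, 58010.7]  # irregular cadence (days)
pe = sinusoidal_pe(mjd)
assert pe.shape == (4, 8)
```

A trainable variant would instead learn these per-dimension features (or, as proposed above, inject the temporal information at the output of the last attention layer).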