An Ensemble of Bayesian Neural Networks for Exoplanetary Atmospheric Retrieval
Machine learning is now used in many areas of astrophysics, from detecting
exoplanets in Kepler transit signals to removing telescope systematics. Recent
work demonstrated the potential of using machine learning algorithms for
atmospheric retrieval by implementing a random forest to perform retrievals in
seconds that are consistent with the traditional, computationally expensive
nested-sampling retrieval method. We expand upon that approach by presenting a
new machine learning model, \texttt{plan-net}, based on an ensemble of Bayesian
neural networks that yields more accurate inferences than the random forest for
the same data set of synthetic transmission spectra. We demonstrate that an
ensemble provides greater accuracy and more robust uncertainties than a single
model. In addition to being the first to use Bayesian neural networks for
atmospheric retrieval, we introduce a new loss function for Bayesian neural
networks that learns correlations between the model outputs.
Importantly, we show that designing machine learning models to explicitly
incorporate domain-specific knowledge both improves performance and provides
additional insight by inferring the covariance of the retrieved atmospheric
parameters. We apply \texttt{plan-net} to the Hubble Space Telescope Wide Field
Camera 3 transmission spectrum for WASP-12b and retrieve an isothermal
temperature and water abundance consistent with the literature. We highlight
that our method is flexible and can be expanded to higher-resolution spectra
and a larger number of atmospheric parameters.
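Though the paper's implementation is not reproduced here, the correlation-learning loss can be sketched. The following is a minimal PyTorch illustration, not the authors' plan-net code; the names CorrelatedOutputNet and N_PARAMS and the layer sizes are all assumptions. A network outputs the mean and the Cholesky factor of a covariance over the retrieved parameters, and a multivariate Gaussian negative log-likelihood trains both, so the covariance of the atmospheric parameters is inferred as part of the fit.

```python
import torch
import torch.nn as nn

N_PARAMS = 5  # hypothetical number of retrieved atmospheric parameters

class CorrelatedOutputNet(nn.Module):
    """Predicts a mean vector and the Cholesky factor of a covariance
    over the output parameters (illustrative sketch)."""
    def __init__(self, n_in, n_out=N_PARAMS, hidden=128):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(n_in, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.mean = nn.Linear(hidden, n_out)
        self.tril = nn.Linear(hidden, n_out * (n_out + 1) // 2)
        self.n_out = n_out

    def forward(self, x):
        h = self.body(x)
        mu = self.mean(h)
        # scatter the flat output into a lower-triangular matrix
        L = torch.zeros(x.shape[0], self.n_out, self.n_out, device=x.device)
        rows, cols = torch.tril_indices(self.n_out, self.n_out)
        L[:, rows, cols] = self.tril(h)
        # exponentiate the diagonal so L @ L.T is positive definite
        d = torch.arange(self.n_out)
        L[:, d, d] = L[:, d, d].exp()
        return mu, L

def correlated_nll(mu, L, y):
    """Negative log-likelihood of y under N(mu, L @ L.T); minimizing it
    makes the network learn the correlations between its outputs."""
    dist = torch.distributions.MultivariateNormal(loc=mu, scale_tril=L)
    return -dist.log_prob(y).mean()
```

In the ensemble setting described above, several such networks would be trained independently and their predictive distributions pooled; that pooling step is omitted here for brevity.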
Small-variance asymptotics for Bayesian neural networks
Bayesian neural networks (BNNs) are a rich and flexible class of models that have several advantages over standard feedforward networks, but are typically expensive to train on large-scale data. In this thesis, we explore the use of small-variance asymptotics, an approach for deriving fast algorithms from probabilistic models, on various Bayesian neural network models. We first demonstrate how small-variance asymptotics reveals precise connections between standard neural networks and BNNs; for example, particular sampling algorithms for BNNs reduce to standard backpropagation in the small-variance limit. We then explore a more complex BNN in which the number of hidden units is additionally treated as a random variable in the model. While standard sampling schemes would be too slow to be practical, our asymptotic approach yields a simple method for extending standard backpropagation to the case where the number of hidden units is not fixed. We show on several data sets that the resulting algorithm has benefits over backpropagation on networks with a fixed architecture.
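To make the flavor of that limit concrete, here is a toy illustration of my own (under the assumption that the BNN posterior is sampled with stochastic-gradient Langevin dynamics, one such sampling algorithm): the injected Gaussian noise in an SGLD step scales with a temperature parameter, and sending that temperature to zero leaves exactly the gradient-descent update of standard backpropagation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sgld_step(w, grad_w, lr, temperature):
    """One SGLD update on weights w; at temperature 0 this is exactly
    the plain gradient-descent step used by standard backpropagation."""
    noise = rng.normal(size=w.shape) * np.sqrt(2.0 * lr * temperature)
    return w - lr * grad_w(w) + noise

# toy posterior: -log p(w) = 0.5 * ||w||^2, so the gradient is w itself
grad_w = lambda w: w
for temperature in (1.0, 1e-2, 0.0):
    w = np.ones(3)
    for _ in range(200):
        w = sgld_step(w, grad_w, lr=0.1, temperature=temperature)
    print(f"T={temperature:g}  |w|={np.linalg.norm(w):.4f}")
# as T -> 0 the noisy iterates collapse onto the deterministic optimum w = 0
```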
Dropout Distillation for Efficiently Estimating Model Confidence
We propose an efficient way to output better calibrated uncertainty scores
from neural networks. The Distilled Dropout Network (DDN) makes standard
(non-Bayesian) neural networks more introspective by adding a new training loss
which prevents them from being overconfident. Our method is more efficient than
Bayesian neural networks or model ensembles which, despite providing more
reliable uncertainty scores, are more cumbersome to train and slower to test.
We evaluate DDN on CIFAR-10 image classification and show that its
calibration results are competitive even when compared to 100 Monte Carlo
samples from a dropout network, while also increasing classification
accuracy. We further apply DDN within the state-of-the-art Faster R-CNN
object detection framework and show, using the COCO dataset, that it helps
train better-calibrated object detectors.
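For reference, the 100-sample baseline mentioned above is standard Monte Carlo dropout, which a single DDN forward pass is meant to replace. A minimal sketch, assuming any PyTorch classifier containing dropout layers (the function name is hypothetical):

```python
import torch
import torch.nn as nn

@torch.no_grad()
def mc_dropout_predict(model: nn.Module, x: torch.Tensor, n_samples: int = 100):
    """Average the softmax outputs of n_samples stochastic forward passes,
    keeping dropout active at test time. Note: model.train() also switches
    BatchNorm to training mode; production code would toggle only the
    nn.Dropout modules."""
    model.train()  # train mode keeps nn.Dropout stochastic
    probs = torch.stack(
        [torch.softmax(model(x), dim=-1) for _ in range(n_samples)]
    ).mean(dim=0)
    model.eval()
    return probs
```

The appeal of distillation is that it matches this calibration with one deterministic pass, avoiding the n_samples-fold cost at test time.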