Stochastic Gradient Annealed Importance Sampling for Efficient Online Marginal Likelihood Estimation
We consider estimating the marginal likelihood in settings with independent
and identically distributed (i.i.d.) data. We propose estimating the predictive
distributions in a sequential factorization of the marginal likelihood in such
settings by using stochastic gradient Markov Chain Monte Carlo techniques. This
approach is far more efficient than traditional marginal likelihood estimation
techniques such as nested sampling and annealed importance sampling due to its
use of mini-batches to approximate the likelihood. Stability of the estimates
is provided by an adaptive annealing schedule. The resulting stochastic
gradient annealed importance sampling (SGAIS) technique, which is the key
contribution of our paper, enables us to estimate the marginal likelihood of a
number of models considerably faster than traditional approaches, with no
noticeable loss of accuracy. An important benefit of our approach is that the
marginal likelihood is calculated in an online fashion as data becomes
available, allowing the estimates to be used for applications such as online
weighted model combination.
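The sequential factorization mentioned in the abstract writes the marginal likelihood as a product of one-step-ahead predictive distributions, p(y_1..N) = prod_n p(y_n | y_1..n-1), so the log marginal likelihood can be accumulated as each observation arrives. The sketch below illustrates only that identity, and it is not the SGAIS algorithm: it assumes a conjugate Beta-Bernoulli model, chosen here because its predictive distributions are available in closed form, whereas the paper estimates them with stochastic gradient MCMC. All function names are hypothetical.

```python
import math


def log_beta(a, b):
    """Log of the Beta function B(a, b), computed via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def online_log_marginal(data, a=1.0, b=1.0):
    """Running log p(y_1..n) for binary data under a Beta(a, b) prior.

    Assumption: Beta-Bernoulli model, so each predictive
    p(y_n = 1 | y_1..n-1) is simply a / (a + b) with updated counts.
    """
    log_ml = 0.0
    running = []
    for y in data:
        p_one = a / (a + b)  # exact one-step-ahead predictive
        log_ml += math.log(p_one if y == 1 else 1.0 - p_one)
        running.append(log_ml)  # estimate available after each point
        a += y                  # posterior update: count of ones
        b += 1 - y              # posterior update: count of zeros
    return running


def batch_log_marginal(data, a=1.0, b=1.0):
    """Closed-form log marginal likelihood of the whole sequence."""
    ones = sum(data)
    zeros = len(data) - ones
    return log_beta(a + ones, b + zeros) - log_beta(a, b)


if __name__ == "__main__":
    observations = [1, 0, 1, 1, 0, 1]
    running = online_log_marginal(observations)
    print(running[-1], batch_log_marginal(observations))
```

The last running value matches the closed-form batch result, which is what makes the online accumulation usable for tasks such as weighting models as data arrives.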
BEA: Revisiting anchor-based object detection DNN using Budding Ensemble Architecture
This paper introduces the Budding Ensemble Architecture (BEA), a novel
reduced ensemble architecture for anchor-based object detection models. Object
detection models are crucial in vision-based tasks, particularly in autonomous
systems. They should provide precise bounding box detections while also
calibrating their predicted confidence scores, leading to higher-quality
uncertainty estimates. However, current models may make erroneous decisions due
to false positives receiving high scores or true positives being discarded due
to low scores. BEA aims to address these issues. The proposed loss functions in
BEA improve the confidence score calibration and lower the uncertainty error,
which results in a better distinction of true and false positives and,
eventually, higher accuracy of the object detection models. Both Base-YOLOv3
and SSD models were enhanced using the BEA method and its proposed loss
functions. The BEA on Base-YOLOv3 trained on the KITTI dataset results in a 6%
and 3.7% increase in mAP and AP50, respectively. Utilizing a well-balanced
uncertainty estimation threshold to discard samples in real-time even leads to
a 9.6% higher AP50 than its base model. This is attributed to a 40% increase in
the area under the AP50-based retention curve used to measure the quality of
calibration of confidence scores. Furthermore, BEA-YOLOv3 trained on KITTI
provides superior out-of-distribution detection on Citypersons, BDD100K, and
COCO datasets compared to the ensembles and vanilla models of YOLOv3 and
Gaussian-YOLOv3.
Comment: 14 pages, 5 pages supplementary material. Accepted at BMVC-202
Bayesian Graph Neural Networks for Molecular Property Prediction
Graph neural networks for molecular property prediction are frequently
underspecified by data and fail to generalise to new scaffolds at test time. A
potential solution is Bayesian learning, which can capture our uncertainty in
the model parameters. This study benchmarks a set of Bayesian methods applied
to a directed MPNN, using the QM9 regression dataset. We find that capturing
uncertainty in both readout and message passing parameters yields enhanced
predictive accuracy, calibration, and performance on a downstream molecular
search task.
Comment: Presented at NeurIPS 2020 Machine Learning for Molecules workshop