Upcoming astronomical surveys will observe billions of galaxies across cosmic
time, providing a unique opportunity to map the many pathways of galaxy
assembly at very high resolution. However, the sheer volume of data
also poses an immediate computational challenge: current tools for inferring
parameters from the light of galaxies take ≳10 hours per fit, which is
prohibitively expensive at this scale. Simulation-based inference (SBI) is a
promising solution. However, it requires simulated data with identical characteristics to
the observed data, whereas real astronomical surveys are often highly
heterogeneous, with missing observations and variable uncertainties determined
by sky and telescope conditions. Here we present a Monte Carlo technique for
treating out-of-distribution measurement errors and missing data using standard
SBI tools. We show that out-of-distribution measurement errors can be
approximated by using standard SBI evaluations, and that missing data can be
marginalized over using SBI evaluations over nearby data realizations in the
training set. While these techniques slow the inference process from ∼1
sec to ∼1.5 min per object, this is still significantly faster than
standard approaches while also dramatically expanding the applicability of SBI.
This expanded regime has broad implications for future applications to
astronomical surveys.

Comment: 8 pages, 2 figures, accepted to the Machine Learning and the Physical
Sciences workshop at NeurIPS 202
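
As a rough illustration of the missing-data idea described in the abstract, the sketch below marginalizes over an unobserved band by imputing it from nearby training-set examples and pooling the resulting posterior samples. It is a minimal toy under stated assumptions, not the implementation used in this work: `sample_posterior` is a hypothetical stand-in for a trained SBI posterior estimator, and the function names, neighbor count, distance metric, and Gaussian toy data are all illustrative choices.

```python
import numpy as np

def sample_posterior(x, n_samples, rng):
    """Hypothetical stand-in for a trained SBI posterior.

    Toy assumption: theta | x ~ N(mean(x), 0.1^2).
    """
    return rng.normal(loc=np.mean(x), scale=0.1, size=n_samples)

def marginalize_missing(x_obs, missing, x_train, rng,
                        n_neighbors=20, n_draws=50, n_samples=100):
    """Pool posterior samples over plausible values of the missing entries.

    Missing entries are imputed from the nearest training-set neighbors,
    with distance measured on the observed entries only (an assumption).
    """
    observed = ~missing
    # Distance to every training example using only the observed bands.
    dist = np.linalg.norm(x_train[:, observed] - x_obs[observed], axis=1)
    nearest = np.argsort(dist)[:n_neighbors]
    pooled = []
    for idx in rng.choice(nearest, size=n_draws):
        x_full = x_obs.copy()
        x_full[missing] = x_train[idx, missing]  # one Monte Carlo imputation
        pooled.append(sample_posterior(x_full, n_samples, rng))
    return np.concatenate(pooled)

rng = np.random.default_rng(0)
x_train = rng.normal(size=(5000, 4))        # toy "photometry" in 4 bands
x_obs = np.array([0.2, np.nan, -0.1, 0.4])  # second band was not observed
missing = np.isnan(x_obs)
theta = marginalize_missing(x_obs, missing, x_train, rng)
```

Each imputed data vector costs one ordinary posterior evaluation, so the total cost grows with the number of Monte Carlo draws, which is consistent with the slowdown per object noted in the abstract.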