Recent advances in coreset methods have shown that a selection of
representative datapoints can replace massive volumes of data for Bayesian
inference, preserving the relevant statistical information and significantly
accelerating downstream tasks. Existing variational coreset
constructions rely on either selecting subsets of the observed datapoints or
jointly performing approximate inference and optimizing pseudodata in the
observed space, akin to inducing point methods in Gaussian processes. So far,
both approaches are limited by the complexity of evaluating their objectives
for general-purpose models, and require generating samples from a typically
intractable posterior over the coreset throughout inference and testing. In
this work, we present a black-box variational inference framework for coresets
that overcomes these constraints and enables principled application of
variational coresets to intractable models, such as Bayesian neural networks.
We apply our techniques to supervised learning problems, and compare them with
existing approaches in the literature for data summarization and inference.
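
The abstract does not spell out a training objective, so the following is a
minimal, self-contained sketch of the general recipe it describes: jointly
fitting a Gaussian variational posterior q(theta) and a learnable, weighted
pseudocoreset (inputs, soft labels, weights) with reparameterized Monte Carlo
gradients, here on a toy Bayesian logistic regression. The specific loss (a
coreset ELBO plus a penalty pulling the weighted pseudodata log-likelihood
toward the full-data log-likelihood under q) and all names and hyperparameters
such as lam_match and S are illustrative assumptions, not the paper's actual
bound.

    # Sketch only -- NOT the paper's exact objective; see caveats above.
    import math
    import torch
    import torch.nn.functional as F

    torch.manual_seed(0)
    N, D, M = 500, 5, 10                  # data size, features, coreset size
    X = torch.randn(N, D)
    y = torch.bernoulli(torch.sigmoid(X @ torch.randn(D)))

    def log_lik(theta, inputs, targets, weights=None):
        """Bernoulli log-likelihood per posterior sample; theta: (S, D)."""
        logits = theta @ inputs.T                                   # (S, n)
        ll = targets * F.logsigmoid(logits) \
             + (1 - targets) * F.logsigmoid(-logits)                # (S, n)
        return (ll * weights).sum(-1) if weights is not None else ll.sum(-1)

    # Gaussian variational posterior q(theta) = N(mu, diag(exp(rho)^2))
    mu = torch.zeros(D, requires_grad=True)
    rho = torch.full((D,), -1.0, requires_grad=True)

    # Learnable pseudocoreset: inputs U, soft labels, positive log-weights
    U = X[torch.randperm(N)[:M]].clone().requires_grad_(True)
    y_logit = torch.zeros(M, requires_grad=True)
    w_log = torch.full((M,), math.log(N / M), requires_grad=True)

    opt = torch.optim.Adam([mu, rho, U, y_logit, w_log], lr=1e-2)
    S, lam_match = 8, 1e-3                # MC samples, matching strength

    for step in range(2000):
        opt.zero_grad()
        sigma = rho.exp()
        theta = mu + sigma * torch.randn(S, D)          # reparameterization
        ll_core = log_lik(theta, U, torch.sigmoid(y_logit), w_log.exp())
        kl = 0.5 * (sigma**2 + mu**2 - 1.0 - 2.0 * rho).sum()  # KL(q||N(0,I))
        elbo = ll_core.mean() - kl        # fit q to the coreset posterior
        match = (ll_core - log_lik(theta, X, y)).pow(2).mean()
        (-elbo + lam_match * match).backward()
        opt.step()

After training, reparameterized draws mu + rho.exp() * torch.randn(S, D) act
as approximate posterior samples, and (U, y_logit, w_log) is the learned data
summary; note that the actual black-box framework handles general intractable
models such as Bayesian neural networks, which this toy sketch does not cover.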