We introduce a multi-fidelity estimator of covariance matrices that employs
the log-Euclidean geometry of the symmetric positive-definite manifold. The
estimator fuses samples from a hierarchy of data sources with differing
fidelities and costs to reduce variance while guaranteeing definiteness, in
contrast with previous approaches. The new estimator makes covariance
estimation tractable in applications where simulation or data collection is
expensive; to that end, we develop an optimal sample allocation scheme that
minimizes the mean-squared error of the estimator given a fixed budget.
Guaranteed definiteness is crucial to metric learning, data assimilation, and
other downstream tasks. Evaluations of our approach using data from physical
applications (heat conduction, fluid dynamics) demonstrate more accurate metric
learning and speedups of more than one order of magnitude compared to
benchmarks.

Comment: To appear at the International Conference on Machine Learning (ICML) 202
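
For intuition only, the following is a minimal sketch of how a log-Euclidean, two-fidelity covariance combination of the kind described above might be assembled: sample covariances are mapped to the tangent space with the matrix logarithm, combined with a control-variate-style correction, and mapped back with the matrix exponential, which keeps the result symmetric positive definite by construction. The function names, the pairing of high- and low-fidelity samples, and the weight alpha are illustrative assumptions; the paper's actual estimator and its optimal sample allocation scheme are not reproduced here.

    import numpy as np

    def sym_logm(C):
        # Matrix logarithm of a symmetric positive-definite matrix via eigendecomposition.
        w, V = np.linalg.eigh(C)
        return (V * np.log(w)) @ V.T

    def sym_expm(S):
        # Matrix exponential of a symmetric matrix via eigendecomposition.
        w, V = np.linalg.eigh(S)
        return (V * np.exp(w)) @ V.T

    def log_euclidean_mf_cov(hi_samples, lo_samples_paired, lo_samples_extra, alpha=1.0):
        # Hypothetical two-fidelity combination: a control-variate-style correction on
        # matrix logarithms, mapped back with the matrix exponential so the output is
        # symmetric positive definite by construction.
        log_hi = sym_logm(np.cov(hi_samples, rowvar=False))
        log_lo_paired = sym_logm(np.cov(lo_samples_paired, rowvar=False))
        lo_all = np.vstack([lo_samples_paired, lo_samples_extra])
        log_lo_all = sym_logm(np.cov(lo_all, rowvar=False))
        log_mf = log_hi + alpha * (log_lo_all - log_lo_paired)
        return sym_expm(0.5 * (log_mf + log_mf.T))  # symmetrize against round-off

    # Toy usage with correlated high- and low-fidelity Gaussian samples.
    rng = np.random.default_rng(0)
    d = 4
    A = rng.standard_normal((d, d))
    true_cov = A @ A.T + d * np.eye(d)
    hi = rng.multivariate_normal(np.zeros(d), true_cov, size=50)
    lo_paired = hi + 0.1 * rng.standard_normal(hi.shape)   # cheap surrogate of the same draws
    lo_extra = rng.multivariate_normal(np.zeros(d), true_cov, size=500)
    lo_extra = lo_extra + 0.1 * rng.standard_normal(lo_extra.shape)
    C_mf = log_euclidean_mf_cov(hi, lo_paired, lo_extra)
    print(np.all(np.linalg.eigvalsh(C_mf) > 0))  # True: definiteness holds by construction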