Stein variational gradient descent (SVGD) was recently proposed as a general
purpose nonparametric variational inference algorithm [Liu & Wang, NIPS 2016]:
it minimizes the Kullback-Leibler divergence between the target distribution
and its approximation by implementing a form of functional gradient descent on
a reproducing kernel Hilbert space. In this paper, we accelerate and generalize
the SVGD algorithm by including second-order information, thereby approximating
a Newton-like iteration in function space. We also show how second-order
information can lead to more effective choices of kernel. We observe
significant computational gains over the original SVGD algorithm in multiple
test cases.Comment: 18 pages, 7 figure