12,096 research outputs found
Gradient-free Hamiltonian Monte Carlo with Efficient Kernel Exponential Families
We propose Kernel Hamiltonian Monte Carlo (KMC), a gradient-free adaptive
MCMC algorithm based on Hamiltonian Monte Carlo (HMC). On target densities
where classical HMC is not an option due to intractable gradients, KMC
adaptively learns the target's gradient structure by fitting an exponential
family model in a Reproducing Kernel Hilbert Space. Computational costs are
reduced by two novel efficient approximations to this gradient. While being
asymptotically exact, KMC mimics HMC in terms of sampling efficiency, and
offers substantial mixing improvements over state-of-the-art gradient free
samplers. We support our claims with experimental studies on both toy and
real-world applications, including Approximate Bayesian Computation and
exact-approximate MCMC.Comment: 20 pages, 7 figure
On Data-Dependent Random Features for Improved Generalization in Supervised Learning
The randomized-feature approach has been successfully employed in large-scale
kernel approximation and supervised learning. The distribution from which the
random features are drawn impacts the number of features required to
efficiently perform a learning task. Recently, it has been shown that employing
data-dependent randomization improves the performance in terms of the required
number of random features. In this paper, we are concerned with the
randomized-feature approach in supervised learning for good generalizability.
We propose the Energy-based Exploration of Random Features (EERF) algorithm
based on a data-dependent score function that explores the set of possible
features and exploits the promising regions. We prove that the proposed score
function with high probability recovers the spectrum of the best fit within the
model class. Our empirical results on several benchmark datasets further verify
that our method requires smaller number of random features to achieve a certain
generalization error compared to the state-of-the-art while introducing
negligible pre-processing overhead. EERF can be implemented in a few lines of
code and requires no additional tuning parameters.Comment: 12 pages; (pages 1-8) to appear in Proc. of AAAI Conference on
Artificial Intelligence (AAAI), 201
- …