Search CORE

12,096 research outputs found

Gradient-free Hamiltonian Monte Carlo with Efficient Kernel Exponential Families

Author: Gretton Arthur
Livingstone Samuel
Sejdinovic Dino
Strathmann Heiko
Szabo Zoltan
Publication venue
Publication date: 01/01/2015
Field of study

We propose Kernel Hamiltonian Monte Carlo (KMC), a gradient-free adaptive MCMC algorithm based on Hamiltonian Monte Carlo (HMC). On target densities where classical HMC is not an option due to intractable gradients, KMC adaptively learns the target's gradient structure by fitting an exponential family model in a Reproducing Kernel Hilbert Space. Computational costs are reduced by two novel efficient approximations to this gradient. While being asymptotically exact, KMC mimics HMC in terms of sampling efficiency, and offers substantial mixing improvements over state-of-the-art gradient free samplers. We support our claims with experimental studies on both toy and real-world applications, including Approximate Bayesian Computation and exact-approximate MCMC.Comment: 20 pages, 7 figure

arXiv.org e-Print Archive

CiteSeerX

UCL Discovery

Oxford University Research Archive

On Data-Dependent Random Features for Improved Generalization in Supervised Learning

Author: Beirami Ahmad
Shahrampour Shahin
Tarokh Vahid
Publication venue
Publication date: 19/12/2017
Field of study

The randomized-feature approach has been successfully employed in large-scale kernel approximation and supervised learning. The distribution from which the random features are drawn impacts the number of features required to efficiently perform a learning task. Recently, it has been shown that employing data-dependent randomization improves the performance in terms of the required number of random features. In this paper, we are concerned with the randomized-feature approach in supervised learning for good generalizability. We propose the Energy-based Exploration of Random Features (EERF) algorithm based on a data-dependent score function that explores the set of possible features and exploits the promising regions. We prove that the proposed score function with high probability recovers the spectrum of the best fit within the model class. Our empirical results on several benchmark datasets further verify that our method requires smaller number of random features to achieve a certain generalization error compared to the state-of-the-art while introducing negligible pre-processing overhead. EERF can be implemented in a few lines of code and requires no additional tuning parameters.Comment: 12 pages; (pages 1-8) to appear in Proc. of AAAI Conference on Artificial Intelligence (AAAI), 201

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications