222,535 research outputs found
Minimum Rates of Approximate Sufficient Statistics
Given a sufficient statistic for a parametric family of distributions, one
can estimate the parameter without access to the data. However, the memory or
code size for storing the sufficient statistic may nonetheless still be
prohibitive. Indeed, for $n$ independent samples drawn from a $k$-nomial
distribution with $d = k - 1$ degrees of freedom, the length of the code scales
as $d \log n + O(1)$. In many applications, we may not have a useful notion of
sufficient statistics (e.g., when the parametric family is not an exponential
family) and we also may not need to reconstruct the generating distribution
exactly. By adopting a Shannon-theoretic approach in which we allow a small
error in estimating the generating distribution, we construct various {\em
approximate sufficient statistics} and show that the code length can be reduced
to $\frac{d}{2} \log n + O(1)$. We consider errors measured according to the
relative entropy and variational distance criteria. For the code constructions,
we leverage Rissanen's minimum description length principle, which yields a
non-vanishing error measured according to the relative entropy. For the
converse parts, we use Clarke and Barron's formula for the relative entropy of
a parametrized distribution and the corresponding mixture distribution.
However, this method only yields a weak converse for the variational distance.
We develop new techniques to achieve vanishing errors and we also prove strong
converses. The latter means that even if the code is allowed to have a
non-vanishing error, its length must still be at least $\frac{d}{2} \log n$.
Comment: To appear in the IEEE Transactions on Information Theory.
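A compact restatement of the rate contrast may help, using the abstract's notation ($n$ samples, $d$ degrees of freedom); the labels $\ell_{\mathrm{exact}}$ and $\ell_{\mathrm{approx}}$ and the worked instance are illustrative additions, not the paper's notation:

```latex
% Code lengths for storing the statistic (constants suppressed):
%   exact sufficient statistic:        d log n + O(1)
%   approximate sufficient statistic: (d/2) log n + O(1)
\[
  \ell_{\mathrm{exact}}(n) = d \log n + O(1),
  \qquad
  \ell_{\mathrm{approx}}(n) = \frac{d}{2} \log n + O(1).
\]
% Worked instance (logarithms base 2, so lengths are in bits):
% a trinomial family has k = 3 and d = k - 1 = 2, so with n = 2^{20}
% samples an exact sufficient statistic needs roughly 2 * 20 = 40 bits,
% while an approximate one needs only roughly 20 bits.
```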
Approximate entropy as an indicator of non-linearity in self paced voluntary finger movement EEG
This study investigates indications of non-linear dynamic structure in electroencephalogram (EEG) signals. The iterative amplitude-adjusted surrogate data method, together with seven non-linear test statistics (third-order autocorrelation, asymmetry due to time reversal, the delay vector variance method, correlation dimension, largest Lyapunov exponent, non-linear prediction error, and approximate entropy), was used to analyse EEG data recorded during self-paced voluntary finger movement. The results demonstrate clear indications of non-linearity in the EEG signals. However, the rate at which the null hypothesis of linearity was rejected varied across parameter settings, demonstrating the significance of the embedding dimension and time lag parameters for capturing the underlying non-linear dynamics in the signals. Across the non-linear test statistics, the highest degree of non-linearity was indicated by the approximate entropy (ApEn) feature, regardless of the parameter settings.
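Since approximate entropy is the abstract's headline statistic, a minimal NumPy sketch of the standard ApEn computation may be useful. This is Pincus's statistic in its textbook form, not the authors' exact pipeline; the defaults m = 2 and r = 0.2 × std are common conventions assumed here:

```python
import numpy as np

def approximate_entropy(u, m=2, r=None):
    """Approximate entropy (ApEn) of a 1-D series.

    m is the embedding dimension and r the tolerance; r defaults to
    0.2 * std(u), a common (but not universal) convention.
    """
    u = np.asarray(u, dtype=float)
    if r is None:
        r = 0.2 * u.std()

    def phi(m):
        n = len(u) - m + 1
        # All length-m delay vectors of the series.
        x = np.array([u[i:i + m] for i in range(n)])
        # Chebyshev distance between every pair of vectors.
        d = np.max(np.abs(x[:, None, :] - x[None, :, :]), axis=2)
        # Fraction of vectors within tolerance r of each template
        # (self-matches included, as in the standard definition).
        c = np.mean(d <= r, axis=1)
        return np.mean(np.log(c))

    return phi(m) - phi(m + 1)

# A regular signal yields lower ApEn than white noise.
t = np.linspace(0, 10 * np.pi, 1000)
print(approximate_entropy(np.sin(t)))              # small ApEn
print(approximate_entropy(np.random.randn(1000)))  # larger ApEn
```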
Boosting Variational Inference: an Optimization Perspective
Variational inference is a popular technique to approximate a possibly
intractable Bayesian posterior with a more tractable one. Recently, boosting
variational inference has been proposed as a new paradigm to approximate the
posterior by a mixture of densities by greedily adding components to the
mixture. However, as is the case with many other variational inference
algorithms, its theoretical properties have not been studied. In the present
work, we study the convergence properties of this approach from a modern
optimization viewpoint by establishing connections to the classic Frank-Wolfe
algorithm. Our analysis yields novel theoretical insights regarding the
sufficient conditions for convergence, explicit rates, and algorithmic
simplifications. Since much of the focus in previous work on variational
inference has been on tractability, our work is especially important as a
much-needed attempt to bridge the gap between probabilistic models and their
theoretical properties.
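To make the Frank-Wolfe connection concrete, here is a minimal one-dimensional sketch of boosting variational inference as a Frank-Wolfe loop: each round, a linear minimization oracle picks a new mixture component, which is blended in with step size 2/(t+2). The bimodal target, the candidate grid of Gaussian means, the fixed bandwidth, and the Monte Carlo oracle are illustrative assumptions, not the paper's algorithm:

```python
import numpy as np
from scipy import stats

def log_p(x):  # toy bimodal target, known up to a constant
    return np.logaddexp(stats.norm.logpdf(x, -2, 0.5),
                        stats.norm.logpdf(x, 2, 0.8))

means = np.linspace(-5, 5, 41)   # candidate component locations
sigma = 0.5                      # fixed component bandwidth
weights, comps = [], []          # current mixture state

def log_q(x):
    logs = [np.log(w) + stats.norm.logpdf(x, m, sigma)
            for w, m in zip(weights, comps)]
    return np.logaddexp.reduce(np.array(logs), axis=0)

for t in range(25):
    # Linear minimization oracle: pick the candidate s minimizing a
    # Monte Carlo estimate of E_s[log q(x) - log p(x)], the linearized
    # KL objective at the current mixture.
    scores = []
    for m in means:
        xs = np.random.normal(m, sigma, 200)
        lq = log_q(xs) if comps else np.full(200, -1e3)
        scores.append(np.mean(lq - log_p(xs)))
    best = means[int(np.argmin(scores))]

    # Classic Frank-Wolfe step size; the first round takes a full step.
    gamma = 1.0 if not comps else 2.0 / (t + 2.0)
    weights = [w * (1 - gamma) for w in weights] + [gamma]
    comps.append(best)

print(sorted(set(np.round(comps, 1))))  # components cluster near -2 and 2
```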
Fixed-Form Variational Posterior Approximation through Stochastic Linear Regression
We propose a general algorithm for approximating nonstandard Bayesian
posterior distributions. The algorithm minimizes the Kullback-Leibler
divergence of an approximating distribution to the intractable posterior
distribution. Our method can be used to approximate any posterior distribution,
provided that it is given in closed form up to the proportionality constant.
The approximation can be any distribution in the exponential family or any
mixture of such distributions, which means that it can be made arbitrarily
precise. Several examples illustrate the speed and accuracy of our
approximation method in practice.
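A minimal sketch of the idea, under stated assumptions: fit a Gaussian q(x) ∝ exp(η · T(x)) with sufficient statistics T(x) = (x, x²) to an unnormalized log posterior by repeatedly regressing log p(x) on T(x) under samples from the current q; the regression slopes estimate the natural parameters of the best exponential-family fit. The toy target, damping factor, and iteration budget are illustrative, not the paper's settings:

```python
import numpy as np

def log_p(x):  # toy non-Gaussian target, known up to a constant
    return -0.5 * x**2 + 0.5 * np.sin(2 * x)

eta = np.array([0.0, -0.5])  # natural params of q: (mu/var, -1/(2 var))
rng = np.random.default_rng(0)

for t in range(2000):
    # Current Gaussian implied by the natural parameters.
    var = -0.5 / eta[1]
    mu = eta[0] * var
    x = rng.normal(mu, np.sqrt(var), size=64)

    # Stochastic linear regression of log p(x) on (1, x, x^2);
    # the slope coefficients estimate the natural parameters.
    T = np.column_stack([np.ones_like(x), x, x**2])
    coef, *_ = np.linalg.lstsq(T, log_p(x), rcond=None)

    # Damped update toward the regression estimate.
    step = 0.05
    eta = (1 - step) * eta + step * coef[1:]

var = -0.5 / eta[1]
print("fitted mean and variance:", eta[0] * var, var)
```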
Reliable ABC model choice via random forests
Approximate Bayesian computation (ABC) methods provide an elaborate approach
to Bayesian inference on complex models, including model choice. Both
theoretical arguments and simulation experiments indicate, however, that model
posterior probabilities may be poorly evaluated by standard ABC techniques. We
propose a novel approach based on a machine learning tool named random forests
to conduct selection among the highly complex models covered by ABC algorithms.
We thus modify the way Bayesian model selection is both understood and
operated, in that we rephrase the inferential goal as a classification problem,
first predicting the model that best fits the data with random forests and
postponing the approximation of the posterior probability of the predicted MAP
for a second stage also relying on random forests. Compared with earlier
implementations of ABC model choice, the ABC random forest approach offers
several potential improvements: (i) it often has a larger discriminative power
among the competing models, (ii) it is more robust against the number and
choice of statistics summarizing the data, (iii) the computing effort is
drastically reduced (with a gain in computational efficiency of at least a
factor of fifty),
and (iv) it includes an approximation of the posterior probability of the
selected model. The call to random forests will undoubtedly extend the range of
size of datasets and complexity of models that ABC can handle. We illustrate
the power of this novel methodology by analyzing controlled experiments as well
as genuine population genetics datasets. The proposed methodologies are
implemented in the R package abcrf, available on CRAN.
Comment: 39 pages, 15 figures, 6 tables
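A minimal sketch of the classification view of ABC model choice, using scikit-learn rather than the abcrf package: simulate summary statistics under each candidate model with parameters drawn from the prior, train a random forest to predict the model index, then classify the observed summaries. The two toy models and the choice of summaries are assumptions for illustration:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
n_sim = 5000

def summaries(x):
    # Summary statistics of a simulated dataset.
    return np.array([x.mean(), x.std(), np.median(x)])

X, y = [], []
for _ in range(n_sim):
    # Model 0: Gaussian likelihood; Model 1: Laplace likelihood,
    # both with a prior-drawn location parameter.
    mu = rng.normal(0, 2)
    x0 = rng.normal(mu, 1, size=100)
    x1 = rng.laplace(mu, 1, size=100)
    X += [summaries(x0), summaries(x1)]
    y += [0, 1]

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(np.array(X), np.array(y))

x_obs = rng.laplace(0.5, 1, size=100)      # pseudo-observed data
s_obs = summaries(x_obs).reshape(1, -1)
print("predicted model:", clf.predict(s_obs)[0])
print("vote proportions:", clf.predict_proba(s_obs)[0])
```

Note that the vote proportions printed above are not the posterior probability of the selected model; the abstract describes a separate second-stage random forest (regression) for that estimate.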
How has the UK corporation tax raised so much revenue?
We analyse a puzzle in the UK corporation tax: by both historic and international standards, corporation tax revenues have been high while the statutory rate has been low. Possible explanations include the following: changes in tax law may have increased effective tax rates; other factors, such as higher profitability or different macro-economic conditions, may have led to higher effective tax rates; and the size of the corporate sector may have increased. We find evidence for all three explanations, although none would be sufficient in itself. To the extent that higher profits, particularly financial-sector profits, may have led to high revenues, there are doubts as to whether revenues will continue to be so strong.
Gradient-free Hamiltonian Monte Carlo with Efficient Kernel Exponential Families
We propose Kernel Hamiltonian Monte Carlo (KMC), a gradient-free adaptive
MCMC algorithm based on Hamiltonian Monte Carlo (HMC). On target densities
where classical HMC is not an option due to intractable gradients, KMC
adaptively learns the target's gradient structure by fitting an exponential
family model in a Reproducing Kernel Hilbert Space. Computational costs are
reduced by two novel efficient approximations to this gradient. While being
asymptotically exact, KMC mimics HMC in terms of sampling efficiency, and
offers substantial mixing improvements over state-of-the-art gradient-free
samplers. We support our claims with experimental studies on both toy and
real-world applications, including Approximate Bayesian Computation and
exact-approximate MCMC.
Comment: 20 pages, 7 figures
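A minimal sketch in the spirit of KMC: run HMC whose leapfrog integrator uses a surrogate gradient learned from the chain's history, while the accept/reject step needs only exact log-density evaluations, so the sampler stays asymptotically exact (leapfrog is volume-preserving and reversible for any gradient field). As a simplification, the surrogate here is the gradient of a Gaussian KDE over past samples, a crude stand-in for the paper's kernel exponential-family score fit; the target, step sizes, and bandwidth are assumptions:

```python
import numpy as np

def log_target(x):  # evaluable, but treat its gradient as unavailable
    return -0.25 * x**4 + x**2

def surrogate_grad(x, hist, h=0.5):
    # Gradient of the log of a Gaussian KDE built on history points.
    d = x - hist
    w = np.exp(-0.5 * (d / h) ** 2)
    return -(w @ d) / (h**2 * (w.sum() + 1e-12))

rng = np.random.default_rng(0)
x, hist, samples = 0.0, [0.0], []

for it in range(2000):
    window = np.array(hist[-500:])       # recent history only, for speed
    p = rng.normal()
    x_new, p_new = x, p
    for _ in range(10):                  # leapfrog with the surrogate
        p_new += 0.05 * surrogate_grad(x_new, window)
        x_new += 0.1 * p_new
        p_new += 0.05 * surrogate_grad(x_new, window)
    # Exact Metropolis correction: only log_target evaluations needed.
    log_a = (log_target(x_new) - 0.5 * p_new**2) \
          - (log_target(x) - 0.5 * p**2)
    if np.log(rng.uniform()) < log_a:
        x = x_new
    hist.append(x)
    samples.append(x)

print("mean estimate (symmetric target, expect ~0):",
      np.mean(samples[500:]))
```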