
    Exponential Family Estimation via Adversarial Dynamics Embedding

    We present an efficient algorithm for maximum likelihood estimation (MLE) of exponential family models, with a general parametrization of the energy function that includes neural networks. We exploit the primal-dual view of the MLE with a kinetics augmented model to obtain an estimate associated with an adversarial dual sampler. To represent this sampler, we introduce a novel neural architecture, dynamics embedding, that generalizes Hamiltonian Monte-Carlo (HMC). The proposed approach inherits the flexibility of HMC while enabling tractable entropy estimation for the augmented model. By learning both a dual sampler and the primal model simultaneously, and sharing parameters between them, we obviate the requirement to design a separate sampling procedure once the model has been trained, leading to more effective learning. We show that many existing estimators, such as contrastive divergence, pseudo/composite-likelihood, score matching, minimum Stein discrepancy estimator, non-local contrastive objectives, noise-contrastive estimation, and minimum probability flow, are special cases of the proposed approach, each expressed by a different (fixed) dual sampler. An empirical investigation shows that adapting the sampler during MLE can significantly improve on state-of-the-art estimators.
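
    As a rough sketch of the primal-dual view mentioned above (omitting the kinetics augmentation), the MLE of an energy-based model p_theta(x) proportional to exp(f_theta(x)) can be rewritten, via the variational form of the log-partition function, as a saddle-point problem over a dual sampler q:

        \max_\theta \; \mathbb{E}_{x \sim p_{\mathrm{data}}}[f_\theta(x)] - \log Z(\theta),
        \qquad
        \log Z(\theta) = \max_q \; \mathbb{E}_{x \sim q}[f_\theta(x)] + H(q),

        \Longrightarrow \quad
        \max_\theta \min_q \; \mathbb{E}_{x \sim p_{\mathrm{data}}}[f_\theta(x)] - \mathbb{E}_{x \sim q}[f_\theta(x)] - H(q).

    In this reading, the dynamics embedding parametrizes q so that the entropy term H(q) remains tractable, and the fixed-sampler estimators listed in the abstract correspond to particular fixed choices of q.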

    The bracket geometry of statistics

    In this thesis we build a geometric theory of Hamiltonian Monte Carlo, with an emphasis on symmetries and its bracket generalisations, construct the canonical geometry of smooth measures and Stein operators, and derive the complete recipe of measure-constraints preserving dynamics and diffusions on arbitrary manifolds. Specifically, we will explain the central role played by mechanics with symmetries to obtain efficient numerical integrators, and provide a general method to construct explicit integrators for HMC on geodesic orbit manifolds via symplectic reduction. Following ideas developed by Maxwell, Volterra, Poincaré, de Rham, Koszul, Dufour, Weinstein, and others, we will then show that any smooth distribution generates considerable geometric content, including "musical" isomorphisms between multi-vector fields and twisted differential forms, and a boundary operator, the rotationnel, which, in particular, engenders the canonical Stein operator. We then introduce the "bracket formalism" and its induced mechanics, a generalisation of Poisson mechanics and gradient flows that provides a general mechanism to associate unnormalised probability densities to flows depending on the score pointwise. Most importantly, we will characterise all measure-constraints preserving flows on arbitrary manifolds, showing the intimate relation between measure-preserving Nambu mechanics and closed twisted forms. Our results are canonical. As a special case we obtain the characterisation of measure-preserving bracket mechanical systems and measure-preserving diffusions, thus explaining and extending to manifolds the complete recipe of SGMCMC. We will discuss the geometry of Stein operators and extend the density approach by showing these are simply a reformulation of the exterior derivative on twisted forms satisfying Stokes' theorem. Combining the canonical Stein operator with brackets allows us to naturally recover the Riemannian and diffusion Stein operators as special cases. Finally, we shall introduce the minimum Stein discrepancy estimators, which provide a unifying perspective of parameter inference based on score matching, contrastive divergence, and minimum probability flow.
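
    For orientation, the familiar Euclidean (Langevin) special case of the Stein operators and minimum Stein discrepancy estimators referred to above can be sketched as follows; the thesis itself works with much more general manifold-valued and bracket-induced versions.

        (\mathcal{A}_p f)(x) = \langle \nabla \log p(x), f(x) \rangle + \nabla \cdot f(x),
        \qquad
        \mathbb{E}_{X \sim p}[(\mathcal{A}_p f)(X)] = 0 \;\;\text{(under mild boundary conditions)},

        \mathrm{SD}(q \,\|\, p_\theta) = \sup_{f \in \mathcal{F}} \mathbb{E}_{X \sim q}[(\mathcal{A}_{p_\theta} f)(X)],
        \qquad
        \hat{\theta}_n = \arg\min_\theta \mathrm{SD}(\hat{q}_n \,\|\, p_\theta),

    where \hat{q}_n is the empirical distribution of the data. Since the operator depends on p_theta only through the score \nabla \log p_\theta, the estimator never requires the normalising constant, which is what makes this family suitable for unnormalised models.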

    Generalized SURE for Exponential Families: Applications to Regularization

    Stein's unbiased risk estimate (SURE) was proposed by Stein for the independent, identically distributed (iid) Gaussian model in order to derive estimates that dominate least-squares (LS). In recent years, the SURE criterion has been employed in a variety of denoising problems for choosing regularization parameters that minimize an estimate of the mean-squared error (MSE). However, its use has been limited to the iid case which precludes many important applications. In this paper we begin by deriving a SURE counterpart for general, not necessarily iid distributions from the exponential family. This enables extending the SURE design technique to a much broader class of problems. Based on this generalization we suggest a new method for choosing regularization parameters in penalized LS estimators. We then demonstrate its superior performance over the conventional generalized cross validation approach and the discrepancy method in the context of image deblurring and deconvolution. The SURE technique can also be used to design estimates without predefining their structure. However, allowing for too many free parameters impairs the performance of the resulting estimates. To address this inherent tradeoff we propose a regularized SURE objective. Based on this design criterion, we derive a wavelet denoising strategy that is similar in spirit to the standard soft-threshold approach but can lead to improved MSE performance.
    Comment: to appear in the IEEE Transactions on Signal Processing.
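
    As a concrete reference point, the sketch below illustrates the classical iid-Gaussian SURE that the paper generalizes, used to pick a soft-threshold level by minimizing the risk estimate. The toy signal, noise level, and grid of candidate thresholds are illustrative assumptions, not taken from the paper.

        import numpy as np

        def soft_threshold(y, lam):
            """Soft-thresholding estimator h_lam(y)_i = sign(y_i) * max(|y_i| - lam, 0)."""
            return np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)

        def sure_soft_threshold(y, lam, sigma):
            """Stein's unbiased risk estimate of E||h_lam(y) - x||^2 for y ~ N(x, sigma^2 I).

            SURE = -n*sigma^2 + ||h(y) - y||^2 + 2*sigma^2 * div h(y),
            and for soft thresholding the divergence is #{i : |y_i| > lam}.
            """
            n = y.size
            residual = soft_threshold(y, lam) - y
            divergence = np.sum(np.abs(y) > lam)
            return -n * sigma**2 + np.sum(residual**2) + 2 * sigma**2 * divergence

        # Toy example: sparse signal observed in iid Gaussian noise.
        rng = np.random.default_rng(0)
        x = np.zeros(1000)
        x[:50] = rng.normal(0, 5, size=50)          # a few large coefficients
        sigma = 1.0
        y = x + rng.normal(0, sigma, size=x.size)

        # Choose the threshold that minimizes SURE over a grid of candidates.
        grid = np.linspace(0.0, 5.0, 101)
        risks = [sure_soft_threshold(y, lam, sigma) for lam in grid]
        lam_star = grid[int(np.argmin(risks))]
        x_hat = soft_threshold(y, lam_star)
        print(f"SURE-selected threshold: {lam_star:.2f}, "
              f"MSE: {np.mean((x_hat - x)**2):.3f} vs noisy: {np.mean((y - x)**2):.3f}")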

    A Riemannian-Stein Kernel Method

    This paper presents a theoretical analysis of numerical integration based on interpolation with a Stein kernel. In particular, the case of integrals with respect to a posterior distribution supported on a general Riemannian manifold is considered and the asymptotic convergence of the estimator in this context is established. Our results are considerably stronger than those previously reported, in that the optimal rate of convergence is established under a basic Sobolev-type assumption on the integrand. The theoretical results are empirically verified on the sphere $\mathbb{S}^2$.
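
    To give a flavour of Stein-kernel interpolation for integration, the following is a rough one-dimensional Euclidean analogue (the Riemannian construction studied in the paper is not reproduced here). The IMQ base kernel, the standard normal target, the jitter, and the node count are illustrative assumptions.

        import numpy as np

        def imq_stein_kernel(x, y, score, c=1.0, beta=0.5):
            """Langevin-Stein kernel built from an IMQ base kernel k(x,y) = (c^2 + (x-y)^2)^(-beta).

            k0(x,y) = d^2k/dxdy + s(x)*dk/dy + s(y)*dk/dx + s(x)*s(y)*k(x,y),
            where s is the score (derivative of the log target density).
            x, y are 1-D arrays of points; returns the len(x)-by-len(y) kernel matrix.
            """
            r = x[:, None] - y[None, :]
            u = c**2 + r**2
            k = u**(-beta)
            dk_dx = -2.0 * beta * r * u**(-beta - 1)
            dk_dy = -dk_dx
            d2k = 2.0 * beta * u**(-beta - 1) - 4.0 * beta * (beta + 1) * r**2 * u**(-beta - 2)
            sx = score(x)[:, None]
            sy = score(y)[None, :]
            return d2k + sx * dk_dy + sy * dk_dx + sx * sy * k

        # Toy example: estimate E[f(X)] for X ~ N(0, 1) via Stein-kernel interpolation.
        rng = np.random.default_rng(1)
        score = lambda x: -x                     # score of the standard normal target
        xs = rng.normal(size=50)                 # interpolation nodes (here: samples from the target)
        f = xs**2                                # integrand; true expectation is 1

        K0 = imq_stein_kernel(xs, xs, score) + 1e-8 * np.eye(xs.size)   # jitter for stability
        w = np.linalg.solve(K0, np.ones(xs.size))
        # Each k0(., x_i) integrates to zero under the target, so the integral estimate is the
        # constant term of the interpolant f ~= beta + sum_i a_i * k0(., x_i).
        estimate = w @ f / w.sum()
        print(f"Stein-kernel estimate: {estimate:.3f} (plain Monte Carlo: {f.mean():.3f})")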

    Shrinkage Estimators in Online Experiments

    We develop and analyze empirical Bayes Stein-type estimators for use in the estimation of causal effects in large-scale online experiments. While online experiments are generally thought to be distinguished by their large sample size, we focus on the multiplicity of treatment groups. The typical analysis practice is to use simple differences-in-means (perhaps with covariate adjustment) as if all treatment arms were independent. In this work we develop consistent, small bias, shrinkage estimators for this setting. In addition to achieving lower mean squared error, these estimators retain important frequentist properties such as coverage under most reasonable scenarios. Modern sequential methods of experimentation and optimization such as multi-armed bandit optimization (where treatment allocations adapt over time to prior responses) benefit from the use of our shrinkage estimators. Exploration under empirical Bayes focuses more efficiently on near-optimal arms, improving the resulting decisions made under uncertainty. We demonstrate these properties by examining seventeen large-scale experiments conducted on Facebook from April to June 2017.
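
    A generic empirical-Bayes (normal-normal) shrinkage sketch for many treatment arms is shown below; it conveys the idea of shrinking noisy per-arm estimates toward a common mean, but the method-of-moments prior-variance estimate and all variable names are illustrative assumptions rather than the paper's exact estimator.

        import numpy as np

        def eb_shrink(means, std_errs):
            """Empirical-Bayes shrinkage of per-arm effect estimates.

            Each arm estimate is modeled as means[k] ~ N(theta_k, std_errs[k]^2) with
            theta_k ~ N(mu, tau^2); mu and tau^2 are estimated from the data and each
            arm is shrunk toward mu in proportion to its noise.
            """
            means = np.asarray(means, dtype=float)
            v = np.asarray(std_errs, dtype=float) ** 2
            mu = np.average(means, weights=1.0 / v)                 # precision-weighted grand mean
            tau2 = max(np.mean((means - mu) ** 2 - v), 0.0)         # method-of-moments, floored at 0
            shrinkage = v / (v + tau2)                               # noisier arms are shrunk more
            return shrinkage * mu + (1.0 - shrinkage) * means

        # Toy example: 20 arms, effects near zero, observed with noise.
        rng = np.random.default_rng(2)
        true_effects = rng.normal(0.0, 0.5, size=20)
        se = np.full(20, 1.0)
        observed = true_effects + rng.normal(0.0, se)

        shrunk = eb_shrink(observed, se)
        print("raw MSE:   ", np.mean((observed - true_effects) ** 2))
        print("shrunk MSE:", np.mean((shrunk - true_effects) ** 2))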

    Minimum L^q-distance estimators for non-normalized parametric models

    We propose and investigate a new estimation method for the parameters of models consisting of smooth density functions on the positive half axis. The procedure is based on a recently introduced characterization result for the respective probability distributions, and is to be classified as a minimum distance estimator, incorporating as a distance function the L^q-norm. Throughout, we deal rigorously with issues of existence and measurability of these implicitly defined estimators. Moreover, we provide consistency results in a common asymptotic setting, and compare our new method with classical estimators for the exponential, the Rayleigh and the Burr Type XII distribution in Monte Carlo simulation studies. We also assess the performance of different estimators for non-normalized models in the context of an exponential-polynomial family.
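
    The paper's distance is built from a specific characterization result that is not reproduced here. As a purely illustrative stand-in for the minimum-distance principle, the sketch below fits an exponential model by minimizing an L^q distance between the empirical and model distribution functions (a Cramér-von Mises-type minimum-distance estimator); the grid, bounds, and sample sizes are assumptions.

        import numpy as np
        from scipy.optimize import minimize_scalar

        def lq_distance_to_exponential(lam, x, q=2.0):
            """L^q discrepancy between the empirical CDF of x and the Exp(lam) CDF.

            A generic Cramer-von Mises-type minimum-distance criterion, used only to
            illustrate the principle; NOT the characterization-based distance in the paper.
            """
            xs = np.sort(x)
            n = xs.size
            ecdf = np.arange(1, n + 1) / n
            model_cdf = 1.0 - np.exp(-lam * xs)
            return np.mean(np.abs(ecdf - model_cdf) ** q)

        # Toy example: estimate the rate of an exponential sample.
        rng = np.random.default_rng(3)
        x = rng.exponential(scale=1.0 / 2.5, size=500)   # true rate lambda = 2.5

        res = minimize_scalar(lq_distance_to_exponential, bounds=(1e-3, 20.0),
                              args=(x,), method="bounded")
        print(f"minimum-distance estimate: {res.x:.3f}, MLE (1/mean): {1.0 / x.mean():.3f}")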