3 research outputs found

    Bayesian Lasso Posterior Sampling via Parallelized Measure Transport

    It is well known that the Lasso can be interpreted as a Bayesian posterior mode estimate under a Laplacian prior. Sampling from the full posterior distribution, the Bayesian Lasso, confers major performance advantages over the Lasso point estimate alone. Traditionally, the Bayesian Lasso is implemented via Gibbs sampling methods, which suffer from a lack of scalability, unknown convergence rates, and samples that are necessarily correlated. We provide a measure transport approach that generates i.i.d. samples from the posterior by constructing a transport map transforming a sample from the Laplacian prior into a sample from the posterior. We show how the construction of this transport map can be parallelized into modules that iteratively solve Lasso problems and perform closed-form linear algebra updates. With this posterior sampling method, we perform maximum likelihood estimation of the Lasso regularization parameter via the EM algorithm. We provide comparisons to traditional Gibbs samplers on the diabetes dataset of Efron et al. Lastly, we give an example implementation on a parallel computing system, a graphics processing unit, whose execution time depends far less on dimension than that of a standard implementation.
    Comment: 20 pages, 6 figures
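
    The paper's parallelized map construction is not reproduced here, but the underlying transport idea is easy to illustrate in one dimension, where the monotone transport map is the posterior inverse CDF composed with the prior CDF. Below is a minimal sketch assuming a scalar observation y = x + Gaussian noise with a Laplace prior on x; all names and parameter values are illustrative, not taken from the paper.

```python
import numpy as np
from scipy.stats import laplace

# Illustrative 1-D setup (not the paper's parallelized construction):
# y = x + Gaussian noise, with a Laplace(0, 1/lam) prior on x.
lam, sigma, y = 1.0, 0.5, 1.2

# Unnormalized posterior density on a grid, and its CDF.
grid = np.linspace(-10.0, 10.0, 20001)
log_post = -(y - grid) ** 2 / (2 * sigma**2) - lam * np.abs(grid)
post = np.exp(log_post - log_post.max())
post_cdf = np.cumsum(post)
post_cdf /= post_cdf[-1]

def transport(z):
    """Monotone 1-D transport map T = F_posterior^{-1} o F_prior."""
    u = laplace.cdf(z, scale=1.0 / lam)          # prior CDF: (0, 1)
    return np.interp(u, post_cdf, grid)          # posterior inverse CDF

# i.i.d. draws from the prior, pushed through T, become i.i.d.
# draws from the posterior -- no Markov chain, no sample correlation.
z = laplace.rvs(scale=1.0 / lam, size=10_000, random_state=0)
x = transport(z)
```

    In higher dimensions no such closed-form composition exists, which is where the paper's iterative Lasso-solve and linear-algebra modules come in; the 1-D case only shows why transporting prior samples yields exact, independent posterior samples.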

    Concentration of information content for convex measures

    We establish sharp exponential deviation estimates for the information content, as well as a sharp bound on the varentropy, for the class of convex measures on Euclidean spaces. This generalizes a similar development for log-concave measures in the recent work of Fradelizi, Madiman and Wang (2016). In particular, our results imply that convex measures in high dimensions are concentrated in an annulus between two convex sets (as in the log-concave case), despite possibly having much heavier tails. Various tools and consequences are developed, including a sharp comparison result for Rényi entropies, inequalities of Kahane-Khinchine type for convex measures that extend those of Koldobsky, Pajor and Yaskin (2008) for log-concave measures, and an extension of Berwald's inequality (1947).
    Comment: Added some references
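
    For context, the central quantities have standard definitions not specific to this paper: for a random vector X with density f on R^n, the information content is the random variable -log f(X), its mean is the (differential) entropy, and its variance is the varentropy. In standard notation (a sketch, with symbols chosen here rather than taken from the paper):

```latex
% Standard definitions; notation is illustrative, not the paper's.
\[
  \widetilde{h}(X) = -\log f(X), \qquad
  h(X) = \mathbb{E}\,\widetilde{h}(X), \qquad
  V(X) = \operatorname{Var}\,\widetilde{h}(X).
\]
% The deviation estimates control |\widetilde{h}(X) - h(X)|, i.e., how
% far the realized information content strays from the entropy.
```

    Exponential control of the deviation says f(X) is close to e^{-h(X)} with high probability, trapping X between two superlevel sets of f; since superlevel sets of the density of a convex measure are convex, this is the annulus statement in the abstract.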

    Construction and Analysis of Posterior Matching in Arbitrary Dimensions via Optimal Transport

    The posterior matching scheme, for feedback encoding of a message point lying on the unit interval over memoryless channels, maximizes mutual information for an arbitrary number of channel uses. However, this does not by itself guarantee any positive rate; so far, elaborate analyses have been required to show that the scheme achieves rates below capacity. More recent efforts have introduced a random "dither," shared by the encoder and decoder, into the problem formulation to simplify the analysis and guarantee that the randomized scheme achieves any rate below capacity. Motivated by applications (e.g. human-computer interfaces) where (a) common randomness shared by the encoder and decoder may not be feasible and (b) the message point lies in a higher-dimensional space, we focus here on the original formulation without common randomness, and use optimal transport theory to generalize the scheme to a message point in a higher-dimensional space. By defining a stricter, almost-sure notion of message decoding, we use classical probabilistic techniques (e.g. change of measure and martingale convergence) to establish a succinct necessary and sufficient condition for when the message point can be recovered from infinitely many observations: Birkhoff ergodicity of a random process sequentially generated by the encoder. We also show a surprising "all or nothing" result: the same ergodicity condition is necessary and sufficient to achieve any rate below capacity. We provide applications of this message-point framework in human-computer interfaces and multi-antenna communications.
    Comment: Submitted to the IEEE Transactions on Information Theory
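
    The one-dimensional scheme being generalized is simple to state: at each channel use, the encoder evaluates the posterior CDF of the message point at its true value (yielding a uniform variable) and pushes the result through the inverse of the capacity-achieving input CDF. A minimal grid-based sketch over an AWGN channel follows; it illustrates the classical scheme only, not the paper's higher-dimensional optimal-transport construction, and all parameter values are illustrative.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
P, N0 = 1.0, 1.0                      # power constraint, noise variance
theta = rng.uniform()                 # message point on the unit interval
grid = np.linspace(0.0, 1.0, 10001)   # discretized message space
post = np.ones_like(grid)             # uniform prior over [0, 1]

for n in range(30):
    cdf = np.cumsum(post)
    cdf /= cdf[-1]
    u_grid = np.clip(cdf, 1e-12, 1 - 1e-12)
    # Symbol each candidate message point would transmit: posterior CDF
    # pushed through the inverse Gaussian CDF, so the input law is N(0, P).
    x_grid = norm.ppf(u_grid, scale=np.sqrt(P))
    x = np.interp(theta, grid, x_grid)            # actually transmitted symbol
    y = x + rng.normal(scale=np.sqrt(N0))         # AWGN channel output
    post *= norm.pdf(y, loc=x_grid, scale=np.sqrt(N0))  # Bayes update
    post /= post.sum()

estimate = grid[np.argmax(post)]      # decoder's MAP estimate of theta
```

    Because encoder and decoder share the feedback, both can track the same posterior, so no common randomness (dither) is needed; the paper's question is when this deterministic recursion, generalized via optimal transport to higher dimensions, actually concentrates the posterior on the true message point.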