Efficient posterior sampling for high-dimensional imbalanced logistic regression

Dunson, David; Lu, Jianfeng; Sachs, Matthias; Sen, Deborshee

Authors: David Dunson, Jianfeng Lu, Matthias Sachs, Deborshee Sen
Publication date: 14 November 2019

Abstract

High-dimensional data are routinely collected in many areas. We are particularly interested in Bayesian classification models in which one or more variables are imbalanced. Current Markov chain Monte Carlo algorithms for posterior computation become inefficient as n and/or p increase, because both the time per step and the mixing rate worsen. One strategy is to use a gradient-based sampler to improve mixing while using data sub-samples to reduce per-step computational complexity. However, usual sub-sampling breaks down when applied to imbalanced data. Instead, we generalize piece-wise deterministic Markov chain Monte Carlo algorithms to include importance-weighted and mini-batch sub-sampling. These approaches maintain the correct stationary distribution with arbitrarily small sub-samples, and substantially outperform current competitors. We provide theoretical support and illustrate gains in simulated and real data applications.

Comment: 4 figures
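The importance-weighting idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' piece-wise deterministic sampler; it only shows how a sub-sampled gradient of the logistic log-likelihood stays unbiased when rows are drawn with non-uniform probabilities and reweighted by the inverse of those probabilities. The synthetic data, the class-balanced proposal, and every function name below are assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def per_row_gradients(beta, X, y):
    """Row i holds the gradient of the i-th logistic log-likelihood term."""
    return (y - sigmoid(X @ beta))[:, None] * X

def subsampled_gradient(beta, X, y, probs, batch_size, rng):
    """Unbiased estimate of the full-data gradient from batch_size rows
    drawn with replacement according to probs (hypothetical helper)."""
    idx = rng.choice(len(y), size=batch_size, p=probs)
    # Inverse-probability weights make the expectation equal the full sum.
    weights = 1.0 / (batch_size * probs[idx])
    return weights @ per_row_gradients(beta, X[idx], y[idx])

def single_draw_variance(beta, X, y, probs):
    """Exact total variance (trace of the covariance) for batch_size=1."""
    g = per_row_gradients(beta, X, y)
    second_moment = np.sum(np.sum(g**2, axis=1) / probs)
    return second_moment - np.sum(g.sum(axis=0) ** 2)

def class_balanced_probs(y):
    """Assumed proposal: half of the sampling mass goes to each class."""
    n_pos = y.sum()
    n_neg = len(y) - n_pos
    probs = np.where(y == 1, 0.5 / n_pos, 0.5 / n_neg)
    return probs / probs.sum()

# Synthetic imbalanced data: a strongly negative intercept makes y = 1 rare.
rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta = np.array([-4.0, 0.5, -0.5])
y = (rng.random(n) < sigmoid(X @ beta)).astype(float)

uniform = np.full(n, 1.0 / n)
balanced = class_balanced_probs(y)

exact = per_row_gradients(beta, X, y).sum(axis=0)
estimate = subsampled_gradient(beta, X, y, balanced, batch_size=50, rng=rng)

print("exact gradient:      ", exact)
print("one balanced estimate:", estimate)
print("single-draw variance, uniform: ", single_draw_variance(beta, X, y, uniform))
print("single-draw variance, balanced:", single_draw_variance(beta, X, y, balanced))
```

Under uniform sub-sampling most batches contain few or no minority-class rows, even though those rows carry most of the gradient here, so the estimate is noisy. A proposal that oversamples the rare class is one simple way to reduce that noise; the proposals used in the paper may differ.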
