Markov Chain Monte Carlo Sampling in SRAM for Fast Bayesian Inference
This work discusses the implementation of Markov Chain Monte Carlo (MCMC)
sampling from an arbitrary Gaussian mixture model (GMM) within SRAM. We present
a novel SRAM architecture that embeds random number generators (RNGs),
digital-to-analog converters (DACs), and analog-to-digital converters (ADCs)
so that SRAM arrays can be used for high-performance Metropolis-Hastings
(MH) algorithm-based MCMC sampling. Most of the expensive computations are
performed within the SRAM and can be parallelized for high-speed sampling. Our
iterative compute flow minimizes data movement during sampling. We characterize
the power-performance trade-off of our design through simulation in 45 nm CMOS
technology. For a two-dimensional, two-component GMM, the implementation
consumes ~91 microwatts of power per sampling iteration and produces 500
samples in 2000 clock cycles on average at a 1 GHz clock frequency. Our study
highlights how low-level hardware non-idealities can affect high-level sampling
characteristics, and recommends ways to operate SRAM optimally within
area/power constraints for high-performance sampling.

Comment: This paper has been accepted at the IEEE International Symposium on
Circuits and Systems (ISCAS) to be held in May 2020 in Seville, Spain.
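For context, the MH algorithm that the SRAM arrays accelerate can be sketched in software. The following is a minimal NumPy illustration of random-walk Metropolis-Hastings targeting a two-dimensional, two-component GMM; the mixture weights, means, covariance, and proposal scale are assumed for illustration and are not the paper's hardware parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative two-component, two-dimensional GMM (parameters assumed).
weights = np.array([0.4, 0.6])
means = np.array([[-2.0, 0.0], [2.0, 1.0]])
# Shared identity covariance for simplicity.

def gmm_density(x):
    """GMM density at a 2-vector x (up to a constant factor)."""
    d = 0.0
    for w, mu in zip(weights, means):
        diff = x - mu
        d += w * np.exp(-0.5 * diff @ diff)  # identity covariance
    return d

def metropolis_hastings(n_samples, step=1.0):
    """Random-walk MH chain targeting the GMM."""
    x = np.zeros(2)
    samples = []
    for _ in range(n_samples):
        # Symmetric Gaussian proposal, so the MH ratio is just a density ratio.
        proposal = x + step * rng.standard_normal(2)
        if rng.random() < gmm_density(proposal) / gmm_density(x):
            x = proposal
        samples.append(x.copy())
    return np.array(samples)

samples = metropolis_hastings(5000)
print(samples.mean(axis=0))
```

In the paper's design, the density evaluations and proposal draws in the loop above are the expensive steps mapped into the SRAM arrays (via the embedded RNGs, DACs, and ADCs) and parallelized.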
Beyond Application End-Point Results: Quantifying Statistical Robustness of MCMC Accelerators
Statistical machine learning often uses probabilistic algorithms, such as
Markov Chain Monte Carlo (MCMC), to solve a wide range of problems.
Probabilistic computations, often considered too slow on conventional
processors, can be accelerated with specialized hardware by exploiting
parallelism and optimizing the design using various approximation techniques.
Current methodologies for evaluating correctness of probabilistic accelerators
are often incomplete, mostly focusing only on end-point result quality
("accuracy"). It is important for hardware designers and domain experts to look
beyond end-point "accuracy" and to be aware of the impact of hardware
optimizations on other statistical properties.
This work takes a first step towards defining metrics and a methodology for
quantitatively evaluating correctness of probabilistic accelerators beyond
end-point result quality. We propose three pillars of statistical robustness:
1) sampling quality, 2) convergence diagnostic, and 3) goodness of fit. We
apply our framework to a representative MCMC accelerator and surface design
issues that cannot be exposed using only application end-point result quality.
Applying the framework to guide design space exploration shows that statistical
robustness comparable to floating-point software can be achieved by slightly
increasing the bit representation, without requiring floating-point hardware.
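As a concrete illustration of the convergence-diagnostic pillar, the classic Gelman-Rubin potential scale reduction factor (R-hat) compares between-chain and within-chain variance across multiple chains. The sketch below uses synthetic Gaussian chains as stand-ins for accelerator output; it is not the paper's methodology, just one standard diagnostic of the kind the framework covers:

```python
import numpy as np

def gelman_rubin(chains):
    """Gelman-Rubin R-hat for an (m, n) array of m chains, n samples each.

    Values near 1.0 suggest the chains agree on a common distribution;
    values well above 1.0 indicate a convergence problem.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    B = n * chain_means.var(ddof=1)          # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()    # within-chain variance
    var_hat = (n - 1) / n * W + B / n        # pooled variance estimate
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(1)
# Synthetic stand-ins: four well-mixed chains from one distribution...
converged = rng.normal(0.0, 1.0, size=(4, 1000))
# ...versus a set in which one chain is stuck in a different region.
stuck = converged.copy()
stuck[0] += 5.0

print(gelman_rubin(converged))  # close to 1
print(gelman_rubin(stuck))      # well above 1
```

A diagnostic like this can expose an accelerator whose chains look fine on end-point accuracy but fail to mix, which is exactly the kind of issue end-point metrics alone cannot surface.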