2 research outputs found
Efficiently Extracting Randomness from Imperfect Stochastic Processes
We study the problem of extracting a prescribed number of random bits by
reading the smallest possible number of symbols from non-ideal stochastic
processes. The related interval algorithm proposed by Han and Hoshi has
asymptotically optimal performance; however, it assumes that the distribution
of the input stochastic process is known. The motivation for our work is the
fact that, in practice, sources of randomness have inherent correlations and
are affected by measurement's noise. Namely, it is hard to obtain an accurate
estimation of the distribution. This challenge was addressed by the concepts of
seeded and seedless extractors that can handle general random sources with
unknown distributions. However, known seeded and seedless extractors provide
extraction efficiencies that are substantially smaller than Shannon's entropy
limit. Our main contribution is the design of extractors that have a variable
input-length and a fixed output length, are efficient in the consumption of
symbols from the source, are capable of generating random bits from general
stochastic processes and approach the information theoretic upper bound on
efficiency.Comment: 2 columns, 16 page
Streaming Algorithms for Optimal Generation of Random Bits 1
Generating random bits from a source of biased coins (the biased is unknown) is a classical question that was originally studied by von Neumann. There are a number of known algorithms that have asymptotically optimal information efficiency, namely, the expected number of generated random bits per input bit is asymptotically close to the entropy of the source. However, only the original von Neumann algorithm has a ‘streaming property ’- it operates on a single input bit at a time and it generates random bits when possible, alas, it does not have an optimal information efficiency. The main contribution of this paper is an algorithm that generates random bit streams from biased coins, uses bounded space and runs in expected linear time. As the size of the allotted space increases, the algorithm approaches the information-theoretic upper bound on efficiency. In addition, we present a universal scheme for transforming an arbitrary algorithm for binary sources to manage the general source of an m-sided dice, hence, enabling the application of existing algorithms to general sources. We also consider extensions of our algorithm to correlated sources that are based on Markov chains. I