Device Independent Random Number Generation
Randomness is an invaluable resource in modern life, with uses ranging from
numerical simulations through randomized algorithms to cryptography. However,
no true randomness is available at the classical level, and even simple
quantum devices used in a prepare-and-measure setting suffer from a lack of
stability and controllability. This gave rise to a family of quantum
protocols that provide randomness certified by classical statistical tests --
Device Independent Quantum Random Number Generators. In this paper we review
the most relevant results in this field, which allow the production of almost
perfect randomness with the help of quantum devices, supplemented with an
arbitrarily weak source of additional randomness. This is in fact the best
one could hope to achieve, as with no starting randomness (corresponding, in
a different framing, to the absence of free will) even a quantum world would
admit a fully deterministic description.
Comment: 64 pages, 27 figures
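To make the certification idea concrete, here is a minimal sketch (ours, not
taken from the paper) of the standard CHSH-based analysis: estimate the Bell
value S from the measurement statistics and convert any violation of the
classical bound S <= 2 into a per-bit min-entropy guarantee, using the bound
of Pironio et al. (Nature 464, 1021, 2010). The function names and the
layout of the counts data are illustrative assumptions.

```python
import math

def chsh_value(counts):
    """Estimate the CHSH Bell value S from measurement statistics.

    counts[(x, y)][(a, b)] is the number of rounds with setting pair
    (x, y) in {0,1}^2 and outcomes a, b in {+1, -1}.
    """
    def correlator(x, y):
        c = counts[(x, y)]
        total = sum(c.values())
        return sum(a * b * n for (a, b), n in c.items()) / total

    return (correlator(0, 0) + correlator(0, 1)
            + correlator(1, 0) - correlator(1, 1))

def min_entropy_per_bit(S):
    """Bound of Pironio et al.: each output bit carries min-entropy
    at least 1 - log2(1 + sqrt(2 - S^2/4)) when S exceeds the
    classical bound of 2."""
    if S <= 2:
        return 0.0  # no Bell violation, nothing is certified
    return 1 - math.log2(1 + math.sqrt(max(0.0, 2 - S ** 2 / 4)))

print(min_entropy_per_bit(2 * math.sqrt(2)))  # 1.0 at the maximal quantum violation
```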
Quantum Random Number Generators
Random numbers are a fundamental resource in science and engineering with
important applications in simulation and cryptography. The inherent randomness
at the core of quantum mechanics makes quantum systems a perfect source of
entropy. Quantum random number generation is one of the most mature quantum
technologies with many alternative generation methods. We discuss the different
technologies in quantum random number generation from the early devices based
on radioactive decay to the multiple ways to use the quantum states of light to
gather entropy from a quantum origin. We also discuss randomness extraction and
amplification and the notable possibility of generating trusted random numbers
even with untrusted hardware using device independent generation protocols.
Comment: Review paper on Quantum Random Number Generators. Second version.
Errors corrected. Expanded sections on entropy estimation, randomness
extraction, and quantum randomness expansion and amplification. Comments
welcome.
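As a concrete instance of the extraction step mentioned above, the classical
von Neumann debiasing procedure is the simplest case. This is a sketch of
the textbook method, not code from the paper; practical QRNGs typically use
seeded extractors such as Toeplitz hashing instead.

```python
import random

def von_neumann_extract(bits):
    """Von Neumann debiasing: scan non-overlapping pairs of raw bits;
    emit 0 for (0,1), 1 for (1,0), and drop (0,0) and (1,1).
    The output is unbiased whenever the raw bits are i.i.d."""
    return [b0 for b0, b1 in zip(bits[::2], bits[1::2]) if b0 != b1]

raw = [int(random.random() < 0.8) for _ in range(10_000)]  # biased source
clean = von_neumann_extract(raw)
print(len(clean), sum(clean) / len(clean))  # ~1600 bits, mean ~0.5
```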
Universal Codes as a Basis for Time Series Testing
We suggest a new approach to hypothesis testing for ergodic and stationary
processes. In contrast to standard methods, the suggested approach makes it
possible to build tests based on any lossless data compression method, even
if the distribution law of the codeword lengths is not known. We apply this
approach to the following four problems: goodness-of-fit testing (or identity
testing), testing for independence, testing of serial independence, and
homogeneity testing, and suggest nonparametric statistical tests for these
problems. It is important to note that data compressors used in practice
(so-called archivers) can be used for the suggested tests.
Comment: Accepted for "Statistical Methodology" (Elsevier)
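A minimal sketch of how such a compression-based test can look in practice
(our illustration, resting on the standard Kraft-inequality argument: under
the uniform null hypothesis, any lossless code saves t bits with probability
at most 2^-t, so the savings achieved by an off-the-shelf archiver yield a
valid p-value):

```python
import os
import zlib

def identity_test_pvalue(data: bytes) -> float:
    """Goodness-of-fit test for H0: bytes i.i.d. uniform on {0,...,255}.
    Under H0 any lossless code saves t bits with probability <= 2**-t,
    so the compressor's savings directly bound the p-value."""
    saved = 8 * len(data) - 8 * len(zlib.compress(data, 9))
    return 1.0 if saved <= 0 else min(1.0, 2.0 ** (-saved))

text = b"the quick brown fox jumps over the lazy dog " * 200
print(identity_test_pvalue(text))              # ~0: uniformity rejected
print(identity_test_pvalue(os.urandom(8000)))  # 1.0: consistent with H0
```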
A Practical Approach to Lossy Joint Source-Channel Coding
This work is devoted to practical joint source-channel coding. Although the
proposed approach has more general scope, for the sake of clarity we focus on
a specific application example, namely, the transmission of digital images
over noisy binary-input output-symmetric channels. The basic building blocks
of most state-of-the-art source coders are: 1) a linear transformation; 2)
scalar quantization of the transform coefficients; 3) probability modeling of
the sequence of quantization indices; 4) an entropy coding stage. We identify
the weakness of the conventional separated source-channel coding approach in
the catastrophic behavior of the entropy coding stage. Hence, we replace this
stage with linear coding, which maps the sequence of redundant quantizer
output symbols directly into a channel codeword. We show that this approach
entails no loss of optimality in the asymptotic regime of large block length.
However, in the practical regime of finite block length and low decoding
complexity, our approach yields very significant improvements. Furthermore,
our scheme retains the transform, quantization, and probability modeling of
current state-of-the-art source coders, which are carefully matched to the
features of specific classes of sources. In our working example, we make use
of the ``bit-plane'' and ``context'' models defined by the JPEG2000 standard
and re-interpret the underlying probability model as a sequence of
conditionally Markov sources. The Markov structure allows us to derive a
simple successive coding and decoding scheme, where the latter is based on
iterative Belief Propagation. We provide a construction example of the
proposed scheme based on punctured Turbo Codes and demonstrate the gain over
a conventional separated scheme by running extensive numerical experiments on
test images.
Comment: 51 pages, submitted to IEEE Transactions on Information Theory
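The catastrophic behavior the abstract refers to is easy to demonstrate with
any off-the-shelf entropy coder (our illustration, not the paper's scheme):
a single channel bit error inside the compressed stream typically makes
everything after it undecodable.

```python
import zlib

# One channel bit error inside an entropy-coded stream: decoding
# typically fails completely, illustrating the catastrophic behavior
# of the entropy coding stage in a separated scheme.
payload = b"quantized transform coefficients " * 100
corrupted = bytearray(zlib.compress(payload))
corrupted[len(corrupted) // 2] ^= 0x01  # flip one bit mid-stream

try:
    zlib.decompress(bytes(corrupted))
except zlib.error as err:
    print("entropy decoder failed outright:", err)
```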
Pseudorandomness and Combinatorial Constructions
In combinatorics, the probabilistic method is a very powerful tool to prove
the existence of combinatorial objects with interesting and useful properties.
Explicit constructions of objects with such properties are often very
difficult, or unknown. In computer science, probabilistic algorithms are
sometimes simpler and more efficient than the best known deterministic
algorithms for the same problem.
Despite this evidence for the power of random choices, the computational
theory of pseudorandomness shows that, under certain complexity-theoretic
assumptions, every probabilistic algorithm has an efficient deterministic
simulation, and a large class of applications of the probabilistic method
can be converted into explicit constructions.
In this survey paper we describe connections between the conditional
``derandomization'' results of the computational theory of pseudorandomness
and unconditional explicit constructions of certain combinatorial objects,
such as error-correcting codes and ``randomness extractors.''
Comment: Submitted to the Proceedings of ICM'06, Madrid
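A classic toy instance of turning a probabilistic-method argument into a
deterministic construction (our example, using the method of conditional
expectations rather than the pseudorandomness machinery the survey covers):
a uniformly random cut of a graph cuts half the edges in expectation, and
fixing vertices one at a time achieves that bound deterministically.

```python
def derandomized_maxcut(edges, n):
    """Method of conditional expectations: a uniformly random cut cuts
    |E|/2 edges in expectation; placing each vertex on the side that
    cuts more of the already-decided edges keeps the conditional
    expectation from dropping, so the final deterministic cut has
    size at least len(edges) / 2."""
    side = {}
    for v in range(n):
        cut_if_0 = sum(1 for a, b in edges
                       if (a == v and side.get(b) == 1)
                       or (b == v and side.get(a) == 1))
        cut_if_1 = sum(1 for a, b in edges
                       if (a == v and side.get(b) == 0)
                       or (b == v and side.get(a) == 0))
        side[v] = 0 if cut_if_0 >= cut_if_1 else 1
    return sum(1 for a, b in edges if side[a] != side[b])

# Triangle: a random cut cuts 1.5 edges on average; we get 2 >= 1.5.
print(derandomized_maxcut([(0, 1), (1, 2), (0, 2)], 3))
```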
Data Smashing
Investigation of the underlying physics or biology from empirical data
requires a quantifiable notion of similarity - when do two observed data sets
indicate nearly identical generating processes, and when do they not? The
discriminating characteristics to look for in data are often determined by
heuristics designed by experts, e.g., distinct shapes of "folded" lightcurves
may be used as "features" to classify variable stars, while determination of
pathological brain states might require a Fourier analysis of brainwave
activity. Finding good features is non-trivial. Here, we propose a universal
solution to this problem: we delineate a principle for quantifying similarity
between sources of arbitrary data streams, without a priori knowledge,
features, or training. We uncover an algebraic structure on a space of
symbolic models for quantized data, and show that such stochastic generators
may be added and uniquely inverted, and that a model and its inverse always
sum to the generator of flat white noise. Therefore, every data stream has an
anti-stream: data generated by the inverse model. Similarity between two
streams, then, is the degree to which one, when summed with the other's
anti-stream, annihilates all statistical structure to noise. We call this
data smashing. We present diverse applications, including disambiguation of
brainwaves pertaining to epileptic seizures, detection of anomalous cardiac
rhythms, and classification of astronomical objects from raw photometry. In
our examples, the data smashing principle, without access to any domain
knowledge, meets or exceeds the performance of specialized algorithms tuned
by domain experts.
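Schematically, the algebra described above can be summarized as follows (the
notation is ours, not the paper's): if a stream s is generated by model G and
its anti-stream by the inverse model G^{-1}, then

```latex
% Schematic of the stream algebra (notation ours, not the paper's):
% a model G and its inverse sum to the flat-white-noise generator W,
% and similarity is the degree of statistical annihilation.
G \oplus G^{-1} = W,
\qquad
\operatorname{sim}(s_1, s_2) \;\propto\;
  \text{closeness of } s_1 \oplus \operatorname{anti}(s_2)
  \text{ to realizations of } W .
```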
Computing Entropy Rate Of Symbol Sources & A Distribution-free Limit Theorem
The entropy rate of a sequential data stream naturally quantifies the
complexity of the generative process. Thus entropy rate fluctuations could be
used as a tool to recognize dynamical perturbations in signal sources,
potentially without explicit characterization of the background noise.
However, state-of-the-art algorithms to estimate the entropy rate have
markedly slow convergence, making such entropic approaches non-viable in
practice. We present here a fundamentally new approach to estimating entropy
rates, which is demonstrated to converge significantly faster in terms of
input data length, and is shown to be effective in diverse applications
ranging from the estimation of the entropy rate of English texts to the
estimation of the complexity of chaotic dynamical systems. Additionally, the
convergence rate of entropy estimates does not follow from any standard limit
theorem, and reported algorithms fail to provide any confidence bounds on the
computed values. Exploiting a connection to the theory of probabilistic
automata, we establish a
convergence rate of $O(\log|s| / \sqrt[3]{|s|})$ as a function of the input
length $|s|$, which then yields explicit uncertainty estimates, as well as
the data lengths required to satisfy pre-specified confidence bounds.
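For contrast, here is a sketch of the kind of slow-converging baseline the
abstract alludes to: one common variant of the nonparametric match-length
estimator of Kontoyiannis et al. (1998). The implementation details are
ours, not the paper's.

```python
import math
import random

def entropy_rate_estimate(s: str) -> float:
    """Match-length entropy rate estimator (Kontoyiannis et al. 1998,
    one common variant): L_i = 1 + length of the longest prefix of
    s[i:] that occurs in s[:i]; the average of L_i / log2(i+1)
    converges to 1/H, but only slowly."""
    n = len(s)
    total = 0.0
    for i in range(1, n):
        l = 0
        while i + l < n and s[i:i + l + 1] in s[:i]:
            l += 1
        total += (l + 1) / math.log2(i + 1)
    return (n - 1) / total  # estimated bits per symbol

bits = ''.join(random.choice('01') for _ in range(2000))
print(entropy_rate_estimate(bits))  # drifts toward 1.0, the true rate
```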
Optimal algorithms for universal random number generation from finite memory sources
We study random number generators (RNGs), both in the fixed-to-variable-length
(FVR) and the variable-to-fixed-length (VFR) regimes, in a universal setting
in which the input is a finite memory source of arbitrary order and unknown
parameters, with arbitrary (finite) input and output alphabet sizes. Applying
the method of types, we characterize essentially unique optimal universal
RNGs that maximize the expected output length (respectively, minimize the
expected input length) in the FVR (respectively, VFR) case. For the FVR case,
the RNG studied is a generalization of Elias's scheme, while in the VFR case
the general scheme is new. We precisely characterize, up to an additive
constant, the corresponding expected lengths, which include second-order
terms similar to those encountered in universal data compression and
universal simulation. Furthermore, in the FVR case, we also consider a
"twice-universal" setting, in which the Markov order k of the input source is
also unknown.
Comment: To appear in IEEE Transactions on Information Theory
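A sketch of Elias's original scheme for the simplest case the paper
generalizes (i.i.d. binary input; the implementation is our illustration):
condition on the type of the sequence, rank it within its type class, and
convert the uniform rank into exactly unbiased bits.

```python
from math import comb

def elias_extract(bits):
    """Elias's scheme for i.i.d. bits of unknown bias: the rank of
    the sequence within its type class (sequences with the same
    number of ones) is uniform on {0,...,C(n,k)-1}, and a uniform
    index is converted to exactly unbiased output bits by splitting
    the class size into powers of two.  With n = 2 this reduces to
    von Neumann's trick."""
    n, k = len(bits), sum(bits)
    rank, ones_left = 0, k
    for i, b in enumerate(bits):  # lexicographic rank (combinadics)
        if b:
            rank += comb(n - i - 1, ones_left)
            ones_left -= 1
    out, M, r = [], comb(n, k), rank
    while M > 1:
        p = 1 << (M.bit_length() - 1)  # largest power of two <= M
        if r < p:  # r falls in a dyadic block of size p = 2**a
            a = p.bit_length() - 1
            out = [(r >> j) & 1 for j in range(a - 1, -1, -1)]
            break
        r, M = r - p, M - p
    return out

print(elias_extract([1, 0, 0, 1, 1, 0, 1, 0]))  # [1, 0, 1, 0, 1, 1]
```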
Fundamentals of Computing
These are notes for the course CS-172, which I first taught in Fall 1986 at
UC Berkeley and subsequently at Boston University. The goal was to introduce
undergraduates to basic concepts of the Theory of Computation and to provoke
their interest in further study. Model-dependent effects were systematically
ignored. Concrete computational problems were considered only as
illustrations of general principles. The notes are skeletal: they do have
(terse) proofs, but exercises, references, intuitive comments, and examples
are missing or inadequate. The notes can be used for designing a course, or
by students who want to refresh material they already know, or who are bright
and have access to an instructor for questions. Each subsection takes about a
week of the course.
Comment: 22 pages; extended
Proceedings of Workshop AEW10: Concepts in Information Theory and Communications
The 10th Asia-Europe Workshop on "Concepts in Information Theory and
Communications" (AEW10) was held in Boppard, Germany on June 21-23, 2017. It
is based on a longstanding cooperation between Asian and European scientists.
The first workshop was held in Eindhoven, the Netherlands in 1989. The idea
of the workshop is threefold: 1) to improve the communication between
scientists in different parts of the world; 2) to exchange knowledge and
ideas; and 3) to pay tribute to a well-respected and special scientist.
Comment: 44 pages; editors of the proceedings: Yanling Chen and A. J. Han
Vinck