Discrete MDL Predicts in Total Variation
The Minimum Description Length (MDL) principle selects the model that has the
shortest code for data plus model. We show that for a countable class of
models, MDL predictions are close to the true distribution in a strong sense.
The result is completely general: no independence, ergodicity, stationarity,
identifiability, or other assumption on the model class needs to be made. More
formally, we show that for any countable class of models, the distributions
selected by MDL (or MAP) asymptotically predict (merge with) the true measure
in the class in total variation distance. Implications for non-i.i.d. domains
like time-series forecasting, discriminative learning, and reinforcement
learning are discussed.
Comment: 15 LaTeX pages
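The two-part selection rule at the heart of the abstract can be sketched in a few lines: among a countable class of candidate distributions, MDL picks the one minimizing the code length of the model plus the code length of the data under that model. A minimal illustration over a hypothetical Bernoulli grid (our toy class, not from the paper):

```python
import math

def mdl_select(data, models):
    """Return the index of the model minimizing the two-part code
    length L(model) + L(data | model), measured in bits.
    `models` is a list of (model_code_length, prob_fn) pairs, where
    prob_fn(x) is the probability the model assigns to symbol x."""
    best_idx, best_len = None, float("inf")
    for idx, (l_model, prob) in enumerate(models):
        # -log2 likelihood = ideal code length of the data under the model
        l_data = -sum(math.log2(prob(x)) for x in data)
        if l_model + l_data < best_len:
            best_idx, best_len = idx, l_model + l_data
    return best_idx, best_len

# Countable (here: finite) class of Bernoulli models with bias k/10;
# each model is named with log2(9) bits (a uniform code over the class).
models = [(math.log2(9),
           (lambda t: (lambda x: t if x == 1 else 1 - t))(k / 10))
          for k in range(1, 10)]
data = [1, 1, 0, 1, 1, 1, 0, 1]          # 6 ones out of 8 observations
idx, code_len = mdl_select(data, models)  # selects bias 0.7 (index 6)
```

Since all models here get the same model code length, the likelihood term decides; with unequal model code lengths the same function trades fit against model complexity.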
Offline to Online Conversion
We consider the problem of converting offline estimators into an online
predictor or estimator with small extra regret. Formally this is the problem of
merging a collection of probability measures over strings of length 1,2,3,...
into a single probability measure over infinite sequences. We describe various
approaches and their pros and cons on various examples. As a side result, we
give an elementary, non-heuristic, purely combinatorial derivation of Turing's
famous estimator. Our main technical contribution is to determine the
computational complexity of online estimators with good guarantees in general.
Comment: 20 LaTeX pages
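Turing's estimator mentioned above is, in its simplest form, the Good–Turing rule: the total probability of symbols not yet seen is estimated by N1/N, the fraction of the sample made up of symbols observed exactly once. A small sketch (the function and variable names are ours):

```python
from collections import Counter

def turing_unseen_mass(sample):
    """Turing's estimate of the total probability of unseen symbols:
    N1 / N, where N1 counts symbols seen exactly once and N is the
    sample size."""
    counts = Counter(sample)
    n1 = sum(1 for c in counts.values() if c == 1)
    return n1 / len(sample)

# 'c' and 'd' each occur exactly once in the 11-letter sample, so the
# estimated probability mass of unseen letters is 2/11.
mass = turing_unseen_mass(list("abracadabra"))
```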
Indefinitely Oscillating Martingales
We construct a class of nonnegative martingale processes that oscillate
indefinitely with high probability. For these processes, we state a uniform
rate of the number of oscillations and show that this rate is asymptotically
close to the theoretical upper bound. These bounds on probability and
expectation of the number of upcrossings are compared to classical bounds from
the martingale literature. We discuss two applications. First, our results
imply that the limit of the minimum description length operator may not exist.
Second, we give bounds on how often one can change one's belief in a given
hypothesis when observing a stream of data.
Comment: ALT 2014, extended technical report
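Counting upcrossings — the quantity bounded by Doob's classical upcrossing inequality and by the paper's results — can be made concrete with a short sketch. The sample path below is hand-crafted for illustration, not generated by the paper's construction:

```python
def count_upcrossings(path, a, b):
    """Count completed upcrossings of the interval [a, b]: each time
    the path moves from a value <= a up to a value >= b."""
    ups = 0
    below = path[0] <= a
    for x in path[1:]:
        if below and x >= b:
            ups += 1          # completed one upcrossing of [a, b]
            below = False
        elif not below and x <= a:
            below = True      # back below a; the next rise to b counts
    return ups

path = [1.0, 0.5, 2.0, 0.5, 2.0, 1.0]   # oscillates across [0.5, 2.0]
ups = count_upcrossings(path, 0.5, 2.0)  # 2 completed upcrossings
```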
Applying MDL to Learning Best Model Granularity
The Minimum Description Length (MDL) principle is solidly based on a provably
ideal method of inference using Kolmogorov complexity. We test how the theory
behaves in practice on a general problem in model selection: that of learning
the best model granularity. The performance of a model depends critically on
the granularity, for example the choice of precision of the parameters. Too
high a precision generally means modeling accidental noise, while too low a
precision may conflate models that should be distinguished. This
precision is often determined ad hoc. In MDL the best model is the one that
most compresses a two-part code of the data set: this embodies ``Occam's
Razor.'' In two quite different experimental settings the theoretical value
determined using MDL coincides with the best value found experimentally. In the
first experiment the task is to recognize isolated handwritten characters in
one subject's handwriting, irrespective of size and orientation. Based on a new
modification of elastic matching using multiple prototypes per character, the
optimal prediction rate is predicted for the value of the learned parameter
(the length of the sampling interval) that MDL considers most likely; this is
shown to coincide with the best value found experimentally. In the second
experiment the task is
to model a robot arm with two degrees of freedom using a three layer
feed-forward neural network where we need to determine the number of nodes in
the hidden layer that gives the best modeling performance. The optimal model
(the one that extrapolates best on unseen examples) is predicted for the number
of hidden nodes that MDL considers most likely, which again is found to
coincide with the best value found experimentally.
Comment: LaTeX, 32 pages, 5 figures. Artificial Intelligence journal, to appear
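The granularity trade-off can be illustrated with a toy two-part code (our construction, not the paper's experiments): quantizing a Bernoulli parameter to a finer grid costs more bits to name a grid point but fits the data better, and MDL picks the granularity whose total code length is smallest.

```python
import math

def two_part_code_length(data, grid_size):
    """Two-part MDL code length (bits) for binary data with the
    Bernoulli parameter quantized to `grid_size` interior grid points:
    log2(grid_size) bits to name a grid point, plus the -log2
    likelihood of the data at the best grid point."""
    n, k = len(data), sum(data)
    best = float("inf")
    for i in range(1, grid_size + 1):
        theta = i / (grid_size + 1)
        l_data = -(k * math.log2(theta) + (n - k) * math.log2(1 - theta))
        best = min(best, l_data)
    return math.log2(grid_size) + best

data = [1] * 18 + [0] * 2   # strongly biased sample
lengths = {g: two_part_code_length(data, g) for g in (1, 3, 15, 255)}
best_grid = min(lengths, key=lengths.get)   # an intermediate grid wins
```

The coarsest grid underfits and the finest grid pays too much for the parameter, so the minimum total code length falls at an intermediate granularity, mirroring the precision trade-off described in the abstract.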
On noise processes and limits of performance in biosensors
In this paper, we present a comprehensive stochastic model describing the measurement uncertainty, output signal, and limits of detection of affinity-based biosensors. The biochemical events within the biosensor platform are modeled by a Markov stochastic process describing both the probabilistic mass transfer and the interactions of analytes with the capturing probes. To generalize this model and incorporate the detection process, we add noisy signal transduction and amplification stages to the Markov model. Using this approach, we are able to evaluate not only the output signal and the statistics of its fluctuation but also the noise contribution of each stage within the biosensor platform. Furthermore, we apply our formulations to define the signal-to-noise ratio, noise figure, and detection dynamic range of affinity-based biosensors. Motivated by the platforms encountered in practice, we construct the noise model of a number of widely used systems. The results of this study show that our formulations predict the behavioral characteristics of affinity-based biosensors, which indicates the validity of the model.
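As a toy illustration of how such noise stages combine (our simplification, not the paper's full Markov model): if the number of captured analytes is treated as Poisson, its variance equals its mean, and an independent transduction/amplification noise variance adds to it, giving a simple SNR expression.

```python
import math

def snr_db(n_bound, sigma_transduction):
    """Illustrative SNR (dB) for an affinity-based sensor: Poisson
    capture noise (variance equal to the mean number of bound
    analytes) plus an independent transduction-noise variance."""
    noise_rms = math.sqrt(n_bound + sigma_transduction ** 2)
    return 20 * math.log10(n_bound / noise_rms)

shot_limited = snr_db(10_000, 0.0)   # 40.0 dB in the shot-noise limit
with_amp = snr_db(10_000, 50.0)      # transduction noise lowers the SNR
```

In the shot-noise limit the SNR grows as the square root of the number of bound analytes; adding a transduction stage can only reduce it, which is the intuition behind the noise-figure definition mentioned in the abstract.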