Predicting a binary sequence almost as well as the optimal biased coin
We apply the exponential weights algorithm, introduced by Littlestone and Warmuth [26] and by Vovk [35], to the problem of predicting a binary sequence almost as well as the best biased coin. We first show that, for the logarithmic loss, the derived algorithm is equivalent to the Bayes algorithm with Jeffreys' prior, which was studied by Xie and Barron [38] under probabilistic assumptions. We derive a uniform bound on the regret which holds for any sequence. We also show that if the empirical distribution of the sequence is bounded away from 0 and from 1, then, as the length of the sequence increases to infinity, the difference between this bound and a corresponding bound on the average-case regret of the same algorithm (which is asymptotically optimal in that case) is only 1/2. We show that this gap of 1/2 is necessary by calculating the regret of the min–max optimal algorithm for this problem and showing that the asymptotic upper bound is tight. We also study the application of this algorithm to the square loss and show that the algorithm derived in this case differs from the Bayes algorithm and outperforms it for worst-case prediction.
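For the logarithmic loss, the Bayes algorithm with Jeffreys' prior referred to above has a simple closed form, the add-1/2 (Krichevsky-Trofimov) rule. A minimal sketch in Python, with illustrative names (the regret computation against the best fixed coin is part of this sketch, not code from the paper):

    import math

    def kt_predict(ones, n):
        # Posterior predictive under the Jeffreys Beta(1/2, 1/2) prior:
        # P(next bit = 1 | k ones seen in n bits) = (k + 1/2) / (n + 1).
        return (ones + 0.5) / (n + 1.0)

    def regret_vs_best_coin(bits):
        # Cumulative log loss of the predictor minus the log loss of the
        # best biased coin chosen in hindsight (the empirical frequency).
        loss, ones = 0.0, 0
        for n, b in enumerate(bits):
            p1 = kt_predict(ones, n)
            loss -= math.log(p1 if b else 1.0 - p1)
            ones += b
        n, k = len(bits), ones
        if k in (0, n):          # a deterministic coin suffers zero loss
            return loss
        q = k / n
        return loss + k * math.log(q) + (n - k) * math.log(1.0 - q)

On every sequence of length n this regret grows only logarithmically in n, which is the kind of uniform bound the abstract derives.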
Issue Framing in Online Discussion Fora
In online discussion fora, speakers often make arguments for or against
something, say birth control, by highlighting certain aspects of the topic. In
social science, this is referred to as issue framing. In this paper, we
introduce a new issue frame annotated corpus of online discussions. We explore
to what extent models trained to detect issue frames in newswire and social
media can be transferred to the domain of discussion fora, using a combination
of multi-task and adversarial training, assuming only unlabeled training data
in the target domain.

Comment: To appear in NAACL-HLT 201
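The adversarial part of such multi-task training is commonly realized with a gradient-reversal layer: the encoder is trained to fool a domain classifier that tries to tell source from target domain. A hedged sketch in Python/PyTorch; the architecture and dimensions are assumptions for illustration, not the paper's model:

    import torch
    from torch import nn
    from torch.autograd import Function

    class GradReverse(Function):
        # Identity on the forward pass; flips (and scales) gradients on
        # the backward pass, so the encoder maximizes the domain loss.
        @staticmethod
        def forward(ctx, x, lam):
            ctx.lam = lam
            return x.view_as(x)

        @staticmethod
        def backward(ctx, grad_output):
            return -ctx.lam * grad_output, None

    class FrameTransferModel(nn.Module):
        def __init__(self, vocab=10000, dim=128, n_frames=15, n_domains=2):
            super().__init__()
            self.encoder = nn.EmbeddingBag(vocab, dim)   # stand-in encoder
            self.frame_head = nn.Linear(dim, n_frames)   # supervised task
            self.domain_head = nn.Linear(dim, n_domains) # adversary

        def forward(self, tokens, lam=1.0):
            h = self.encoder(tokens)
            return self.frame_head(h), self.domain_head(GradReverse.apply(h, lam))

During training, a supervised frame loss on labeled newswire and social-media batches is summed with a domain loss on unlabeled forum batches; the reversed gradient pushes the encoder toward domain-invariant features.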
Absolutely No Free Lunches!
This paper is concerned with learners who aim to learn patterns in infinite
binary sequences: shown longer and longer initial segments of a binary
sequence, they either attempt to predict whether the next bit will be a 0 or
will be a 1 or they issue forecast probabilities for these events. Several
variants of this problem are considered. In each case, a no-free-lunch result
of the following form is established: the problem of learning is a formidably
difficult one, in that no matter what method is pursued, failure is
incomparably more common than success; and difficult choices must be faced in
choosing a method of learning, since no approach dominates all others in its
range of success. In the simplest case, the comparison of the set of situations
in which a method fails and the set of situations in which it succeeds is a
matter of cardinality (countable vs. uncountable); in other cases, it is a
topological matter (meagre vs. co-meagre) or a hybrid computational-topological
matter (effectively meagre vs. effectively co-meagre).
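In the simplest case, the cardinality gap follows from a standard diagonal construction (a sketch consistent with, though not quoted from, the paper): against any deterministic next-bit predictor f, the sequence

\[
b_{n+1} \;=\; 1 - f(b_1, \dots, b_n), \qquad n \geq 0,
\]

is mispredicted at every position. Conversely, any sequence on which f is eventually always correct is determined by f together with one of countably many finite prefixes, so f succeeds on at most countably many of the uncountably many binary sequences.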
Occam's Quantum Strop: Synchronizing and Compressing Classical Cryptic Processes via a Quantum Channel
A stochastic process's statistical complexity stands out as a fundamental
property: the minimum information required to synchronize one process generator
to another. How much information is required, though, when synchronizing over a
quantum channel? Recent work demonstrated that representing causal similarity
as quantum state-indistinguishability provides a quantum advantage. We
generalize this to synchronization and offer a sequence of constructions that
exploit extended causal structures, finding a substantial increase in the quantum
advantage. We demonstrate that maximum compression is determined by the
process's cryptic order---a classical, topological property closely allied to
Markov order, itself a measure of historical dependence. We introduce an
efficient algorithm that computes the quantum advantage, and close by noting that
the advantage comes at a cost---one trades off prediction for generation
complexity.

Comment: 10 pages, 6 figures;
http://csc.ucdavis.edu/~cmg/compmech/pubs/oqs.ht
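For orientation, in standard computational-mechanics notation (assumed here, not taken from the abstract): the statistical complexity C_mu is the Shannon entropy of the causal-state distribution, while the quantum constructions assign each causal state a non-orthogonal pure "signal state" and pay only the von Neumann entropy of the resulting ensemble:

\[
C_\mu \;=\; -\sum_{\sigma} \Pr(\sigma)\,\log_2 \Pr(\sigma),
\qquad
C_q \;=\; -\operatorname{Tr}\bigl(\rho \log_2 \rho\bigr),
\quad
\rho \;=\; \sum_{\sigma} \Pr(\sigma)\,|\eta_\sigma\rangle\langle\eta_\sigma| .
\]

Because the signal states overlap, C_q <= C_mu; the quantum advantage is the gap C_mu - C_q, which the abstract ties to the process's cryptic order.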
Estimating the Algorithmic Complexity of Stock Markets
Randomness and regularities in Finance are usually treated in probabilistic
terms. In this paper, we develop a completely different approach, using a
non-probabilistic framework based on the algorithmic information theory
initially developed by Kolmogorov (1965). We present some elements of this
theory and show why it is particularly relevant to Finance, and potentially to
other sub-fields of Economics as well. We develop a generic method to estimate
the Kolmogorov complexity of numeric series. This approach is based on an
iterative "regularity erasing procedure" implemented to use lossless
compression algorithms on financial data. Examples are provided with both
simulated and real-world financial time series. The contributions of this
article are twofold. The first is methodological: we show that some
structural regularities, invisible with classical statistical tests, can be
detected by this algorithmic method. The second one consists in illustrations
on the daily Dow-Jones Index suggesting that beyond several well-known
regularities, hidden structure in this index may remain to be identified.
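The estimation idea can be illustrated in a few lines (a hedged sketch, not the authors' exact procedure; the choice of preprocessing, quantization, and compressors is an assumption of this example):

    import bz2
    import zlib

    import numpy as np

    def complexity_estimate(series, digits=4):
        # Upper-bound the Kolmogorov complexity of a numeric series by its
        # best compressed size, after one "regularity erasing" step
        # (first differences, i.e. price changes) and coarse quantization.
        x = np.asarray(series, dtype=float)
        increments = np.round(np.diff(x), digits)   # erase the trend, discretize
        raw = increments.tobytes()
        best = min(len(zlib.compress(raw, 9)), len(bz2.compress(raw, 9)))
        return best / len(raw)                      # ~1.0 means incompressible

A ratio well below that of shuffled surrogate data signals structure that a lossless compressor can exploit, even when classical statistical tests see nothing.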
Inference by Believers in the Law of Small Numbers
Many people believe in the "Law of Small Numbers," exaggerating the degree to which a small sample resembles the population from which it is drawn. To model this, I assume that a person exaggerates the likelihood that a short sequence of i.i.d. signals resembles the long-run rate at which those signals are generated. Such a person believes in the "gambler's fallacy," thinking that early draws of one signal increase the odds that subsequent draws will be of other signals. When uncertain about the rate, the person over-infers from short sequences of signals and is prone to think the rate is more extreme than it is. When the person makes inferences about the frequency at which rates are generated by different sources -- such as the distribution of talent among financial analysts -- based on few observations from each source, he tends to exaggerate how much variance there is in the rates. Hence, the model predicts that people may pay for financial advice from "experts" whose expertise is entirely illusory. Other economic applications are discussed.
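One stylized way to formalize such beliefs (an illustrative sketch; the paper's model is richer, for example the hypothetical urn is periodically refreshed) is to replace the i.i.d. likelihood with a without-replacement urn likelihood:

    from math import comb

    def likelihood_iid(k, n, theta):
        # True i.i.d. likelihood of k successes in n draws at rate theta.
        return comb(n, k) * theta**k * (1.0 - theta)**(n - k)

    def likelihood_small_numbers(k, n, theta, N=10):
        # A believer in the law of small numbers acts as if the n signals
        # were drawn without replacement from an urn of N signals, of which
        # round(theta * N) are successes (hypergeometric likelihood).
        good = round(theta * N)
        if k > good or n - k > N - good:
            return 0.0
        return comb(good, k) * comb(N - good, n - k) / comb(N, n)

With N = 10, three successes in three draws give the urn believer a likelihood ratio of 35:1 between rates 0.7 and 0.3, versus about 12.7:1 under the true i.i.d. model, so his posterior over the rate is too extreme: the over-inference described above.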
A Philosophical Treatise of Universal Induction
Understanding inductive reasoning is a problem that has engaged mankind for
thousands of years. This problem is relevant to a wide range of fields and is
integral to the philosophy of science. It has been tackled by many great minds
ranging from philosophers to scientists to mathematicians, and more recently
computer scientists. In this article we argue the case for Solomonoff
Induction, a formal inductive framework which combines algorithmic information
theory with the Bayesian framework. Although it achieves excellent theoretical
results and is based on solid philosophical foundations, the requisite
technical knowledge necessary for understanding this framework has caused it to
remain largely unknown and unappreciated in the wider scientific community. The
main contribution of this article is to convey Solomonoff induction and its
related concepts in a generally accessible form with the aim of bridging this
current technical gap. In the process we examine the major historical
contributions that have led to the formulation of Solomonoff Induction as well
as criticisms of Solomonoff and induction in general. In particular we examine
how Solomonoff induction addresses many issues that have plagued other
inductive systems, such as the black ravens paradox and the confirmation
problem, and compare this approach with other recent approaches.

Comment: 72 pages, 2 figures, 1 table, LaTeX
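The framework's two core objects can be stated compactly (standard notation, assumed here rather than drawn from the article): Solomonoff's universal prior weights every program that reproduces the observed data by its length, and prediction proceeds by conditioning,

\[
M(x) \;=\; \sum_{p \,:\, U(p) = x\ast} 2^{-\ell(p)},
\qquad
M(1 \mid x) \;=\; \frac{M(x1)}{M(x)},
\]

where U is a universal monotone machine, \ell(p) is the length in bits of program p, and U(p) = x* means the output of p begins with x. Complex explanations are penalized exponentially in their description length, which is how the framework combines Occam's razor with Bayesian updating.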
Asymptotic theorems of sequential estimation-adjusted urn models
The Generalized Pólya Urn (GPU) is a popular urn model which is widely
used in many disciplines. In particular, it is extensively used in treatment
allocation schemes in clinical trials. In this paper, we propose a sequential
estimation-adjusted urn model (a nonhomogeneous GPU) which has a wide spectrum
of applications. Because the proposed urn model depends on sequential
estimations of unknown parameters, the derivation of asymptotic properties is
mathematically intricate and the corresponding results are unavailable in the
literature. We overcome these hurdles and establish the strong consistency and
asymptotic normality for both the patient allocation and the estimators of
unknown parameters, under some widely satisfied conditions. These properties
are important for statistical inferences and they are also useful for the
understanding of the urn limiting process. A superior feature of our proposed
model is its capability to yield limiting treatment proportions according to
any desired allocation target. The applicability of our model is illustrated
with a number of examples.

Comment: Published at http://dx.doi.org/10.1214/105051605000000746 in the
Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute
of Mathematical Statistics (http://www.imstat.org)
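A toy nonhomogeneous urn in the spirit of the abstract (a hedged sketch, not the authors' scheme; the smoothing and the target function are assumptions of this example) shows allocation proportions converging to an estimate-driven target:

    import numpy as np

    rng = np.random.default_rng(0)

    def seu_trial(n_patients=2000, p=(0.7, 0.5)):
        # Two-arm trial: sample an arm in proportion to the urn composition,
        # observe a Bernoulli response, then add balls according to a target
        # allocation computed from current success-rate estimates
        # (illustrative target: sqrt(p_hat) normalized across arms).
        urn = np.array([1.0, 1.0])
        successes, trials = np.zeros(2), np.zeros(2)
        for _ in range(n_patients):
            arm = int(rng.random() < urn[1] / urn.sum())
            successes[arm] += rng.random() < p[arm]
            trials[arm] += 1
            est = (successes + 0.5) / (trials + 1.0)    # smoothed estimates
            target = np.sqrt(est) / np.sqrt(est).sum()  # desired proportions
            urn += target                               # estimation-adjusted step
        return trials / trials.sum()

With the rates above, the realized allocation approaches sqrt(0.7)/(sqrt(0.7)+sqrt(0.5)), roughly 0.54 on the better arm, illustrating how such an urn can be steered toward essentially any desired allocation target.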