Algorithmic statistics: forty years later
Algorithmic statistics has two different (and almost orthogonal) motivations.
From the philosophical point of view, it tries to formalize how statistics
works and why some statistical models are better than others. After this notion
of a "good model" is introduced, a natural question arises: is it possible that
for some piece of data there is no good model? If so, how often do such bad
("non-stochastic") data appear "in real life"?
Another, more technical motivation comes from algorithmic information theory.
In this theory a notion of complexity of a finite object (=amount of
information in this object) is introduced; it assigns to every object some
number, called its algorithmic complexity (or Kolmogorov complexity).
Algorithmic statistics provides a more fine-grained classification: for each
finite object a curve is defined that characterizes, for that object, the
trade-off between model complexity and model quality. It turns out that several
different definitions give (approximately) the same curve.
In this survey we try to provide an exposition of the main results in the
field (including full proofs for the most important ones), as well as some
historical comments. We assume that the reader is familiar with the main
notions of algorithmic information (Kolmogorov complexity) theory.
Comment: Missing proofs added
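For reference, a standard formulation of this notion (textbook material, not
specific to this survey): fixing an optimal machine $U$, the plain Kolmogorov
complexity of a string $x$ is the length of its shortest program,
$$C(x) = \min \{\, |p| : U(p) = x \,\},$$
and by the invariance theorem the choice of $U$ affects $C(x)$ only up to an
additive constant.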
Algorithmic statistics revisited
The mission of statistics is to provide adequate statistical hypotheses
(models) for observed data. But what is an "adequate" model? To answer this
question, one needs to use the notions of algorithmic information theory. It
turns out that for every data string one can naturally define a
"stochasticity profile", a curve that represents the trade-off between the complexity
of a model and its adequacy. This curve has four different equivalent
definitions in terms of (1)~randomness deficiency, (2)~minimal description
length, (3)~position in the lists of simple strings and (4)~Kolmogorov
complexity with decompression time bounded by busy beaver function. We present
a survey of the corresponding definitions and results relating them to each
other.
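For concreteness, here is a sketch of the first two definitions in their usual
textbook form (the notation is mine, not quoted from the paper). For a finite
set $S$ containing $x$, viewed as a model, the randomness deficiency of $x$ in
$S$ is
$$d(x \mid S) = \log_2 |S| - C(x \mid S),$$
and the stochasticity profile maps each complexity budget $\alpha$ to the best
achievable deficiency:
$$h_x(\alpha) = \min \{\, d(x \mid S) : x \in S,\ C(S) \le \alpha \,\}.$$
The minimal description length variant instead tracks
$\min \{\, C(S) + \log_2 |S| : x \in S,\ C(S) \le \alpha \,\}$ and compares it
with $C(x)$; the resulting curves agree up to logarithmic terms.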
Around Kolmogorov complexity: basic notions and results
Algorithmic information theory studies description complexity and randomness
and is now a well known field of theoretical computer science and mathematical
logic. There are several textbooks and monographs devoted to this theory where
one can find the detailed exposition of many difficult results as well as
historical references. However, a short survey of its basic notions and of the
main results relating these notions to each other seems to be missing.
This report attempts to fill this gap and covers the basic notions of
algorithmic information theory: Kolmogorov complexity (plain, conditional,
prefix), Solomonoff universal a priori probability, notions of randomness
(Martin-L\"of randomness, Mises--Church randomness), effective Hausdorff
dimension. We prove their basic properties (symmetry of information, connection
between a priori probability and prefix complexity, criterion of randomness in
terms of complexity, complexity characterization for effective dimension) and
show some applications (incompressibility method in computational complexity
theory, incompleteness theorems). It is based on the lecture notes of a course
at Uppsala University given by the author.
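For orientation, the results named above have the following compact forms
(standard statements, with precision as proved in such notes):
$$C(x, y) = C(x) + C(y \mid x) + O(\log C(x, y)) \qquad \text{(symmetry of information)},$$
$$K(x) = -\log_2 \mathbf{m}(x) + O(1) \qquad \text{(prefix complexity vs. a priori probability)},$$
$$\omega \ \text{is Martin-L\"of random} \iff \exists c\ \forall n\ \ K(\omega_1 \ldots \omega_n) \ge n - c \qquad \text{(randomness criterion)}.$$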
alphaCertified: certifying solutions to polynomial systems
Smale's alpha-theory uses estimates related to the convergence of Newton's
method to give criteria implying that Newton iterations will converge
quadratically to solutions to a square polynomial system. The program
alphaCertified implements algorithms based on alpha-theory to certify solutions
to polynomial systems using both exact rational arithmetic and arbitrary
precision floating point arithmetic. It also implements an algorithm to certify
whether a given point corresponds to a real solution to a real polynomial
system, as well as algorithms to heuristically validate solutions to
overdetermined systems. Examples are presented to demonstrate the algorithms.
Comment: 21 pages
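To convey the flavor of an alpha-test, here is a minimal univariate sketch in
Python (an illustration of the idea only: alphaCertified itself certifies
multivariate square systems and, unlike this floating-point toy, offers exact
rational arithmetic). It checks Smale's condition that alpha = beta * gamma is
below the threshold (13 - 3*sqrt(17))/4, which implies quadratic convergence of
Newton iterations from the given point.

    # Illustrative univariate alpha-test (a sketch, not alphaCertified's algorithm).
    import math

    def derivatives(coeffs, x):
        """Return [f(x), f'(x), f''(x), ...] for f given by coeffs[i] = coeff of x^i."""
        derivs, c = [], list(coeffs)
        for _ in range(len(coeffs)):
            derivs.append(sum(ci * x**i for i, ci in enumerate(c)))
            c = [i * ci for i, ci in enumerate(c)][1:]  # differentiate once
        return derivs

    def alpha_test(coeffs, x):
        d = derivatives(coeffs, x)
        if d[1] == 0:
            return False                      # singular derivative: test fails
        beta = abs(d[0] / d[1])               # length of the Newton step
        gamma = max((abs(d[k] / (math.factorial(k) * d[1])) ** (1.0 / (k - 1))
                     for k in range(2, len(d)) if d[k] != 0), default=0.0)
        return beta * gamma < (13 - 3 * math.sqrt(17)) / 4   # threshold ~ 0.1577

    # Example: f(x) = x^2 - 2 near 1.4; Newton converges quadratically to sqrt(2).
    print(alpha_test([-2, 0, 1], 1.4))        # True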
On Restricted Nonnegative Matrix Factorization
Nonnegative matrix factorization (NMF) is the problem of decomposing a given
nonnegative matrix $M$ into a product $M = W \cdot H$ of a nonnegative matrix
$W$ and a nonnegative matrix $H$. Restricted NMF requires in addition that the
column spaces of $M$ and $W$ coincide. Finding the minimal inner dimension is
known to be NP-hard, both for NMF and restricted NMF. We show that restricted
NMF is closely related to a question about the nature of minimal probabilistic
automata, posed by Paz in his seminal 1971 textbook. We use this connection to
answer Paz's question negatively, thus falsifying a positive answer claimed in
1974. Furthermore, we investigate whether a rational matrix $M$ always has a
restricted NMF of minimal inner dimension whose factors $W$ and $H$ are also
rational. We show that this holds for matrices $M$ of rank at most $3$ and we
exhibit a rank-$4$ matrix $M$ for which $W$ and $H$ require irrational entries.
Comment: Full version of an ICALP'16 paper
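As a concrete reading of these definitions (a numpy sketch with a toy matrix of
my own, not an example from the paper): $M = W \cdot H$ with entrywise
nonnegative factors is an NMF, and it is restricted when the column space of
$W$ equals that of $M$, which can be tested by comparing ranks.

    # Sketch: checking a candidate (restricted) NMF; toy data, not from the paper.
    import numpy as np

    def is_nmf(M, W, H, tol=1e-9):
        """M = W @ H with W and H entrywise nonnegative."""
        return (W >= -tol).all() and (H >= -tol).all() and np.allclose(M, W @ H, atol=tol)

    def is_restricted_nmf(M, W, H, tol=1e-9):
        """Additionally require col span(W) = col span(M): equal ranks of M, W, [M W]."""
        if not is_nmf(M, W, H, tol):
            return False
        r = np.linalg.matrix_rank
        return r(np.hstack([M, W])) == r(M) == r(W)

    M = np.array([[1., 0.], [0., 1.], [1., 1.]])
    W, H = M.copy(), np.eye(2)                # trivial factorization M = M @ I
    print(is_nmf(M, W, H), is_restricted_nmf(M, W, H))   # True True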
Wireless Network Information Flow: A Deterministic Approach
In a wireless network with a single source and a single destination and an
arbitrary number of relay nodes, what is the maximum rate of information flow
achievable? We make progress on this long-standing problem through a two-step
approach. First we propose a deterministic channel model which captures the key
wireless properties of signal strength, broadcast and superposition. We obtain
an exact characterization of the capacity of a network with nodes connected by
such deterministic channels. This result is a natural generalization of the
celebrated max-flow min-cut theorem for wired networks. Second, we use the
insights obtained from the deterministic analysis to design a new
quantize-map-and-forward scheme for Gaussian networks. In this scheme, each
relay quantizes the received signal at the noise level and maps it to a random
Gaussian codeword for forwarding, and the final destination decodes the
source's message based on the received signal. We show that, in contrast to
existing schemes, this scheme can achieve the cut-set upper bound to within a
gap which is independent of the channel parameters. In the case of the relay
channel with a single relay as well as the two-relay Gaussian diamond network,
the gap is 1 bit/s/Hz. Moreover, the scheme is universal in the sense that the
relays need no knowledge of the values of the channel parameters to
(approximately) achieve the rate supportable by the network. We also present
extensions of the results to multicast networks, half-duplex networks and
ergodic networks.
Comment: To appear in IEEE Transactions on Information Theory, Vol. 57, No. 4, April 2011
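Concretely, the deterministic model replaces each Gaussian link by a number $n$
of bit levels that survive above the noise; a node receives the modulo-2 (XOR)
superposition of suitably shifted bit vectors. A small Python sketch of
reception at one node (the variable names and example values are mine):

    # Sketch of reception in the linear deterministic channel model.
    import numpy as np

    q = 5  # total number of bit levels in the model

    def shift_matrix(q, n):
        """S^(q-n): keeps the n most significant bits, shifted down by q - n levels."""
        S = np.zeros((q, q), dtype=int)
        for i in range(n):
            S[q - n + i, i] = 1
        return S

    def receive(inputs):
        """inputs: (x, n) pairs, x a length-q 0/1 vector, n the link's bit levels."""
        y = np.zeros(q, dtype=int)
        for x, n in inputs:
            y = (y + shift_matrix(q, n) @ x) % 2   # superposition is XOR
        return y

    x1 = np.array([1, 0, 1, 1, 0])   # strong link: all n = 5 levels received
    x2 = np.array([1, 1, 0, 0, 0])   # weak link: only the top n = 2 bits survive
    print(receive([(x1, 5), (x2, 2)]))   # [1 0 1 0 1]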
Prefix Codes for Power Laws with Countable Support
In prefix coding over an infinite alphabet, methods that consider specific
distributions generally consider those that decline more quickly than a power
law (e.g., Golomb coding). Particular power-law distributions, however, model
many random variables encountered in practice. For such random variables,
compression performance is judged via estimates of expected bits per input
symbol. This correspondence introduces a family of prefix codes with an eye
towards near-optimal coding of known distributions. Compression performance is
precisely estimated for well-known probability distributions using these codes
and using previously known prefix codes. One application of these near-optimal
codes is an improved representation of rational numbers.
Comment: 5 pages, 2 tables, submitted to Transactions on Information Theory
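For intuition about prefix codes on a countable support, the classical Elias
gamma code serves as a point of comparison (a standard construction, not one of
the codes introduced in this correspondence): it spends about $2\log_2 n + 1$
bits on the integer $n$, which suits a power law $P(n) \propto 1/n^2$ up to a
constant factor.

    # Elias gamma code: a classical prefix code for positive integers,
    # shown as a standard baseline (not the codes from this correspondence).

    def gamma_encode(n):
        """Encode n >= 1: (bit-length - 1) zeros, then the binary expansion of n."""
        b = bin(n)[2:]                  # binary expansion, leading bit is 1
        return "0" * (len(b) - 1) + b   # about 2*log2(n) + 1 bits in total

    def gamma_decode(bits):
        """Decode one codeword from the front of a bit string; return (n, rest)."""
        zeros = 0
        while bits[zeros] == "0":
            zeros += 1
        return int(bits[zeros:2 * zeros + 1], 2), bits[2 * zeros + 1:]

    code = gamma_encode(13)             # '0001101'
    print(code, gamma_decode(code))     # 0001101 (13, '')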