Search CORE

3 research outputs found

The Deluge of Spurious Correlations in Big Data

Authors: Calude Cristian, Longo Giuseppe
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 15/10/2015

Very large databases are a major opportunity for science, and data analytics is a remarkable new field of investigation in computer science. The effectiveness of these tools is used to support a “philosophy” against the scientific method as developed throughout history. According to this view, computer-discovered correlations should replace understanding and guide prediction and action. Consequently, there will be no need to give scientific meaning to phenomena, by proposing, say, causal relations, since regularities in very large databases are enough: “with enough data, the numbers speak for themselves”. The “end of science” is proclaimed. Using classical results from ergodic theory, Ramsey theory and algorithmic information theory, we show that this “philosophy” is wrong. For example, we prove that very large databases have to contain arbitrary correlations. These correlations appear only due to the size, not the nature, of data. They can be found in “randomly” generated, large enough databases, which, as we will prove, implies that most correlations are spurious. Too much information tends to behave like very little information. The scientific method can be enriched by computer mining in immense databases, but not replaced by it.

HAL Descartes

Hal-Diderot

Kolmogorov complexity and computably enumerable sets

Authors: Barmpalias George, Li Angsheng
Publication venue
Publication date: 01/01/2013

We study the computably enumerable sets in terms of: (a) the Kolmogorov complexity of their initial segments; (b) the Kolmogorov complexity of finite programs when they are used as oracles. We present an extended discussion of the existing research on this topic, along with recent developments and open problems. Besides this survey, our main original result is the following characterization of the computably enumerable sets with trivial initial segment prefix-free complexity. A computably enumerable set A is K-trivial if and only if the family of sets with complexity bounded by the complexity of A is uniformly computable from the halting problem.

arXiv.org e-Print Archive

Institute Of Software, Chinese Academy Of Sciences

Kolmogorov complexity of enumerating finite sets

Author: Vereshchagin Nikolay Konstantinovich
Publication venue: 'Elsevier BV'
Publication date: 01/06/2007

In this paper, we show that the constant 3 in Solovay's inequality, which relates the negative logarithm of the a priori probability to Kolmogorov complexity for the problems of enumerating finite sets, can be replaced by the constant 2.

CWI's Institutional Repository