31,281 research outputs found
Algorithmic and Statistical Perspectives on Large-Scale Data Analysis
In recent years, ideas from statistics and scientific computing have begun to
interact in increasingly sophisticated and fruitful ways with ideas from
computer science and the theory of algorithms to aid in the development of
improved worst-case algorithms that are useful for large-scale scientific and
Internet data analysis problems. In this chapter, I will describe two recent
examples---one having to do with selecting good columns or features from a (DNA
Single Nucleotide Polymorphism) data matrix, and the other having to do with
selecting good clusters or communities from a data graph (representing a social
or information network)---that drew on ideas from both areas and that may serve
as a model for exploiting complementary algorithmic and statistical
perspectives in order to solve applied large-scale data analysis problems.Comment: 33 pages. To appear in Uwe Naumann and Olaf Schenk, editors,
"Combinatorial Scientific Computing," Chapman and Hall/CRC Press, 201
Estimating the Algorithmic Complexity of Stock Markets
Randomness and regularities in Finance are usually treated in probabilistic
terms. In this paper, we develop a completely different approach in using a
non-probabilistic framework based on the algorithmic information theory
initially developed by Kolmogorov (1965). We present some elements of this
theory and show why it is particularly relevant to Finance, and potentially to
other sub-fields of Economics as well. We develop a generic method to estimate
the Kolmogorov complexity of numeric series. This approach is based on an
iterative "regularity erasing procedure" implemented to use lossless
compression algorithms on financial data. Examples are provided with both
simulated and real-world financial time series. The contributions of this
article are twofold. The first one is methodological : we show that some
structural regularities, invisible with classical statistical tests, can be
detected by this algorithmic method. The second one consists in illustrations
on the daily Dow-Jones Index suggesting that beyond several well-known
regularities, hidden structure may in this index remain to be identified
- …