
    An Information-Theoretic Test for Dependence with an Application to the Temporal Structure of Stock Returns

    Information theory provides ideas for conceptualising information and measuring relationships between objects. It has found wide application in the sciences, but economics and finance have made surprisingly little use of it. We show that time series data can usefully be studied as information -- by noting the relationship between statistical redundancy and dependence, we are able to use the results of information theory to construct a test for joint dependence of random variables. The test is in the same spirit as those developed by Ryabko and Astola (2005, 2006b,a), but differs from these in that we add extra randomness to the original stochastic process. It uses data compression to estimate the entropy rate of a stochastic process, which allows it to measure dependence among sets of random variables, as opposed to the existing econometric literature that uses entropy and finds itself restricted to pairwise tests of dependence. We show how serial dependence may be detected in S&P500 and PSI20 stock returns over different sample periods and frequencies. We apply the test to synthetic data to judge its ability to recover known temporal dependence structures. (22 pages, 7 figures)
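
    As a rough illustration of the compression-based idea (not the authors' exact statistic), the sketch below compares how well a general-purpose compressor squeezes a discretised return series against shuffled surrogates; serial dependence shows up as the original compressing better than its permutations. The two-letter alphabet, permutation count, and helper names are illustrative assumptions, not taken from the paper.

        # Minimal sketch of a compression-based dependence check (assumed
        # construction, not the paper's test): a serially dependent sequence
        # should compress better than random shufflings of the same symbols.
        import random
        import zlib

        def compressed_bits(symbols):
            """Bits used by zlib to encode the symbol sequence."""
            return 8 * len(zlib.compress(bytes(symbols), 9))

        def dependence_pvalue(symbols, n_shuffles=200, seed=0):
            """Permutation p-value: share of shuffled copies that compress
            at least as well as the original sequence."""
            rng = random.Random(seed)
            baseline = compressed_bits(symbols)
            hits = 0
            for _ in range(n_shuffles):
                shuffled = list(symbols)
                rng.shuffle(shuffled)
                if compressed_bits(shuffled) <= baseline:
                    hits += 1
            return hits / n_shuffles

        # Example: quantise returns to a two-letter alphabet (sign of return);
        # a small p-value suggests temporal dependence.
        returns = [0.01, -0.02, 0.005, -0.01, 0.02, -0.015] * 50
        print(dependence_pvalue([1 if r > 0 else 0 for r in returns]))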

    Universal lossless source coding with the Burrows Wheeler transform

    The Burrows Wheeler transform (1994) is a reversible sequence transformation used in a variety of practical lossless source-coding algorithms. In each, the BWT is followed by a lossless source code that attempts to exploit the natural ordering of the BWT coefficients. BWT-based compression schemes are widely touted as low-complexity algorithms giving lossless coding rates better than those of the Ziv-Lempel codes (commonly known as LZ'77 and LZ'78) and almost as good as those achieved by prediction by partial matching (PPM) algorithms. To date, the coding performance claims have been made primarily on the basis of experimental results. This work gives a theoretical evaluation of BWT-based coding. The main results of this theoretical evaluation include: (1) statistical characterizations of the BWT output on both finite strings and sequences of length n → ∞, (2) a variety of very simple new techniques for BWT-based lossless source coding, and (3) proofs of the universality and bounds on the rates of convergence of both new and existing BWT-based codes for finite-memory and stationary ergodic sources. The end result is a theoretical justification and validation of the experimentally derived conclusions: BWT-based lossless source codes achieve universal lossless coding performance that converges to the optimal coding performance more quickly than the rate of convergence observed in Ziv-Lempel style codes and, for some BWT-based codes, within a constant factor of the optimal rate of convergence for finite-memory sources.
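
    For readers unfamiliar with the transform itself, here is a compact reference implementation of the forward and inverse BWT by sorting rotations; practical coders build the transform with suffix arrays and follow it with move-to-front and an entropy coder, none of which is shown here.

        # Minimal Burrows-Wheeler transform sketch: sort every rotation of the
        # input (terminated by a sentinel) and keep the last column.
        def bwt(s, sentinel="\0"):
            assert sentinel not in s
            s += sentinel
            rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
            return "".join(rot[-1] for rot in rotations)

        def inverse_bwt(last, sentinel="\0"):
            # Repeatedly prepend the last column and re-sort to rebuild the
            # sorted rotation table; the row ending in the sentinel is the input.
            table = [""] * len(last)
            for _ in range(len(last)):
                table = sorted(last[i] + table[i] for i in range(len(last)))
            return next(row for row in table if row.endswith(sentinel))[:-1]

        assert inverse_bwt(bwt("banana")) == "banana"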

    Less redundant codes for variable size dictionaries

    We report on a family of variable-length codes with less redundancy than the flat code used in most variable-size dictionary-based compression methods. The length of the codes belonging to this family is still bounded above by ⌈log₂ |D|⌉, where |D| denotes the dictionary size. We describe three of these codes, namely the balanced code, the phase-in-binary code (PB), and the depth-span code (DS). As the name implies, the balanced code is constructed from a height-balanced tree, so it has the shortest average codeword length. The coding tree for the PB code has the interesting property of being made of full binary phases, so the code can be computed efficiently using simple binary shift operations. The DS coding tree is maintained in such a way that the coder always finds the longest extendable codeword and extends it until it reaches the maximum length; it is optimal with respect to the code-length contrast. The PB and balanced codes give similar improvements, around 3% to 7%, which is very close to the relative redundancy of the flat code. The DS code is particularly good at dealing with files containing a large amount of redundancy, such as a long run of a single symbol. We also carried out an empirical study of the codeword distribution in the LZW dictionary and propose a scheme called dynamic block shifting (DBS) to further improve the codes' performance. Experiments suggest that DBS is helpful in compressing random sequences. From an application point of view, the PB code with DBS is recommended for general practical use.
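
    The exact constructions of the balanced, PB and DS codes are not reproduced here; as a hedged illustration of the saving over a flat code, the sketch below implements the closely related phase-in (truncated) binary code, which spends ⌊log₂ |D|⌋ bits on some indices and ⌈log₂ |D|⌉ on the rest. The function name and example dictionary size are illustrative.

        # Phase-in (truncated) binary code sketch for dictionary indices: for a
        # dictionary of size n, the first u = 2**(k+1) - n indices get
        # k = floor(log2 n) bits and the remaining ones get k + 1 bits, whereas
        # a flat code always spends ceil(log2 n) bits.
        def phase_in_encode(index, n):
            assert 0 <= index < n
            k = n.bit_length() - 1          # floor(log2 n)
            u = (1 << (k + 1)) - n          # number of short codewords
            if index < u:
                return format(index, "0{}b".format(k)) if k else ""
            return format(index + u, "0{}b".format(k + 1))

        # A 6-entry dictionary uses 2 bits for indices 0-1 and 3 bits for 2-5,
        # instead of a flat 3 bits for every index.
        print([phase_in_encode(i, 6) for i in range(6)])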

    One-pass adaptive universal vector quantization

    The authors introduce a one-pass adaptive universal quantization technique for real, bounded-alphabet, stationary sources. The algorithm operates on-line, without any prior knowledge of the statistics of the sources it might encounter, and asymptotically achieves ideal performance on all sources that it sees. The system consists of an encoder and a decoder. At increasing intervals, the encoder refines its codebook using knowledge of the incoming data symbols. This codebook is then described to the decoder in the form of updates to the previous codebook. The accuracy to which the codebook is described increases as the number of symbols seen, and thus the accuracy to which the codebook is known, grows.
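
    The sketch below is only a toy scalar version of the encode-then-refine loop described above, not the paper's algorithm or its codebook-description scheme: each incoming sample is quantised with the current codebook, and at exponentially spaced times the codebook is recomputed from the data seen so far; a real system would also describe those updates to the decoder.

        # Toy one-pass adaptive quantiser (assumed, simplified scalar version):
        # quantise each sample with the current codebook and refine the
        # codebook at increasing intervals from everything seen so far.
        def nearest(codebook, x):
            return min(range(len(codebook)), key=lambda i: abs(codebook[i] - x))

        def adaptive_quantize(stream, codebook_size=4):
            codebook = [i / codebook_size for i in range(codebook_size)]
            seen, indices, next_update = [], [], 1
            for t, x in enumerate(stream, start=1):
                indices.append(nearest(codebook, x))
                seen.append(x)
                if t == next_update:                  # refine at increasing intervals
                    assign = [nearest(codebook, v) for v in seen]
                    for i in range(len(codebook)):    # Lloyd-style centroid step
                        cell = [v for v, a in zip(seen, assign) if a == i]
                        if cell:
                            codebook[i] = sum(cell) / len(cell)
                    next_update *= 2                  # updates become less frequent
            return indices, codebook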

    Maximum Entropy Production Principle for Stock Returns

    In our previous studies we investigated the structural complexity of time series describing stock returns on New York's and Warsaw's stock exchanges, employing two estimators of Shannon's entropy rate based on the Lempel-Ziv and Context Tree Weighting algorithms, which were originally designed for data compression. Such structural complexity of the time series of logarithmic stock returns can be used as a measure of the inherent (model-free) predictability of the underlying price formation processes, testing the Efficient-Market Hypothesis in practice. We have also correlated the estimated predictability with the profitability of standard trading algorithms, and found that these do not use the structure inherent in the stock returns to any significant degree. To find a way to use the structural complexity of stock returns for prediction, we propose the Maximum Entropy Production Principle as applied to stock returns, and test it on the two mentioned markets, inquiring whether prediction of stock returns can be enhanced by combining their structural complexity with this principle. (14 pages, 5 figures)
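
    As a rough illustration of the entropy-rate estimators mentioned above, the sketch below uses zlib (an LZ77-family coder) as a stand-in for the Lempel-Ziv and Context Tree Weighting estimators; the binning scheme, alphabet size, and function name are illustrative assumptions rather than the papers' procedure.

        # Compression-based entropy-rate proxy for a discretised return series,
        # with zlib standing in for the Lempel-Ziv / CTW estimators (assumption).
        import zlib

        def entropy_rate_estimate(returns, n_bins=4):
            """Bits per symbol after quantising returns into n_bins equal-width
            bins; for long series, values near log2(n_bins) suggest little
            exploitable temporal structure, while patterned series score lower."""
            lo, hi = min(returns), max(returns)
            width = (hi - lo) / n_bins or 1.0
            symbols = bytes(min(int((r - lo) / width), n_bins - 1) for r in returns)
            return 8 * len(zlib.compress(symbols, 9)) / len(symbols)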

    A Progressive Universal Noiseless Coder

    The authors combine pruned tree-structured vector quantization (pruned TSVQ) with Itoh's (1987) universal noiseless coder. By combining pruned TSVQ with universal noiseless coding, they benefit from the “successive approximation” capabilities of TSVQ, thereby allowing progressive transmission of images, while retaining the ability to noiselessly encode images of unknown statistics in a provably asymptotically optimal fashion. Noiseless compression results are comparable to Ziv-Lempel and arithmetic coding for both images and finely quantized Gaussian sources.
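
    The sketch below shows only the successive-approximation flavour of tree-structured quantisation on a toy scalar example; the pruning step and the universal noiseless coding of the indices are left out, so it is not the authors' coder.

        # Toy tree-structured scalar quantiser on [0, 1): each index bit halves
        # the current cell, so any prefix of the index already gives a coarse
        # reconstruction -- the successive-approximation property that enables
        # progressive transmission.  Pruning and noiseless index coding omitted.
        def tsvq_encode(x, depth=4):
            lo, hi, bits = 0.0, 1.0, []
            for _ in range(depth):
                mid = (lo + hi) / 2
                if x < mid:
                    bits.append(0)
                    hi = mid
                else:
                    bits.append(1)
                    lo = mid
            return bits

        def tsvq_decode(bits):
            lo, hi = 0.0, 1.0
            for b in bits:
                mid = (lo + hi) / 2
                lo, hi = (lo, mid) if b == 0 else (mid, hi)
            return (lo + hi) / 2

        # Each extra bit refines the reconstruction of 0.63:
        print([tsvq_decode(tsvq_encode(0.63)[:k]) for k in range(1, 5)])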