1,907 research outputs found
Causal Inference by Stochastic Complexity
The algorithmic Markov condition states that the most likely causal direction
between two random variables X and Y can be identified as that direction with
the lowest Kolmogorov complexity. Due to the halting problem, however, this
notion is not computable.
We hence propose to do causal inference by stochastic complexity. That is, we
propose to approximate Kolmogorov complexity via the Minimum Description Length
(MDL) principle, using a score that is minimax optimal with regard to the
model class under consideration. This means that even in an adversarial
setting, such as when the true distribution is not in this class, we still
obtain the optimal encoding for the data relative to the class.
We instantiate this framework, which we call CISC, for pairs of univariate
discrete variables, using the class of multinomial distributions. Experiments
show that CISC is highly accurate on synthetic, benchmark, and real-world
data, outperforming the state of the art by a clear margin, and scales
extremely well with regard to sample and domain sizes.
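The core decision rule is compact enough to sketch. The following Python
sketch (our illustration; function names are ours, and ties and edge cases
are glossed over) computes the multinomial stochastic complexity via the
linear-time normalizing-sum recurrence of Kontkanen and Myllymäki, and
prefers the causal direction with the lower total code length,
SC(X) + SC(Y | X) versus SC(Y) + SC(X | Y).

```python
import math
from collections import Counter

def log_multinomial_complexity(n, K):
    """log2 of the multinomial NML normalizing sum C(n, K), via the
    linear-time recurrence C(n, K+2) = C(n, K+1) + (n/K) * C(n, K)
    of Kontkanen & Myllymaki (2007)."""
    if n == 0 or K <= 1:
        return 0.0
    c2 = sum(math.comb(n, h) * (h / n) ** h * ((n - h) / n) ** (n - h)
             for h in range(n + 1))
    prev, curr = 1.0, c2              # C(n, 1) = 1, C(n, 2) = c2
    for k in range(1, K - 1):
        prev, curr = curr, curr + (n / k) * prev
    return math.log2(curr)

def stochastic_complexity(xs, K):
    """NML code length: maximum-likelihood code length plus regret."""
    n = len(xs)
    loglik = -sum(c * math.log2(c / n) for c in Counter(xs).values())
    return loglik + log_multinomial_complexity(n, K)

def cisc_direction(x, y):
    """Prefer the direction with the lower total stochastic complexity."""
    Kx, Ky = len(set(x)), len(set(y))
    group = lambda a, b: [[bj for aj, bj in zip(a, b) if aj == v]
                          for v in set(a)]
    x_to_y = stochastic_complexity(x, Kx) + \
        sum(stochastic_complexity(g, Ky) for g in group(x, y))
    y_to_x = stochastic_complexity(y, Ky) + \
        sum(stochastic_complexity(g, Kx) for g in group(y, x))
    return "X -> Y" if x_to_y < y_to_x else "Y -> X"
```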
Determining Principal Component Cardinality through the Principle of Minimum Description Length
PCA (Principal Component Analysis) and its variants are ubiquitous
techniques for matrix dimension reduction and reduced-dimension
latent-factor extraction. One significant challenge in using PCA is the
choice of the number of principal components. The information-theoretic MDL
(Minimum Description Length) principle gives objective compression-based
criteria for model selection, but it is difficult to analytically apply its
modern definition, NML (Normalized Maximum Likelihood), to the problem of
PCA. This work shows a general reduction of NML problems to lower-dimension
problems. Applying this reduction, it bounds the NML of PCA in terms of the
NML of linear regression, which is known.
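The paper's NML bound for PCA is involved; as a simpler point of reference,
the classical Wax-Kailath MDL criterion below (a two-part-code relative of
the NML approach, not the bound derived in this work) selects the component
count from the covariance eigenvalues.

```python
import numpy as np

def mdl_num_components(X):
    """Wax-Kailath (1985) MDL estimate of the number of principal
    components: n*(p-k)*log(AM/GM of the smallest p-k eigenvalues)
    plus a (1/2)*k*(2p-k)*log(n) parameter cost, minimized over k."""
    n, p = X.shape
    lam = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]  # descending
    lam = np.maximum(lam, 1e-12)      # guard against numerical zeros
    best_k, best_len = 0, np.inf
    for k in range(p):
        tail = lam[k:]
        # log ratio of arithmetic to geometric mean of tail eigenvalues
        log_ratio = np.log(tail.mean()) - np.log(tail).mean()
        code_len = n * (p - k) * log_ratio + 0.5 * k * (2 * p - k) * np.log(n)
        if code_len < best_len:
            best_k, best_len = k, code_len
    return best_k

# Example: 500 samples of a rank-3 signal in 10 dimensions plus noise
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3)) @ rng.normal(size=(3, 10)) \
    + 0.1 * rng.normal(size=(500, 10))
print(mdl_num_components(X))          # typically recovers 3
```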
Results on the Redundancy of Universal Compression for Finite-Length Sequences
In this paper, we investigate the redundancy of universal coding schemes on
smooth parametric sources in the finite-length regime. We derive an upper
bound on the probability of the event that a sequence of length n, chosen
using Jeffreys' prior from the family of parametric sources with d unknown
parameters, is compressed with a redundancy smaller than
(1 - ε)(d/2) log n, for any ε > 0. Our results also confirm
that for large enough n and d, the average minimax redundancy provides a
good estimate for the redundancy of most sources. Our result may be used to
evaluate the performance of universal source coding schemes on finite-length
sequences. Additionally, we precisely characterize the minimax redundancy
for two-stage codes. We demonstrate that the two-stage assumption incurs a
negligible redundancy, especially when the number of source parameters is
large. Finally, we show that the redundancy is significant in the
compression of small sequences.
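To make the (d/2) log n redundancy scaling concrete, the short check below
(our own illustration, for the Bernoulli family with d = 1) compares the
exact Shtarkov/NML minimax regret against the first-order asymptotic
formula (d/2) log(n / 2π) + log ∫ √(det I(θ)) dθ.

```python
import math

def bernoulli_minimax_regret(n):
    """Exact minimax (Shtarkov / NML) regret of the Bernoulli family, in
    bits: log2 sum_k C(n,k) (k/n)^k ((n-k)/n)^(n-k), in log space."""
    def log_term(k):
        log_comb = (math.lgamma(n + 1) - math.lgamma(k + 1)
                    - math.lgamma(n - k + 1))
        log_ml = (k * math.log(k / n) if k else 0.0) + \
                 ((n - k) * math.log((n - k) / n) if n - k else 0.0)
        return log_comb + log_ml
    m = max(log_term(k) for k in range(n + 1))   # log-sum-exp for stability
    return (m + math.log(sum(math.exp(log_term(k) - m)
                             for k in range(n + 1)))) / math.log(2)

def asymptotic_regret(n, d=1):
    """(d/2) log2(n / 2 pi) + log2 of the Fisher-information integral,
    which equals pi for the Bernoulli family."""
    return 0.5 * d * math.log2(n / (2 * math.pi)) + math.log2(math.pi)

for n in (10, 100, 1000, 10000):
    print(n, round(bernoulli_minimax_regret(n), 3),
          round(asymptotic_regret(n), 3))
```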
Constraining Implicit Space with Minimum Description Length: An Unsupervised Attention Mechanism across Neural Network Layers
Inspired by the adaptation phenomenon of neuronal firing, we propose the
regularity normalization (RN) as an unsupervised attention mechanism (UAM)
which computes the statistical regularity in the implicit space of neural
networks under the Minimum Description Length (MDL) principle. Treating the
neural network optimization process as a partially observable model selection
problem, UAM constrains the implicit space by a normalization factor, the
universal code length. We compute this universal code incrementally across
neural network layers and demonstrate the flexibility to include data priors
such as top-down attention and other oracle information. Empirically, our
approach outperforms existing normalization methods in tackling limited,
imbalanced, and non-stationary input distributions in image classification,
classic control, procedurally generated reinforcement learning, generative
modeling, handwriting generation, and question answering tasks with various
neural network architectures. Lastly, UAM tracks dependency and critical
learning stages across layers and recurrent time steps of deep networks.
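A toy rendering of the idea, assuming a Gaussian model class per layer and
a two-part-code proxy in place of the paper's NML computation (all names
and constants below are ours, not the authors'):

```python
import numpy as np

class RegularityNorm:
    """Toy sketch (ours, NOT the paper's exact scheme): scale a layer's
    activations by an incrementally updated code-length estimate, using a
    Gaussian model class and a two-part-code proxy for the NML regret."""

    def __init__(self, dim, eps=1e-6):
        self.t = 0                      # minibatches seen so far
        self.mean = np.zeros(dim)
        self.var = np.ones(dim)
        self.eps = eps

    def __call__(self, x):              # x: (batch, dim) activations
        self.t += 1
        v = self.var + self.eps
        # data cost: average negative log-likelihood (bits per element)
        # under the model fitted to activations seen *before* this batch
        nll = 0.5 * np.mean((x - self.mean) ** 2 / v
                            + np.log(2 * np.pi * v)) / np.log(2)
        # model cost: (d/2) log2 t, amortized over the batch elements
        code_len = nll + 0.5 * x.shape[1] * np.log2(self.t + 1) / x.size
        # update running statistics, then normalize by the code length
        lr = 1.0 / self.t
        self.mean = (1 - lr) * self.mean + lr * x.mean(axis=0)
        self.var = (1 - lr) * self.var + lr * x.var(axis=0)
        return x / max(code_len, self.eps)
```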
MDL Denoising Revisited
We refine and extend an earlier MDL denoising criterion for wavelet-based
denoising. We start by showing that the denoising problem can be reformulated
as a clustering problem, where the goal is to obtain separate clusters for
informative and non-informative wavelet coefficients, respectively. This
suggests two refinements, adding a code-length for the model index, and
extending the model in order to account for subband-dependent coefficient
distributions. A third refinement is the derivation of soft thresholding
inspired by predictive universal coding with weighted mixtures. We propose a
practical method incorporating all three refinements, which is shown to
achieve good performance and robustness in denoising both artificial and
natural signals.
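For orientation, the baseline that these refinements build on can be
sketched compactly: transform, then keep the k largest-magnitude wavelet
coefficients with k chosen by a Rissanen-style two-part code length. This
is our simplified rendering; the paper's refinements (model-index cost,
subband-dependent distributions, mixture-based soft thresholding) are
omitted.

```python
import numpy as np

def haar_dwt(x):
    """Full orthonormal Haar transform; len(x) must be a power of two."""
    c, details = np.asarray(x, dtype=float), []
    while len(c) > 1:
        s = (c[0::2] + c[1::2]) / np.sqrt(2.0)   # scaling coefficients
        d = (c[0::2] - c[1::2]) / np.sqrt(2.0)   # detail coefficients
        details.append(d)
        c = s
    return np.concatenate(details + [c])

def mdl_retained_count(coeffs):
    """Choose how many largest-magnitude coefficients to keep by minimizing
    a Rissanen-style two-part code length; the remaining coefficients are
    zeroed before the inverse transform."""
    n = len(coeffs)
    c2 = np.sort(np.asarray(coeffs) ** 2)[::-1]  # squared, descending
    csum = np.cumsum(c2)
    best_k, best_len = 1, np.inf
    for k in range(1, n - 1):
        s_in, s_out = csum[k - 1], csum[-1] - csum[k - 1]
        if s_out <= 0:
            break
        code_len = (0.5 * k * np.log(s_in / k)
                    + 0.5 * (n - k) * np.log(s_out / (n - k))
                    + 0.5 * np.log(k * (n - k)))
        if code_len < best_len:
            best_k, best_len = k, code_len
    return best_k
```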
Minimum Description Length Revisited
This is an up-to-date introduction to and overview of the Minimum Description Length (MDL) Principle, a theory of inductive inference that can be applied to general problems in statistics, machine learning, and pattern recognition. While MDL was originally based on data compression ideas, this introduction can be read without any knowledge thereof. It takes into account all major developments since 2007, the last time an extensive overview was written. These include new methods for model selection and averaging and hypothesis testing, as well as the first completely general definition of MDL estimators. Incorporating these developments, MDL can be seen as a powerful extension of both penalized likelihood and Bayesian approaches, in which penalization functions and prior distributions are replaced by more general luckiness functions, average-case methodology is replaced by a more robust worst-case approach, and in which methods classically viewed as highly distinct, such as AIC vs. BIC and cross-validation vs. Bayes, can to a large extent be viewed from a unified perspective.
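One concrete instance of the penalized-likelihood reading: with a crude
(d/2) log n cost for the quantized parameters, a two-part MDL code for
polynomial regression reduces to BIC. A minimal sketch (our illustration,
not taken from the text):

```python
import numpy as np

def two_part_mdl_degree(x, y, max_degree=8):
    """Pick a polynomial degree by a crude two-part code length:
    (n/2) log(RSS/n) for the data plus ((d+1)/2) log n for the parameters.
    With this quantization cost the criterion coincides with BIC."""
    n = len(x)
    best_d, best_len = 0, np.inf
    for d in range(max_degree + 1):
        coeffs = np.polyfit(x, y, d)
        rss = float(np.sum((np.polyval(coeffs, x) - y) ** 2))
        code_len = 0.5 * n * np.log(rss / n) + 0.5 * (d + 1) * np.log(n)
        if code_len < best_len:
            best_d, best_len = d, code_len
    return best_d

# Example: noisy cubic data; the criterion typically recovers degree 3
rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 200)
y = 2 * x ** 3 - x + 0.1 * rng.normal(size=x.size)
print(two_part_mdl_degree(x, y))
```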