
    A tutorial introduction to the minimum description length principle

    This tutorial provides an overview of and introduction to Rissanen's Minimum Description Length (MDL) principle. The first chapter provides a conceptual, entirely non-technical introduction to the subject. It serves as the basis for the technical introduction given in the second chapter, in which all the ideas of the first chapter are made mathematically precise. The main ideas are discussed in considerable conceptual and technical detail. This tutorial is an extended version of the first two chapters of the collection "Advances in Minimum Description Length: Theory and Application" (edited by P. Grunwald, I. J. Myung and M. Pitt, to be published by the MIT Press, Spring 2005). Comment: 80 pages, 5 figures; report with 2 chapters.
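    The crude two-part form of MDL that such tutorials typically start from can be sketched concretely: a model with k real parameters costs roughly (k/2) log2 n bits, plus the negative log-likelihood of the data under the fitted model, and the model minimizing the total is selected. A minimal illustrative sketch under those assumptions (function names and the simplified parameter-cost term are ours, not the tutorial's):

    ```python
    import math

    def two_part_code_length(n, k, neg_log_lik):
        # Crude two-part MDL: ~ (k/2) log2 n bits to encode k real parameters
        # at optimal precision, plus the code length of the data given the model.
        return 0.5 * k * math.log2(n) + neg_log_lik

    def bernoulli_cost(bits):
        # One-parameter model: Bernoulli(p) with p fitted by maximum likelihood.
        n, ones = len(bits), sum(bits)
        p = ones / n
        nll = 0.0
        if 0 < ones < n:
            nll = -(ones * math.log2(p) + (n - ones) * math.log2(1 - p))
        return two_part_code_length(n, k=1, neg_log_lik=nll)

    def uniform_cost(bits):
        # Zero-parameter model: fair coin, exactly 1 bit per symbol.
        return two_part_code_length(len(bits), k=0, neg_log_lik=float(len(bits)))

    biased = [1] * 90 + [0] * 10
    # The biased sequence is cheaper under the Bernoulli model despite the parameter cost.
    assert bernoulli_cost(biased) < uniform_cost(biased)
    ```

    For a near-fair sequence the parameter cost is not repaid, and the zero-parameter model wins, which is exactly the overfitting protection MDL is designed to provide.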

    From Imitation to Prediction, Data Compression vs Recurrent Neural Networks for Natural Language Processing

    In recent studies [1][13][12], recurrent neural networks were used for generative processes, and their surprising performance can be explained by their ability to make good predictions. Data compression is likewise based on prediction. The problem therefore comes down to whether a data compressor could perform as well as recurrent neural networks on natural language processing tasks and, if so, whether a compression algorithm is even more intelligent than a neural network on specific tasks related to human language. In the course of this investigation we discovered what we believe is the fundamental difference between a data compression algorithm and a recurrent neural network.
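    The link between prediction and compression that this abstract relies on is the standard one: a predictor that assigns probability p to the next symbol yields a code of -log2 p bits for it (arithmetic coding achieves this up to rounding). A minimal sketch with an add-one-smoothed unigram predictor standing in for the compressor's or network's model (the toy predictor and names are our illustration, not the paper's method):

    ```python
    import math
    from collections import Counter

    def predictive_code_length(text):
        # Prequential coding: each symbol costs -log2 p(symbol | symbols seen so far),
        # here with add-one (Laplace) smoothed unigram counts over the text's alphabet.
        counts = Counter()
        alphabet = set(text)
        total_bits = 0.0
        for ch in text:
            p = (counts[ch] + 1) / (sum(counts.values()) + len(alphabet))
            total_bits += -math.log2(p)
            counts[ch] += 1
        return total_bits

    text = "aaaaaaaaaaaaaaab"  # highly predictable, so it compresses well
    # A good predictor beats the uniform baseline of log2|alphabet| bits per symbol.
    assert predictive_code_length(text) < len(text) * math.log2(len(set(text)))
    ```

    A stronger predictor (an n-gram model, or an RNN) lowers the same code-length sum further, which is the sense in which better prediction and better compression are the same problem.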

    MDL Denoising Revisited

    We refine and extend an earlier MDL denoising criterion for wavelet-based denoising. We start by showing that the denoising problem can be reformulated as a clustering problem, where the goal is to obtain separate clusters for informative and non-informative wavelet coefficients, respectively. This suggests two refinements: adding a code length for the model index, and extending the model to account for subband-dependent coefficient distributions. A third refinement is the derivation of a soft-thresholding rule inspired by predictive universal coding with weighted mixtures. We propose a practical method incorporating all three refinements, which is shown to achieve good performance and robustness in denoising both artificial and natural signals. Comment: Submitted to IEEE Transactions on Information Theory, June 200
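    For context, the generic soft-thresholding rule that the abstract's mixture-derived variant refines shrinks every wavelet coefficient toward zero by a threshold λ and zeroes the small ones treated as noise. A minimal sketch of that generic rule (not the paper's specific weighted-mixture derivation):

    ```python
    import math

    def soft_threshold(coeffs, lam):
        # Soft thresholding: shrink each coefficient's magnitude by lam,
        # mapping coefficients with |c| <= lam (presumed noise) to zero.
        return [math.copysign(max(abs(c) - lam, 0.0), c) for c in coeffs]

    # Large coefficients survive (shrunk by lam); small ones are zeroed.
    assert soft_threshold([3.0, -0.5, 1.5, -4.0], 1.0) == [2.0, 0.0, 0.5, -3.0]
    ```

    Unlike hard thresholding, this rule is continuous in the coefficient value, which is what makes it attractive for mixture-based and predictive-coding derivations.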

    Improving the minimum description length inference of phrase-based translation models

    We study the application of minimum description length (MDL) inference to estimate pattern recognition models for machine translation. MDL is a theoretically sound approach whose empirical results nevertheless fall below those of the state-of-the-art pipeline of training heuristics. We identify potential limitations of current MDL procedures and provide a practical approach to overcome them. Empirical results support the soundness of the proposed approach. The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-19390-8_25. Work supported by the EU 7th Framework Programme (FP7/2007–2013) under the CasMaCat project (grant agreement no. 287576), by Spanish MICINN under grant TIN2012-31723, and by the Generalitat Valenciana under grant ALMPR (Prometeo/2009/014). Gonzalez Rubio, J.; Casacuberta Nolla, F. (2015). Improving the minimum description length inference of phrase-based translation models. In Pattern Recognition and Image Analysis: 7th Iberian Conference, IbPRIA 2015, Santiago de Compostela, Spain, June 17-19, 2015, Proceedings. Springer International Publishing, 219-227. https://doi.org/10.1007/978-3-319-19390-8_25

    A Heuristic Method for Constructing Bayesian Networks

    The article describes a heuristic method for constructing the topology of a discrete Bayesian network. The method is based on minimizing the entropy of the input data: strongly connected nodes are identified using the mutual information between variables, and the best network structure is selected using a minimum description length criterion. (The same abstract is given in Ukrainian and Russian in the original.)
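    The mutual-information screening step described above can be sketched with a plug-in estimate from paired samples (a generic illustration of the metric, not the paper's exact procedure):

    ```python
    import math
    from collections import Counter

    def mutual_information(xs, ys):
        # Plug-in estimate of I(X;Y) = sum_{x,y} p(x,y) log2( p(x,y) / (p(x) p(y)) )
        # from aligned samples; high values flag strongly connected node pairs.
        n = len(xs)
        pxy = Counter(zip(xs, ys))
        px, py = Counter(xs), Counter(ys)
        return sum((c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
                   for (x, y), c in pxy.items())

    # Perfectly dependent binary variables carry I(X;Y) = H(X) = 1 bit...
    assert abs(mutual_information([0, 1, 0, 1], [0, 1, 0, 1]) - 1.0) < 1e-9
    # ...while independent ones carry (close to) 0 bits.
    assert abs(mutual_information([0, 0, 1, 1], [0, 1, 0, 1])) < 1e-9
    ```

    In a structure-learning heuristic of this kind, candidate edges would be ranked by this score and the surviving structures compared by their description length.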