133 research outputs found

MDL Denoising Revisited

Author: Myllymäki Petri
Rissanen Jorma
Roos Teemu
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 25/09/2006

We refine and extend an earlier MDL denoising criterion for wavelet-based denoising. We start by showing that the denoising problem can be reformulated as a clustering problem, where the goal is to obtain separate clusters for informative and non-informative wavelet coefficients, respectively. This suggests two refinements, adding a code-length for the model index, and extending the model in order to account for subband-dependent coefficient distributions. A third refinement is derivation of soft thresholding inspired by predictive universal coding with weighted mixtures. We propose a practical method incorporating all three refinements, which is shown to achieve good performance and robustness in denoising both artificial and natural signals. Comment: Submitted to IEEE Transactions on Information Theory, June 200

arXiv.org e-Print Archive

CiteSeerX

Crossref

NML Computation Algorithms for Tree-Structured Multinomial Bayesian Networks

Author: Kontkanen Petri
Myllymäki Petri
Wettig Hannes
Publication venue: BioMed Central
Publication date: 01/01/2007

Typical problems in bioinformatics involve large discrete datasets. Therefore, in order to apply statistical methods in such domains, it is important to develop efficient algorithms suitable for discrete data. The minimum description length (MDL) principle is a theoretically well-founded, general framework for performing statistical inference. The mathematical formalization of MDL is based on the normalized maximum likelihood (NML) distribution, which has several desirable theoretical properties. In the case of discrete data, straightforward computation of the NML distribution requires exponential time with respect to the sample size, since the definition involves a sum over all the possible data samples of a fixed size. In this paper, we first review some existing algorithms for efficient NML computation in the case of multinomial and naive Bayes model families. Then we proceed by extending these algorithms to more complex, tree-structured Bayesian networks
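To make the cost described in this abstract concrete, the following is a minimal Python sketch of the NML normalizing sum for a single multinomial variable. It is an illustration under stated assumptions, not the paper's algorithm for tree-structured networks. The function names are hypothetical. The brute-force version enumerates every possible data sample, which is the exponential-time computation the abstract refers to. The fast version relies on a recurrence over the number of categories that is known from the multinomial NML literature but is not stated in the abstract, so it is checked here against the brute-force sum.

```python
from collections import Counter
from itertools import product
from math import comb


def max_likelihood(sample):
    """Maximized multinomial likelihood of one sample (a tuple of symbols)."""
    n = len(sample)
    p = 1.0
    for count in Counter(sample).values():
        p *= (count / n) ** count
    return p


def nml_normalizer_brute(num_values, n):
    """Sum of maximized likelihoods over all num_values**n samples of size n."""
    return sum(max_likelihood(s) for s in product(range(num_values), repeat=n))


def nml_normalizer_fast(num_values, n):
    """Same quantity via a recurrence over the number of categories.

    Assumption (not from the abstract): C(K, n) = C(K-1, n) + n/(K-2) * C(K-2, n)
    with C(1, n) = 1 and C(2, n) computed as a sum over binomial counts.
    Requires n >= 1.
    """
    if num_values == 1:
        return 1.0
    prev = 1.0  # C(1, n)
    # C(2, n); note that 0.0 ** 0 evaluates to 1.0 in Python.
    curr = sum(
        comb(n, h) * (h / n) ** h * ((n - h) / n) ** (n - h)
        for h in range(n + 1)
    )
    for k in range(3, num_values + 1):
        prev, curr = curr, curr + n / (k - 2) * prev
    return curr


if __name__ == "__main__":
    for k, n in [(2, 2), (3, 3), (4, 5)]:
        print(k, n, nml_normalizer_brute(k, n), nml_normalizer_fast(k, n))
```

The brute-force function visits `num_values**n` samples, while the recurrence needs only on the order of `n + num_values` arithmetic steps, which is why grouping samples by their counts matters for discrete data.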

CiteSeerX

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

On the behavior of MDL denoising

Author: Myllymäki Petri
Roos Teemu
Tirri Henry
Publication date: 01/01/2005

Peer reviewed

Helsingin yliopiston digitaalinen arkisto

Proceedings of the Fifth European Workshop on Probabilistic Graphical Models

Author: Jaakkola Tommi
Myllymäki Petri
Roos Teemu
Publication date: 01/09/2010

Peer reviewed

Helsingin yliopiston digitaalinen arkisto

Mapping Bayesian Networks to Stochastic Neural Networks: A Foundation for Hybrid Bayesian-Neural Systems

Author: Myllymäki Petri
Publication venue: Helsingfors universitet
Publication date: 01/12/1995

Helsingin yliopiston digitaalinen arkisto

Learning Locally Minimax Optimal Bayesian Networks

Author: Petri Myllymäki
Teemu Roos
Tomi Silander
Publication date: 24/03/2010

We consider the problem of learning Bayesian network models in a non-informative setting, where the only available information is a set of observational data, and no background knowledge is available. The problem can be divided into two different subtasks: learning the structure of the network (a set of independence relations), and learning the parameters of the model (that fix the probability distribution from the set of all distributions consistent with the chosen structure). There are not many theoretical frameworks that consistently handle both these problems together, the Bayesian framework being an exception. In this paper we propose an alternative, information-theoretic framework which sidesteps some of the technical problems facing the Bayesian approach. The framework is based on the minimax-optimal Normalized Maximum Likelihood (NML) distribution, which is motivated by the Minimum Description Length (MDL) principle. The resulting model selection criterion is consistent, and it provides a way to construct highly predictive Bayesian network models. Our empirical tests show that the proposed method compares favorably with alternative approaches in both model selection and prediction tasks.

CiteSeerX

Elsevier - Publisher Connector

NEULA: a hybrid neural-symbolic expert system shell

Author: Floreen Patrik
Myllymäki Petri
Orponen Pekka
Tirri Henri
Publication date: 01/01/1992

Non peer reviewed

CiteSeerX

Helsingin yliopiston digitaalinen arkisto

Comparison of NML and Bayesian scoring criteria for learning parsimonious Markov models

Author: Eggeling Ralf
Grosse Ivo
Myllymäki Petri
Roos Teemu Teppo
Publication date: 01/01/2012

Parsimonious Markov models, a generalization of variable order Markov models, have been recently introduced for modeling biological sequences. Up to now, they have been learned by Bayesian approaches. However, there is not always sufficient prior knowledge available and a fully uninformative prior is difficult to define. In order to avoid cumbersome cross validation procedures for obtaining the optimal prior choice, we here adapt scoring criteria for Bayesian networks that approximate the Normalized Maximum Likelihood (NML) to parsimonious Markov models. We empirically compare their performance with the Bayesian approach by classifying splice sites, an important problem from computational biology. Non peer reviewed

Helsingin yliopiston digitaalinen arkisto

AS-ASL: Algorithm Selection with Auto-sklearn

Author: Järvisalo Matti
Kangas Kustaa
Koivisto Mikko
Malone Brandon
Myllymäki Petri
Publication date: 01/01/2017

Peer reviewed

Helsingin yliopiston digitaalinen arkisto

Empirical Hardness of Finding Optimal Bayesian Network Structures: Algorithm Selection and Runtime Prediction

Author: Järvisalo Matti
Kangas Kustaa
Koivisto Mikko
Malone Brandon
Myllymäki Petri
Publication date: 20/12/2017

Various algorithms have been proposed for finding a Bayesian network structure that is guaranteed to maximize a given scoring function. Implementations of state-of-the-art algorithms, solvers, for this Bayesian network structure learning problem rely on adaptive search strategies, such as branch-and-bound and integer linear programming techniques. Thus, the time requirements of the solvers are not well characterized by simple functions of the instance size. Furthermore, no single solver dominates the others in speed. Given a problem instance, it is thus a priori unclear which solver will perform best and how fast it will solve the instance. We show that for a given solver the hardness of a problem instance can be efficiently predicted based on a collection of non-trivial features which go beyond the basic parameters of instance size. Specifically, we train and test statistical models on empirical data, based on the largest evaluation of state-of-the-art exact solvers to date. We demonstrate that we can predict the runtimes to a reasonable degree of accuracy. These predictions enable effective selection of solvers that perform well in terms of runtimes on a particular instance. Thus, this work contributes a highly efficient portfolio solver that makes use of several individual solvers. Peer reviewed

Crossref

Helsingin yliopiston digitaalinen arkisto