3 research outputs found

    Pruning Rules for Learning Parsimonious Context Trees

    We give a novel algorithm for finding a parsimonious context tree (PCT) that best fits a given data set. PCTs extend traditional context trees by allowing context-specific grouping of the states of a context variable, and even skipping the variable entirely. However, they gain statistical efficiency at the cost of computational efficiency, as the search space of PCTs is of tremendous size. We propose pruning rules based on efficiently computable score upper bounds, with the aim of reducing this search space significantly. While our concrete bounds exploit properties of the BIC score, the ideas also apply to other scoring functions. Empirical results show that our algorithm is typically an order of magnitude faster than a recently proposed memory-intensive algorithm, or alternatively, about equally fast but using dramatically less memory.
    Peer reviewed
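    The pruning idea described above can be illustrated with a generic bound-and-prune loop over candidate structures. The sketch below is not the paper's algorithm: the BIC-style leaf score and the candidate/bound/score callables are hypothetical stand-ins, shown only to make the pruning pattern concrete.

        import math

        def bic_leaf_score(counts, n_total):
            # BIC-style score for one multinomial leaf: maximized log-likelihood
            # minus a complexity penalty (hypothetical helper, not the paper's exact score).
            n = sum(counts)
            if n == 0:
                return 0.0
            loglik = sum(c * math.log(c / n) for c in counts if c > 0)
            return loglik - 0.5 * (len(counts) - 1) * math.log(n_total)

        def bound_and_prune(candidates, upper_bound, exact_score):
            # Generic pruning loop: a candidate is evaluated exactly only if its
            # cheap upper bound can still beat the best exact score found so far.
            best_cand, best = None, -math.inf
            for cand in candidates:
                if upper_bound(cand) <= best:
                    continue                   # pruned without exact evaluation
                score = exact_score(cand)      # expensive exact scoring
                if score > best:
                    best_cand, best = cand, score
            return best_cand, best

    The effectiveness of such a loop hinges on the upper bound being both cheap to compute and tight enough to cut off large parts of the search space, which is exactly the trade-off the paper's BIC-based bounds target.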

    Algorithms for learning parsimonious context trees

    Parsimonious context trees (PCTs) provide a sparse parameterization of conditional probability distributions. They are particularly powerful for modeling context-specific independencies in sequential discrete data. Learning PCTs from data is computationally hard due to the combinatorial explosion of the space of model structures as the number of predictor variables grows. Under the score-and-search paradigm, the fastest algorithm for finding an optimal PCT prior to the present work is based on dynamic programming. While that algorithm handles small instances quickly, it becomes infeasible with as few as half a dozen four-state predictor variables. Here, we show that common scoring functions enable new algorithmic ideas which can significantly expedite the dynamic programming algorithm on typical data. Specifically, we introduce a memoization technique, which exploits regularities within the predictor variables by equating different contexts associated with the same data subset, and a bound-and-prune technique, which exploits regularities within the response variable by pruning parts of the search space based on score upper bounds. On real-world data from recent applications of PCTs in computational biology, the ideas are shown to reduce the traversed search space and the computation time by several orders of magnitude in typical cases.
    Peer reviewed
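    A minimal illustration of the memoization idea, assuming a context can be reduced to the set of data rows it selects: two syntactically different contexts that select the same rows then share one cached score. The name score_subtree and the frozenset keying are assumptions made for this sketch, not the paper's data structures.

        def memoized_scorer(score_subtree):
            # Cache keyed by the data subset itself (a frozenset of row indices),
            # so different contexts selecting identical rows are scored only once.
            # 'score_subtree' is a hypothetical expensive scoring callable.
            cache = {}

            def score(row_indices):
                key = frozenset(row_indices)
                if key not in cache:
                    cache[key] = score_subtree(key)
                return cache[key]

            return score

    Usage sketch: if two contexts, say one that groups states {1, 2} and one that skips the variable, happen to select exactly the same rows, the second call to score() returns the cached value instead of recomputing the subtree score.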

    Learning Bayesian networks with local structure, mixed variables, and exact algorithms

    Modern exact algorithms for structure learning in Bayesian networks first compute an exact local score for every candidate parent set, and then find a network structure by combinatorial optimization so as to maximize the global score. This approach assumes that each local score can be computed quickly, which is problematic when the scarcity of the data calls for structured local models or when there are both continuous and discrete variables, as these cases have lacked efficiently computable local scores. To address this challenge, we introduce a local score that is based on a class of classification and regression trees. We show that, under modest restrictions on the possible branchings in the tree structure, it is feasible to find a structure that maximizes a Bayes score in a range of moderate-size problem instances. In particular, this enables global optimization of the Bayesian network structure, including the local structure. In addition, we introduce a related model class that extends ordinary conditional probability tables to continuous variables by employing an adaptive discretization approach. The two model classes are compared empirically by learning Bayesian networks from benchmark real-world and synthetic data sets. We discuss the relative strengths of the model classes in terms of their structure learning capability, predictive performance, and running time. © 2019 The Authors. Published by Elsevier Inc.
    Peer reviewed
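    The adaptive discretization idea, extending a conditional probability table to a continuous parent, can be sketched as a penalized search over cut points. The helper below is a one-level illustration under assumed names and an assumed penalty term; it is not the score or procedure used in the paper.

        import math
        from collections import Counter

        def multinomial_loglik(labels):
            # Maximized multinomial log-likelihood of a discrete response.
            n = len(labels)
            return sum(c * math.log(c / n) for c in Counter(labels).values()) if n else 0.0

        def best_binary_cut(x, y, penalty=1.0):
            # One-level adaptive discretization sketch: try each midpoint between
            # consecutive sorted values of a continuous parent x, and keep a cut
            # only if splitting the discrete response y improves a penalized score.
            pairs = sorted(zip(x, y), key=lambda p: p[0])
            base = multinomial_loglik(y)
            best_cut, best_gain = None, 0.0
            for i in range(1, len(pairs)):
                if pairs[i - 1][0] == pairs[i][0]:
                    continue                              # no threshold between equal values
                cut = 0.5 * (pairs[i - 1][0] + pairs[i][0])
                left = [lab for val, lab in pairs if val <= cut]
                right = [lab for val, lab in pairs if val > cut]
                gain = multinomial_loglik(left) + multinomial_loglik(right) - base - penalty
                if gain > best_gain:
                    best_cut, best_gain = cut, gain
            return best_cut, best_gain

    Applied recursively, such data-driven cuts yield a discretization of the continuous parent that adapts to where the response distribution actually changes, rather than using fixed, equal-width bins.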