Efficient algorithms for decision tree cross-validation
Cross-validation is a useful and generally applicable technique often
employed in machine learning, including decision tree induction. An important
disadvantage of straightforward implementation of the technique is its
computational overhead. In this paper we show that, for decision trees, the
computational overhead of cross-validation can be reduced significantly by
integrating the cross-validation with the normal decision tree induction
process. We discuss how existing decision tree algorithms can be adapted to
this aim, and provide an analysis of the speedups these adaptations may yield.
The analysis is supported by experimental results.
Comment: 9 pages, 6 figures.
http://www.cs.kuleuven.ac.be/cgi-bin-dtai/publ_info.pl?id=3478
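The "straightforward implementation" the abstract refers to retrains a tree from scratch on every fold. As background only (this is not the paper's optimized algorithm), a minimal pure-Python sketch of that baseline, using a one-level decision stump in place of a full tree for brevity:

```python
# Baseline k-fold cross-validation: each fold retrains the model from
# scratch, which is the computational overhead the paper reduces.

def kfold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def train_stump(X, y):
    """Pick the (feature, threshold) split minimising misclassifications."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[f] <= t]
            right = [yi for row, yi in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            pl = max(set(left), key=left.count)    # majority label, left
            pr = max(set(right), key=right.count)  # majority label, right
            errs = sum(yi != pl for yi in left) + sum(yi != pr for yi in right)
            if best is None or errs < best[0]:
                best = (errs, f, t, pl, pr)
    _, f, t, pl, pr = best
    return lambda row: pl if row[f] <= t else pr

def cross_validate(X, y, k=3):
    """k-fold CV accuracy; note the full retraining inside the loop."""
    folds = kfold_indices(len(X), k)
    correct = 0
    for fold in folds:
        test = set(fold)
        Xtr = [X[i] for i in range(len(X)) if i not in test]
        ytr = [y[i] for i in range(len(X)) if i not in test]
        model = train_stump(Xtr, ytr)
        correct += sum(model(X[i]) == y[i] for i in fold)
    return correct / len(X)
```

The paper's contribution is to avoid exactly this redundancy by sharing work (such as split-statistic computation) across the k training runs during a single induction pass.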
First-Order Decomposition Trees
Lifting attempts to speed up probabilistic inference by exploiting symmetries
in the model. Exact lifted inference methods, like their propositional
counterparts, work by recursively decomposing the model and the problem. In the
propositional case, there exist formal structures, such as decomposition trees
(dtrees), that represent such a decomposition and allow us to determine the
complexity of inference a priori. However, there is currently no equivalent
structure nor analogous complexity results for lifted inference. In this paper,
we introduce FO-dtrees, which upgrade propositional dtrees to the first-order
level. We show how these trees can characterize a lifted inference solution for
a probabilistic logical model (in terms of a sequence of lifted operations),
and provide a theoretical analysis of the complexity of lifted inference in
terms of the novel notion of lifted width for the tree.
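FO-dtrees and lifted width are formal notions defined in the paper itself. As a loose illustration of the propositional starting point only, the sketch below builds a decomposition tree over factors and measures a simple separator-based proxy for width: the number of variables a subtree shares with the rest of the model, which is the kind of quantity that drives inference complexity a priori.

```python
# Illustrative sketch (not the paper's formal definition): a dtree
# recursively splits a set of factors; we track, for every node, how many
# variables its subtree shares with the rest of the model.

class DTree:
    def __init__(self, left=None, right=None, factor=None):
        # A node is a leaf iff `factor` (a list of variable names) is set.
        self.left, self.right, self.factor = left, right, factor

    def variables(self):
        if self.factor is not None:
            return set(self.factor)
        return self.left.variables() | self.right.variables()

def separator_width(node, outside_vars):
    """Max separator size over the subtree rooted at `node`."""
    inside = node.variables()
    sep = len(inside & outside_vars)
    if node.factor is not None:
        return sep
    # Each child's "outside" is everything outside this node plus its sibling.
    left_out = outside_vars | node.right.variables()
    right_out = outside_vars | node.left.variables()
    return max(sep, separator_width(node.left, left_out),
               separator_width(node.right, right_out))

# A chain model A-B, B-C, C-D decomposed into a small dtree:
chain = DTree(DTree(factor=["A", "B"]),
              DTree(DTree(factor=["B", "C"]), DTree(factor=["C", "D"])))
```

The paper's lifted width plays the analogous role for FO-dtrees, where symmetric groups of variables are treated as a unit rather than enumerated.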
A Revised Publication Model for ECML PKDD
ECML PKDD is the main European conference on machine learning and data
mining. Since its foundation it implemented the publication model common in
computer science: there was one conference deadline; conference submissions
were reviewed by a program committee; papers were accepted with a low
acceptance rate. Proceedings were published in several Springer Lecture Notes
in Artificial Intelligence (LNAI) volumes, while selected papers were invited to special
issues of the Machine Learning and Data Mining and Knowledge Discovery
journals. In recent years, however, this model has come under stress. Problems
include: reviews are of highly variable quality; the purpose of bringing the
community together is lost; reviewing workloads are high; the information
content of conferences and journals decreases; there is confusion among
scientists in interdisciplinary contexts. In this paper, we present a new
publication model, which will be adopted for the ECML PKDD 2013 conference, and
aims to solve some of the problems of the traditional model. The key feature of
this model is the creation of a journal track, which is open to submissions all
year long and allows for revision cycles.
Comment: 13 pages
Experiment Databases: Creating a New Platform for Meta-Learning Research
Many studies in machine learning try to investigate what makes an algorithm succeed or fail on certain datasets. However, the field is still evolving relatively quickly, and new algorithms, preprocessing methods, learning tasks and evaluation procedures continue to emerge in the literature. Thus, it is impossible for a single study to cover this expanding space of learning approaches. In this paper, we propose a community-based approach for the analysis of learning algorithms, driven by sharing meta-data from previous experiments in a uniform way. We illustrate how organizing this information in a central database can create a practical public platform for any kind of exploitation of meta-knowledge, allowing effective reuse of previous experimentation and targeted analysis of the collected results.
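The central-database idea can be sketched in a few lines. The schema, algorithm names, and scores below are invented purely for illustration and are not the actual platform's design:

```python
# Hypothetical sketch: one row of shared meta-data per experiment, queried
# across contributors. Schema and scores are illustrative placeholders.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE experiments (
    dataset TEXT, algorithm TEXT, parameters TEXT,
    evaluation TEXT, score REAL)""")

runs = [  # made-up example rows, not real results
    ("iris",    "C4.5", "default", "10-fold CV accuracy", 0.94),
    ("iris",    "1-NN", "default", "10-fold CV accuracy", 0.96),
    ("soybean", "C4.5", "default", "10-fold CV accuracy", 0.91),
]
conn.executemany("INSERT INTO experiments VALUES (?,?,?,?,?)", runs)

# Meta-analysis example: best reported algorithm per dataset.
best = conn.execute("""SELECT dataset, algorithm, MAX(score)
                       FROM experiments GROUP BY dataset""").fetchall()
```

Because every experiment is described uniformly (dataset, algorithm, parameters, evaluation procedure), such queries can reuse results from many independent studies rather than rerunning them.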