Efficient algorithms for decision tree cross-validation
Cross-validation is a useful and generally applicable technique often
employed in machine learning, including decision tree induction. An important
disadvantage of straightforward implementation of the technique is its
computational overhead. In this paper we show that, for decision trees, the
computational overhead of cross-validation can be reduced significantly by
integrating the cross-validation with the normal decision tree induction
process. We discuss how existing decision tree algorithms can be adapted to
this aim, and provide an analysis of the speedups these adaptations may yield.
The analysis is supported by experimental results.
Comment: 9 pages, 6 figures.
http://www.cs.kuleuven.ac.be/cgi-bin-dtai/publ_info.pl?id=3478
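The "straightforward implementation" the abstract refers to retrains a tree from scratch on every fold. As background only (this is not the paper's optimized algorithm), a minimal pure-Python sketch of that baseline, using a one-level decision stump in place of a full tree for brevity:

```python
# Baseline k-fold cross-validation: each fold retrains the model from
# scratch, which is the computational overhead the paper reduces.

def kfold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def train_stump(X, y):
    """Pick the (feature, threshold) split minimising misclassifications."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[f] <= t]
            right = [yi for row, yi in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            pl = max(set(left), key=left.count)    # majority label, left
            pr = max(set(right), key=right.count)  # majority label, right
            errs = sum(yi != pl for yi in left) + sum(yi != pr for yi in right)
            if best is None or errs < best[0]:
                best = (errs, f, t, pl, pr)
    _, f, t, pl, pr = best
    return lambda row: pl if row[f] <= t else pr

def cross_validate(X, y, k=3):
    """k-fold CV accuracy; note the full retraining inside the loop."""
    folds = kfold_indices(len(X), k)
    correct = 0
    for fold in folds:
        test = set(fold)
        Xtr = [X[i] for i in range(len(X)) if i not in test]
        ytr = [y[i] for i in range(len(X)) if i not in test]
        model = train_stump(Xtr, ytr)
        correct += sum(model(X[i]) == y[i] for i in fold)
    return correct / len(X)
```

The paper's contribution is to avoid exactly this redundancy by sharing work (such as split-statistic computation) across the k training runs during a single induction pass.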
First-Order Decomposition Trees
Lifting attempts to speed up probabilistic inference by exploiting symmetries
in the model. Exact lifted inference methods, like their propositional
counterparts, work by recursively decomposing the model and the problem. In the
propositional case, there exist formal structures, such as decomposition trees
(dtrees), that represent such a decomposition and allow us to determine the
complexity of inference a priori. However, there is currently no equivalent
structure nor analogous complexity results for lifted inference. In this paper,
we introduce FO-dtrees, which upgrade propositional dtrees to the first-order
level. We show how these trees can characterize a lifted inference solution for
a probabilistic logical model (in terms of a sequence of lifted operations),
and provide a theoretical analysis of the complexity of lifted inference in
terms of the novel notion of lifted width for the tree.
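FO-dtrees and lifted width are formal notions defined in the paper itself. As a loose illustration of the propositional starting point only, the sketch below builds a decomposition tree over factors and measures a simple separator-based proxy for width: the number of variables a subtree shares with the rest of the model, which is the kind of quantity that drives inference complexity a priori.

```python
# Illustrative sketch (not the paper's formal definition): a dtree
# recursively splits a set of factors; we track, for every node, how many
# variables its subtree shares with the rest of the model.

class DTree:
    def __init__(self, left=None, right=None, factor=None):
        # A node is a leaf iff `factor` (a list of variable names) is set.
        self.left, self.right, self.factor = left, right, factor

    def variables(self):
        if self.factor is not None:
            return set(self.factor)
        return self.left.variables() | self.right.variables()

def separator_width(node, outside_vars):
    """Max separator size over the subtree rooted at `node`."""
    inside = node.variables()
    sep = len(inside & outside_vars)
    if node.factor is not None:
        return sep
    # Each child's "outside" is everything outside this node plus its sibling.
    left_out = outside_vars | node.right.variables()
    right_out = outside_vars | node.left.variables()
    return max(sep, separator_width(node.left, left_out),
               separator_width(node.right, right_out))

# A chain model A-B, B-C, C-D decomposed into a small dtree:
chain = DTree(DTree(factor=["A", "B"]),
              DTree(DTree(factor=["B", "C"]), DTree(factor=["C", "D"])))
```

The paper's lifted width plays the analogous role for FO-dtrees, where symmetric groups of variables are treated as a unit rather than enumerated.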
A Revised Publication Model for ECML PKDD
ECML PKDD is the main European conference on machine learning and data
mining. Since its foundation it implemented the publication model common in
computer science: there was one conference deadline; conference submissions
were reviewed by a program committee; papers were accepted with a low
acceptance rate. Proceedings were published in several Springer Lecture Notes
in Artificial Intelligence (LNAI) volumes, while selected papers were invited to special
issues of the Machine Learning and Data Mining and Knowledge Discovery
journals. In recent years, however, this model has come under stress. Problems
include: reviews are of highly variable quality; the purpose of bringing the
community together is lost; reviewing workloads are high; the information
content of conferences and journals decreases; there is confusion among
scientists in interdisciplinary contexts. In this paper, we present a new
publication model, which will be adopted for the ECML PKDD 2013 conference, and
aims to solve some of the problems of the traditional model. The key feature of
this model is the creation of a journal track, which is open to submissions all
year long and allows for revision cycles.
Comment: 13 pages
Experiment Databases: Creating a New Platform for Meta-Learning Research
Many studies in machine learning try to investigate what makes an algorithm succeed or fail on certain datasets. However, the field is still evolving relatively quickly, and new algorithms, preprocessing methods, learning tasks and evaluation procedures continue to emerge in the literature. Thus, it is impossible for a single study to cover this expanding space of learning approaches. In this paper, we propose a community-based approach for the analysis of learning algorithms, driven by sharing meta-data from previous experiments in a uniform way. We illustrate how organizing this information in a central database can create a practical public platform for any kind of exploitation of meta-knowledge, allowing effective reuse of previous experimentation and targeted analysis of the collected results.
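The central-database idea can be sketched in a few lines. The schema, algorithm names, and scores below are invented purely for illustration and are not the actual platform's design:

```python
# Hypothetical sketch: one row of shared meta-data per experiment, queried
# across contributors. Schema and scores are illustrative placeholders.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE experiments (
    dataset TEXT, algorithm TEXT, parameters TEXT,
    evaluation TEXT, score REAL)""")

runs = [  # made-up example rows, not real results
    ("iris",    "C4.5", "default", "10-fold CV accuracy", 0.94),
    ("iris",    "1-NN", "default", "10-fold CV accuracy", 0.96),
    ("soybean", "C4.5", "default", "10-fold CV accuracy", 0.91),
]
conn.executemany("INSERT INTO experiments VALUES (?,?,?,?,?)", runs)

# Meta-analysis example: best reported algorithm per dataset.
best = conn.execute("""SELECT dataset, algorithm, MAX(score)
                       FROM experiments GROUP BY dataset""").fetchall()
```

Because every experiment is described uniformly (dataset, algorithm, parameters, evaluation procedure), such queries can reuse results from many independent studies rather than rerunning them.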