1,880 research outputs found
A Bayes' Theorem Based Approach for the Selection of Best Pruned Tree
Decision tree pruning is critical for the construction of good decision trees. The most popular and widely used method among various pruning methods is cost-complexity pruning, whose implementation requires a training dataset to develop a full tree and a validation dataset to prune the tree. However, different pruned trees are found to be produced when the original dataset is randomly partitioned into different training and validation datasets. Which pruned tree is the best? This paper presents an approach derived from Bayes’ theorem to select the best pruned tree from a group of pruned trees produced by the cost-complexity pruning method. The results of an experimental study indicate that the proposed approach works satisfactorily to find the best pruned tree in terms of classification accuracy and performance stability.
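For reference, the cost-complexity criterion behind this pruning method is usually written as below. This is the standard CART formulation, not a formula taken from the abstract. The symbols are assumptions of that notation: R(T) is the misclassification cost of tree T, |T~| is its number of leaves, alpha >= 0 is the complexity parameter, and T_max is the full tree grown on the training data.

```latex
R_\alpha(T) = R(T) + \alpha \, |\tilde{T}|,
\qquad
T(\alpha) = \arg\min_{T \preceq T_{\max}} R_\alpha(T)
```

Because alpha is typically chosen by evaluating the candidate subtrees T(alpha) on the validation data, a different random partition can select a different subtree, which is the variability the abstract describes.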
WARP: Wavelets with adaptive recursive partitioning for multi-dimensional data
Effective identification of asymmetric and local features in images and other
data observed on multi-dimensional grids plays a critical role in a wide range
of applications including biomedical and natural image processing. Moreover,
the ever increasing amount of image data, in terms of both the resolution per
image and the number of images processed per application, requires algorithms
and methods for such applications to be computationally efficient. We develop a
new probabilistic framework for multi-dimensional data to overcome these
challenges through incorporating data adaptivity into discrete wavelet
transforms, thereby allowing them to adapt to the geometric structure of the
data while maintaining the linear computational scalability. By exploiting a
connection between the local directionality of wavelet transforms and recursive
dyadic partitioning on the grid points of the observation, we obtain the
desired adaptivity through adding to the traditional Bayesian wavelet
regression framework an additional layer of Bayesian modeling on the space of
recursive partitions over the grid points. We derive the corresponding
inference recipe in the form of a recursive representation of the exact
posterior, and develop a class of efficient recursive message passing
algorithms for achieving exact Bayesian inference with a computational
complexity linear in the resolution and sample size of the images. While our
framework is applicable to a range of problems including multi-dimensional
signal processing, compression, and structural learning, we illustrate its work
and evaluate its performance in the context of 2D and 3D image reconstruction
using real images from the ImageNet database. We also apply the framework to
analyze a data set from retinal optical coherence tomography.
Learning All Credible Bayesian Network Structures for Model Averaging
A Bayesian network is a widely used probabilistic graphical model with
applications in knowledge discovery and prediction. Learning a Bayesian network
(BN) from data can be cast as an optimization problem using the well-known
score-and-search approach. However, selecting a single model (i.e., the best
scoring BN) can be misleading or may not achieve the best possible accuracy. An
alternative to committing to a single model is to perform some form of Bayesian
or frequentist model averaging, where the space of possible BNs is sampled or
enumerated in some fashion. Unfortunately, existing approaches for model
averaging either severely restrict the structure of the Bayesian network or
have only been shown to scale to networks with fewer than 30 random variables.
In this paper, we propose a novel approach to model averaging inspired by
performance guarantees in approximation algorithms. Our approach has two
primary advantages. First, our approach only considers credible models in that
they are optimal or near-optimal in score. Second, our approach is more
efficient and scales to significantly larger Bayesian networks than existing
approaches.
Comment: under review by JMLR. arXiv admin note: substantial text overlap with
arXiv:1811.0503
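The idea of averaging only over credible (optimal or near-optimal) structures can be illustrated with a minimal sketch. It is not the authors' algorithm. It assumes each candidate network already has a log score that behaves like a log marginal likelihood, that "near-optimal" means within a chosen multiplicative factor of the best score, and that the quantity being averaged is the presence of each edge. The names `credible_models`, `edge_posteriors`, `factor`, and the toy graphs are all hypothetical.

```python
import math


def credible_models(log_scores, factor):
    """Keep models whose score is within `factor` of the best score.

    Scores are on the log scale, so "within a factor" becomes
    "within log(factor) of the maximum". (Assumed definition.)
    """
    best = max(log_scores.values())
    cutoff = best - math.log(factor)
    return {name: s for name, s in log_scores.items() if s >= cutoff}


def edge_posteriors(models, log_scores):
    """Score-weighted frequency of each edge over the given models."""
    best = max(log_scores.values())
    # Subtract the maximum before exponentiating for numerical stability.
    weights = {name: math.exp(s - best) for name, s in log_scores.items()}
    total = sum(weights.values())
    posteriors = {}
    for name, weight in weights.items():
        for edge in models[name]:
            posteriors[edge] = posteriors.get(edge, 0.0) + weight / total
    return posteriors


# Hypothetical toy data: three candidate structures over variables A, B, C,
# each given as a set of directed edges, with made-up log scores.
models = {
    "g1": {("A", "B"), ("B", "C")},
    "g2": {("A", "B"), ("A", "C")},
    "g3": {("C", "A")},
}
log_scores = {"g1": -100.0, "g2": -100.5, "g3": -120.0}

kept = credible_models(log_scores, factor=20.0)
posteriors = edge_posteriors(models, kept)

for edge, p in sorted(posteriors.items()):
    print(edge, round(p, 3))
```

In this toy run the low-scoring structure `g3` falls outside the credible set, so its edge contributes nothing to the average, while an edge shared by every retained structure receives weight close to one.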
Learning Taxonomy Adaptation in Large-scale Classification
International audienc
- …