Uncertain Trees: Dealing with Uncertain Inputs in Regression Trees
Tree-based ensemble methods, such as Random Forests and Gradient Boosted Trees,
have been successfully used for regression in many applications and research
studies. Furthermore, these methods have been extended in order to deal with
uncertainty in the output variable, using for example a quantile loss in Random
Forests (Meinshausen, 2006). To the best of our knowledge, no extension has
been provided yet for dealing with uncertainties in the input variables, even
though such uncertainties are common in practical situations. We propose here
such an extension by showing how standard regression trees optimizing a
quadratic loss can be adapted and learned while taking into account the
uncertainties in the inputs. By doing so, one no longer assumes that an
observation lies in a single region of the regression tree, but rather that
it belongs to each region with a certain probability. Experiments conducted on
several data sets illustrate the good behavior of the proposed extension.
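As a rough illustration of this soft-membership idea, the sketch below computes a prediction from an already-fitted scikit-learn regression tree when the test input carries Gaussian uncertainty: at each split, probability mass flows to both children, so every leaf contributes with some weight. This covers only the prediction side; the paper also adapts how the tree is learned, which is not shown here. The function soft_predict and the independent-Gaussian noise model are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np
from scipy.stats import norm
from sklearn.tree import DecisionTreeRegressor

def soft_predict(tree, x_mean, x_std):
    """Predict with a fitted DecisionTreeRegressor under input
    uncertainty x ~ N(x_mean, diag(x_std**2)), features independent.
    Instead of routing x to a single leaf, probability mass is split
    at every internal node, so the prediction is a probability-weighted
    average over all leaves."""
    t = tree.tree_
    pred, stack = 0.0, [(0, 1.0)]          # (node_id, reach probability)
    while stack:
        node, p = stack.pop()
        if t.children_left[node] == -1:    # leaf: accumulate its value
            pred += p * t.value[node, 0, 0]
            continue
        f, thr = t.feature[node], t.threshold[node]
        # P(x_f <= threshold) under the Gaussian noise on feature f
        p_left = norm.cdf(thr, loc=x_mean[f], scale=x_std[f])
        stack.append((t.children_left[node], p * p_left))
        stack.append((t.children_right[node], p * (1.0 - p_left)))
    return pred

# toy usage
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 2))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)
reg = DecisionTreeRegressor(max_depth=4).fit(X, y)
print(soft_predict(reg, x_mean=np.array([0.5, 0.0]),
                   x_std=np.array([0.3, 0.3])))
```

Note that with zero input noise the CDF terms become 0/1 indicators and the procedure reduces to ordinary hard tree prediction.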
Alternating model trees
Model tree induction is a popular method for tackling regression problems requiring interpretable models. Model trees are decision trees with multiple linear regression models at the leaf nodes. In this paper, we propose a method for growing alternating model trees, a form of option tree for regression problems. The motivation is that alternating decision trees achieve high accuracy in classification problems because they represent an ensemble classifier as a single tree structure. As in alternating decision trees for classification, our alternating model trees for regression contain splitter and prediction nodes, but we use simple linear regression functions as opposed to constant predictors at the prediction nodes. Moreover, additive regression using forward stagewise modeling is applied to grow the tree rather than a boosting algorithm. The size of the tree is determined using cross-validation. Our empirical results show that alternating model trees achieve significantly lower squared error than standard model trees on several regression datasets.
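The core building block here is forward stagewise additive modeling with simple (single-attribute) linear regressors fitted to the current residual. A minimal sketch of that stagewise loop, without the splitter nodes or the tree structure itself, might look as follows; the function names, the shrinkage parameter, and the exhaustive per-feature search are assumptions for illustration, not the paper's algorithm.

```python
import numpy as np

def fit_stagewise_slr(X, y, n_stages=10, shrinkage=0.5):
    """Forward stagewise additive regression with simple one-feature
    linear regressors as base learners -- the flavour of prediction
    node used in alternating model trees. Each stage fits the current
    residual with the single feature that reduces squared error most."""
    n, d = X.shape
    intercept = y.mean()
    residual = y - intercept
    stages = []
    for _ in range(n_stages):
        best = None
        for j in range(d):
            xj = X[:, j]
            # least-squares slope/offset of the residual on feature j
            b = np.cov(xj, residual, bias=True)[0, 1] / xj.var()
            a = residual.mean() - b * xj.mean()
            sse = ((residual - (a + b * xj)) ** 2).sum()
            if best is None or sse < best[0]:
                best = (sse, j, a, b)
        _, j, a, b = best
        stages.append((j, a, b))
        # shrink each stage's contribution, as in stagewise boosting
        residual = residual - shrinkage * (a + b * X[:, j])
    return intercept, stages

def predict_stagewise(intercept, stages, X, shrinkage=0.5):
    yhat = np.full(len(X), intercept)
    for j, a, b in stages:
        yhat += shrinkage * (a + b * X[:, j])
    return yhat
```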
Particle Gibbs for Bayesian Additive Regression Trees
Additive regression trees are flexible non-parametric models and popular
off-the-shelf tools for real-world non-linear regression. In application
domains, such as bioinformatics, where there is also demand for probabilistic
predictions with measures of uncertainty, the Bayesian additive regression
trees (BART) model, introduced by Chipman et al. (2010), is increasingly
popular. As data sets have grown in size, however, the standard
Metropolis-Hastings algorithms used to perform inference in BART are proving
inadequate. In particular, these Markov chains make local changes to the trees
and suffer from slow mixing when the data are high-dimensional or the best
fitting trees are more than a few layers deep. We present a novel sampler for
BART based on the Particle Gibbs (PG) algorithm (Andrieu et al., 2010) and a
top-down particle filtering algorithm for Bayesian decision trees
(Lakshminarayanan et al., 2013). Rather than making local changes to individual
trees, the PG sampler proposes a complete tree to fit the residual. Experiments
show that the PG sampler outperforms existing samplers in many settings.
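To make the residual-fitting structure concrete, here is a sketch of the Bayesian-backfitting loop that BART samplers share: each sweep removes one tree's contribution, forms the partial residual, and replaces that tree with a freshly proposed fit. That replacement step is exactly where the PG sampler plugs in its conditional-SMC draw of a complete tree. The stand-in sample_stump below is a deliberately crude placeholder (a random stump with noisy leaf means), not the paper's particle filter.

```python
import numpy as np

def bart_style_backfitting(X, y, n_trees=20, n_iters=50, sigma=0.1, rng=None):
    """Skeleton of the backfitting loop used by BART samplers. For each
    tree m, subtract the fit of all other trees to get a partial
    residual, then replace tree m with a whole new fit to that residual
    (the 'propose a complete tree' step of the PG sampler)."""
    rng = rng or np.random.default_rng(0)
    n, d = X.shape
    fits = np.zeros((n_trees, n))              # per-tree contributions

    def sample_stump(resid):
        # placeholder proposal: a random split with noisy leaf means,
        # standing in for the paper's conditional SMC tree draw
        j = rng.integers(d)
        thr = rng.choice(X[:, j])
        left = X[:, j] <= thr
        out = np.empty(n)
        for mask in (left, ~left):
            mu = resid[mask].mean() if mask.any() else 0.0
            out[mask] = mu + sigma * rng.normal()
        return out

    for _ in range(n_iters):
        for m in range(n_trees):
            resid = y - fits.sum(axis=0) + fits[m]   # partial residual
            fits[m] = sample_stump(resid)            # whole-tree update
    return fits.sum(axis=0)
```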
