Fitting Prediction Rule Ensembles with R Package pre
Prediction rule ensembles (PREs) are sparse collections of rules, offering
highly interpretable regression and classification models. This paper presents
the R package pre, which derives PREs through the methodology of Friedman and
Popescu (2008). The implementation and functionality of package pre are
described and illustrated through application to a dataset on the prediction of
depression. Furthermore, the accuracy and sparsity of PREs are compared with
those of single trees, random forests and lasso regression on four benchmark datasets.
Results indicate that pre derives ensembles with predictive accuracy comparable
to that of random forests, while using a smaller number of variables for
prediction.
Node harvest
When choosing a suitable technique for regression and classification with
multivariate predictor variables, one is often faced with a tradeoff between
interpretability and high predictive accuracy. To give a classical example,
classification and regression trees are easy to understand and interpret. Tree
ensembles like Random Forests usually provide more accurate predictions. Yet
tree ensembles are also more difficult to analyze than single trees and are
often criticized, perhaps unfairly, as `black box' predictors. Node harvest
tries to reconcile the two aims of interpretability and predictive accuracy by
combining positive aspects of trees and tree ensembles. Results are very sparse
and interpretable and predictive accuracy is extremely competitive, especially
for low signal-to-noise data. The procedure is simple: an initial set of a few
thousand nodes is generated randomly. If a new observation falls into just a
single node, its prediction is the mean response of all training observations
within this node, identical to a tree-like prediction. A new observation
typically falls into several nodes, and its prediction is then the weighted average of
the mean responses across all these nodes. The only role of node harvest is to
`pick' the right nodes from the initial large ensemble of nodes by choosing
node weights, which in the proposed algorithm amounts to a quadratic
programming problem with linear inequality constraints. The solution is sparse
in the sense that only very few nodes are selected with a nonzero weight. This
sparsity is not explicitly enforced. Perhaps surprisingly, it is not necessary to
select a tuning parameter for optimal predictive accuracy. Node harvest can
handle mixed data and missing values and is shown to be simple to interpret and
competitive in predictive accuracy on a variety of data sets.
Comment: Published at http://dx.doi.org/10.1214/10-AOAS367 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
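
The prediction step described in the node harvest abstract can be sketched in a
few lines of Python. This is a minimal illustration, not the authors'
implementation: the Node class, the box-shaped node bounds, and all numeric
values are assumptions made for the example, and the node weights are simply
given here rather than obtained from the quadratic program. The sketch also
assumes that nodes with zero weight are ignored and that the weights of the
nodes containing an observation are renormalized to sum to one.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Node:
    # Hypothetical representation: a node is a box, one interval per feature.
    # A feature that is not listed is unconstrained.
    bounds: Dict[str, Tuple[float, float]]  # feature -> (low, high), low <= x < high
    mean_response: float  # mean response of the training observations in the node
    weight: float         # nonnegative weight, assumed to come from the QP step

    def contains(self, x: Dict[str, float]) -> bool:
        return all(lo <= x[f] < hi for f, (lo, hi) in self.bounds.items())


def predict(nodes: List[Node], x: Dict[str, float]) -> float:
    """Weighted average of node means over the selected nodes that contain x."""
    active = [n for n in nodes if n.weight > 0 and n.contains(x)]
    total = sum(n.weight for n in active)
    if total == 0:
        raise ValueError("observation falls into no selected node")
    return sum(n.weight * n.mean_response for n in active) / total


# Illustrative values only, not taken from any data set.
nodes = [
    Node({}, mean_response=10.0, weight=0.5),  # root node: contains every observation
    Node({"age": (0.0, 30.0)}, mean_response=4.0, weight=0.5),
    Node({"age": (30.0, 200.0), "bmi": (25.0, 100.0)}, mean_response=20.0, weight=1.0),
    Node({"bmi": (0.0, 18.0)}, mean_response=7.0, weight=0.0),  # weight zero: not selected
]

# Falls only into the root node, so the prediction is that node's mean.
single = predict(nodes, {"age": 40.0, "bmi": 20.0})

# Falls into two selected nodes, so the prediction is their weighted average.
several = predict(nodes, {"age": 25.0, "bmi": 22.0})
```

With these made-up nodes, the first observation reproduces the tree-like case
(one node, prediction equal to its mean response), while the second averages
the root node and the "age below 30" node with equal weights.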