Using all data to generate decision tree ensembles
G. Martinez-Munoz, A. Suarez, "Using all data to generate decision tree ensembles", in IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 34, 4 (2004), p. 393-397.
This paper develops a new method to generate ensembles of classifiers that uses all available data to construct every individual classifier. The base algorithm builds a decision tree iteratively. The training data are divided into two subsets. In each iteration, one subset is used to grow the decision tree, starting from the decision tree produced by the previous iteration. This fully grown tree is then pruned by using the other subset. The roles of the data subsets are interchanged in every iteration. This process converges to a final tree that is stable with respect to the combined growing and pruning steps. To generate a variety of classifiers for the ensemble, we randomly create the subsets needed by the iterative tree construction algorithm. The method exhibits good performance on several standard datasets at low computational cost.
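The alternating grow-and-prune loop described in this abstract can be sketched as follows. This is only an illustration of the control flow: `grow` and `prune` are hypothetical caller-supplied functions (the abstract does not specify them), and the stopping test assumes that trees can be compared for equality.

```python
def iterative_grow_prune(grow, prune, subset_a, subset_b, max_iter=50):
    """Alternate growing on one subset and pruning on the other until the tree is stable.

    `grow(tree, data)` and `prune(tree, data)` are hypothetical callables supplied
    by the caller; `tree` is None on the first iteration.
    """
    tree = None
    grow_set, prune_set = subset_a, subset_b
    for _ in range(max_iter):
        # Grow from the previous tree, then prune with the other subset.
        new_tree = prune(grow(tree, grow_set), prune_set)
        if new_tree == tree:
            break  # stable under a combined grow + prune step
        tree = new_tree
        # Interchange the roles of the two subsets for the next iteration.
        grow_set, prune_set = prune_set, grow_set
    return tree


# Toy stand-ins: a "tree" is a set of items; growing adds items, pruning keeps
# only the items supported by the pruning subset.
toy_grow = lambda tree, data: frozenset(tree or ()) | frozenset(data)
toy_prune = lambda tree, data: tree & frozenset(data)
stable_tree = iterative_grow_prune(toy_grow, toy_prune, [1, 2, 3], [2, 3, 4])
```

To build an ensemble in the spirit of the abstract, one would call this routine repeatedly with different random splits of the training data.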
Generating Compact Tree Ensembles via Annealing
Tree ensembles are flexible predictive models that can capture relevant
variables and to some extent their interactions in a compact and interpretable
manner. Most algorithms for obtaining tree ensembles are based on versions of
boosting or Random Forest. Previous work showed that boosting algorithms
exhibit a cyclic behavior of selecting the same tree again and again due to the
way the loss is optimized. At the same time, Random Forest is not based on loss
optimization and obtains a more complex and less interpretable model. In this
paper we present a novel method for obtaining compact tree ensembles by growing
a large pool of trees in parallel with many independent boosting threads and
then selecting a small subset and updating their leaf weights by loss
optimization. We allow for the trees in the initial pool to have different
depths, which further helps with generalization. Experiments on real datasets
show that the obtained model usually has a smaller loss than boosting, which is
also reflected in a lower misclassification error on the test set.
Comment: Comparison with Random Forest included in the results section
Popular Ensemble Methods: An Empirical Study
An ensemble consists of a set of individually trained classifiers (such as
neural networks or decision trees) whose predictions are combined when
classifying novel instances. Previous research has shown that an ensemble is
often more accurate than any of the single classifiers in the ensemble. Bagging
(Breiman, 1996c) and Boosting (Freund and Schapire, 1996; Schapire, 1990) are two
relatively new but popular methods for producing ensembles. In this paper we
evaluate these methods on 23 data sets using both neural networks and decision
trees as our classification algorithm. Our results clearly indicate a number of
conclusions. First, while Bagging is almost always more accurate than a single
classifier, it is sometimes much less accurate than Boosting. On the other
hand, Boosting can create ensembles that are less accurate than a single
classifier -- especially when using neural networks. Analysis indicates that
the performance of the Boosting methods is dependent on the characteristics of
the data set being examined. In fact, further results show that Boosting
ensembles may overfit noisy data sets, thus decreasing their performance.
Finally, consistent with previous studies, our work suggests that most of the
gain in an ensemble's performance comes in the first few classifiers combined;
however, relatively large gains can be seen up to 25 classifiers when Boosting
decision trees.
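As a rough illustration of how Bagging combines individually trained classifiers, the sketch below trains one-feature threshold classifiers on bootstrap samples and predicts by majority vote. The stump learner, the function names, and the default of 25 members are assumptions made for the example, not the experimental setup of the paper.

```python
import random
from collections import Counter


def train_stump(points):
    """Fit a one-feature threshold classifier; points are (x, label) pairs, labels 0/1."""
    best = None
    for threshold, _ in points:
        for polarity in (0, 1):
            # Predict `polarity` when x > threshold, the other class otherwise.
            errors = sum((polarity if x > threshold else 1 - polarity) != y
                         for x, y in points)
            if best is None or errors < best[0]:
                best = (errors, threshold, polarity)
    _, threshold, polarity = best
    return lambda x: polarity if x > threshold else 1 - polarity


def bagging(points, n_members=25, seed=0):
    """Train each member on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        sample = [rng.choice(points) for _ in points]
        members.append(train_stump(sample))
    return members


def predict(members, x):
    """Combine the members' predictions by majority vote."""
    votes = Counter(member(x) for member in members)
    return votes.most_common(1)[0][0]


data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.7, 1), (0.8, 1), (0.9, 1)]
ensemble = bagging(data)
```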
Non-uniform Feature Sampling for Decision Tree Ensembles
We study the effectiveness of non-uniform randomized feature selection in
decision tree classification. We experimentally evaluate two feature selection
methodologies, based on information extracted from the provided dataset:
\emph{leverage scores-based} and \emph{norm-based} feature selection.
Experimental evaluation of the proposed feature selection techniques indicates
that such approaches might be more effective than naive uniform feature
selection, while achieving performance comparable to the random forest
algorithm [3].
Comment: 7 pages, 7 figures, 1 table
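A minimal sketch of norm-based feature sampling is given below. It assumes that each feature is drawn with probability proportional to the squared Euclidean norm of its column and that draws are made with replacement; the abstract does not state either detail, and the function names are hypothetical.

```python
import random


def norm_based_probabilities(rows):
    """Return one sampling probability per feature, proportional to its squared column norm."""
    n_features = len(rows[0])
    sq_norms = [sum(row[j] ** 2 for row in rows) for j in range(n_features)]
    total = sum(sq_norms)
    return [s / total for s in sq_norms]


def sample_features(rows, k, seed=0):
    """Draw k feature indices (with replacement) from the norm-based distribution."""
    rng = random.Random(seed)
    probs = norm_based_probabilities(rows)
    return rng.choices(range(len(probs)), weights=probs, k=k)


# Feature 0 has a large norm, feature 1 is all zeros, feature 2 has a small norm.
rows = [[3.0, 0.0, 1.0],
        [4.0, 0.0, 1.0]]
probs = norm_based_probabilities(rows)
chosen = sample_features(rows, k=100)
```

Uniform sampling would give every feature probability 1/3 here; the norm-based scheme concentrates the draws on feature 0 and never picks the all-zero feature.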
Runtime Optimizations for Prediction with Tree-Based Models
Tree-based models have proven to be an effective solution for web ranking as
well as other problems in diverse domains. This paper focuses on optimizing the
runtime performance of applying such models to make predictions, given an
already-trained model. Although exceedingly simple conceptually, most
implementations of tree-based models do not efficiently utilize modern
superscalar processor architectures. By laying out data structures in memory in
a more cache-conscious fashion, removing branches from the execution flow using
a technique called predication, and micro-batching predictions using a
technique called vectorization, we are able to better exploit modern processor
architectures and significantly improve the speed of tree-based models over
hard-coded if-else blocks. Our work contributes to the exploration of
architecture-conscious runtime implementations of machine learning algorithms.
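The idea behind predication can be illustrated with a tree stored in flat arrays, where the comparison result is used as an integer offset rather than as the condition of an if-else. The layout below (right child stored immediately after the left child, leaves pointing to themselves with an infinite threshold) is an assumption made for this sketch, and Python only shows the arithmetic; the speedups discussed in the abstract concern compiled code.

```python
INF = float("inf")


def predict_predicated(feature, threshold, left, value, x, depth):
    """Walk a flat-array tree for a fixed number of steps without if/else."""
    i = 0
    for _ in range(depth):
        # The comparison yields False/True (0/1), which selects the left child
        # (left[i]) or the adjacent right child (left[i] + 1).
        i = left[i] + (x[feature[i]] > threshold[i])
    return value[i]


# Node 0 tests x[0] > 0.5; node 2 tests x[1] > 2.0; nodes 1, 3 and 4 are leaves.
feature = [0, 0, 1, 0, 0]
threshold = [0.5, INF, 2.0, INF, INF]
left = [1, 1, 3, 3, 4]          # leaves point to themselves
value = [0.0, 10.0, 0.0, 20.0, 30.0]
```

Because leaves loop back to themselves, every input takes exactly `depth` steps, which is what makes it straightforward to process several inputs in lockstep.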
COMET: A Recipe for Learning and Using Large Ensembles on Massive Data
COMET is a single-pass MapReduce algorithm for learning on large-scale data.
It builds multiple random forest ensembles on distributed blocks of data and
merges them into a mega-ensemble. This approach is appropriate when learning
from massive-scale data that is too large to fit on a single machine. To get
the best accuracy, IVoting should be used instead of bagging to generate the
training subset for each decision tree in the random forest. Experiments with
two large datasets (5GB and 50GB compressed) show that COMET compares favorably
(in both accuracy and training time) to learning on a subsample of data using a
serial algorithm. Finally, we propose a new Gaussian approach for lazy ensemble
evaluation which dynamically decides how many ensemble members to evaluate per
data point; this can reduce evaluation cost by 100X or more.
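Lazy ensemble evaluation can be sketched as below: members are evaluated one at a time, and evaluation stops once the running vote is far enough from a tie under a normal approximation to the vote proportion. The stopping rule, the `z` and `min_members` parameters, and the function name are assumptions for illustration; the abstract does not give the exact Gaussian rule used by COMET.

```python
import math


def lazy_vote(members, x, z=2.58, min_members=5):
    """Evaluate binary (0/1) voters sequentially; stop once the outcome looks decided.

    Returns (predicted_label, number_of_members_evaluated).
    """
    ones = 0
    n = 0
    for n, member in enumerate(members, start=1):
        ones += member(x)
        if n >= min_members:
            p = ones / n
            # Standard error of the vote proportion (normal approximation).
            stderr = math.sqrt(p * (1 - p) / n)
            if abs(p - 0.5) > z * stderr:
                break  # the majority is unlikely to flip
    return (1 if 2 * ones > n else 0), n


unanimous = [lambda x: 1] * 100
split = [(lambda x, v=i % 2: v) for i in range(100)]
```

With unanimous voters the loop stops after `min_members` evaluations, while an evenly split ensemble is evaluated in full.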