
    BoostingTree: parallel selection of weak learners in boosting, with application to ranking

    Boosting algorithms have been found successful in many areas of machine learning and, in particular, in ranking. For typical classes of weak learners used in boosting (such as decision stumps or trees), a large feature space can slow down training, while a long sequence of weak hypotheses combined by boosting can result in a computationally expensive model. In this paper we propose a strategy that builds several sequences of weak hypotheses in parallel and extends the ones that are likely to yield a good model. The weak-hypothesis sequences are arranged in a boosting tree, and new weak hypotheses are added to promising nodes (both leaves and inner nodes) of the tree by a randomized method. Theoretical results show that the proposed algorithm asymptotically achieves the performance of the underlying boosting algorithm. Experiments on ranking web documents and on move ordering in chess indicate that the new strategy yields better performance when the length of the sequence is limited, and otherwise converges to the same performance as the original boosting algorithm.
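
    The expansion strategy described above lends itself to a short sketch. The Python code below is a minimal illustration (not the authors' implementation) of a boosting tree that grows weak-hypothesis sequences in parallel and randomly favors promising nodes; `fit_weak_learner`, `evaluate_loss`, and the softmax-style selection rule are hypothetical stand-ins.

```python
import math
import random

class Node:
    """A node in the boosting tree: a prefix sequence of weak hypotheses."""
    def __init__(self, hypotheses, loss, parent=None):
        self.hypotheses = hypotheses  # weak hypotheses on the root-to-node path
        self.loss = loss              # training loss of this ensemble prefix
        self.children = []
        self.parent = parent

def select_node(nodes, temperature=1.0):
    # Randomized selection favoring low-loss (promising) nodes; both
    # leaves and inner nodes are eligible, as in the paper's description.
    weights = [math.exp(-n.loss / temperature) for n in nodes]
    return random.choices(nodes, weights=weights, k=1)[0]

def boosting_tree(fit_weak_learner, evaluate_loss, rounds, max_length):
    """fit_weak_learner(prefix) -> new weak hypothesis (hypothetical API);
    evaluate_loss(hypotheses) -> training loss of the combined model."""
    root = Node([], evaluate_loss([]))
    all_nodes = [root]
    for _ in range(rounds):
        node = select_node(all_nodes)
        if len(node.hypotheses) >= max_length:
            continue                                 # sequence budget reached
        h = fit_weak_learner(node.hypotheses)        # one boosting step
        seq = node.hypotheses + [h]
        child = Node(seq, evaluate_loss(seq), parent=node)
        node.children.append(child)
        all_nodes.append(child)
    return min(all_nodes, key=lambda n: n.loss)      # best sequence found
```

    Lowering the `temperature` concentrates expansion on the current best sequences, while raising it spreads the budget across more branches of the tree.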

    Generation, Ranking and Unranking of Ordered Trees with Degree Bounds

    We study the problem of generating, ranking and unranking unlabeled ordered trees whose nodes have maximum degree Δ. This class of trees generalizes chemical trees: a chemical tree is an unlabeled tree in which no node has degree greater than 4, and allowing up to Δ children per node instead of 4 yields the generalization. We introduce a new encoding over an alphabet of size 4 for representing unlabeled ordered trees with maximum degree Δ, and use it to generate these trees in A-order with constant average time and O(n) worst-case time. Based on this encoding, and with a precomputation taking O(n^2) time and space (assuming Δ is constant), we also design ranking and unranking algorithms with O(n) and O(n log n) time complexity, respectively. (In Proceedings DCM 2015, arXiv:1603.0053)
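
    To make the counting-based rank/unrank machinery concrete, here is a minimal Python sketch for degree-bounded ordered trees. It uses a standard decomposition (a tree is a root plus an ordered forest of at most Δ subtrees) and plain lexicographic order, not the paper's alphabet-of-size-4 encoding or A-order; `DELTA` and the nested-list tree representation are assumptions for illustration.

```python
from functools import lru_cache

DELTA = 3  # hypothetical degree bound; the paper treats Δ as a constant

@lru_cache(maxsize=None)
def forests(n, k):
    """Count ordered forests of n nodes with at most k trees,
    every node having at most DELTA children."""
    if n == 0:
        return 1
    if k == 0:
        return 0
    # the first tree takes m nodes; the rest form a forest of at most k-1 trees
    return sum(trees(m) * forests(n - m, k - 1) for m in range(1, n + 1))

def trees(n):
    """Count degree-bounded ordered trees of n nodes (root + forest)."""
    return forests(n - 1, DELTA)

def size(tree):
    # a tree is represented as the list of its children subtrees
    return 1 + sum(size(c) for c in tree)

def unrank_tree(n, r):
    """Build the rank-r tree (0-based, lexicographic order) of n nodes."""
    return unrank_forest(n - 1, DELTA, r)

def unrank_forest(n, k, r):
    forest = []
    while n > 0:
        for m in range(1, n + 1):                 # size of the first tree
            block = trees(m) * forests(n - m, k - 1)
            if r < block:
                break
            r -= block
        rest = forests(n - m, k - 1)
        forest.append(unrank_tree(m, r // rest))  # unrank the first tree
        n, k, r = n - m, k - 1, r % rest          # continue with the rest
    return forest

def rank_tree(tree):
    return rank_forest(tree, size(tree) - 1, DELTA)

def rank_forest(forest, n, k):
    r = 0
    for t in forest:
        m = size(t)
        r += sum(trees(mm) * forests(n - mm, k - 1) for mm in range(1, m))
        r += rank_tree(t) * forests(n - m, k - 1)
        n, k = n - m, k - 1
    return r

# round trip over all degree-bounded trees of 5 nodes
assert all(rank_tree(unrank_tree(5, r)) == r for r in range(trees(5)))
```

    The memoized `forests` table plays the role of the paper's O(n^2) precomputation: once the counts are cached, each rank or unrank step only walks the tree and looks up counts.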

    Runtime Optimizations for Prediction with Tree-Based Models

    Tree-based models have proven to be an effective solution for web ranking as well as for problems in many other domains. This paper focuses on optimizing the runtime performance of applying such models to make predictions, given an already-trained model. Although conceptually simple, most implementations of tree-based models do not efficiently utilize modern superscalar processor architectures. By laying out data structures in memory in a more cache-conscious fashion, removing branches from the execution flow using a technique called predication, and micro-batching predictions using a technique called vectorization, we are able to better exploit modern processor architectures and significantly improve prediction speed over hard-coded if-else blocks. Our work contributes to the exploration of architecture-conscious runtime implementations of machine learning algorithms.
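
    As a hedged illustration of the two named techniques, the NumPy sketch below predicts with a single complete decision tree of fixed depth: the tree is stored in flat arrays (a cache-conscious layout), the descent replaces branches with index arithmetic (predication), and a whole micro-batch of examples advances one level per step (vectorization). The array layout, depth, and random parameters are assumptions, not the paper's implementation.

```python
import numpy as np

# Flat layout for a complete binary tree of depth D: node i's children are
# 2i+1 and 2i+2; internal nodes store a feature index and a threshold,
# and the 2**D leaves store output values (hypothetical layout).
D = 4
n_internal = 2**D - 1
rng = np.random.default_rng(0)
feature = rng.integers(0, 8, size=n_internal)        # feature tested at each node
threshold = rng.random(n_internal).astype(np.float32)
leaf_value = rng.random(2**D).astype(np.float32)

def predict_batch(X):
    """Predication + vectorization: descend the tree for a whole
    micro-batch of examples with no data-dependent branches."""
    idx = np.zeros(len(X), dtype=np.int64)           # every example starts at the root
    for _ in range(D):                               # fixed depth -> fixed trip count
        go_right = X[np.arange(len(X)), feature[idx]] > threshold[idx]
        idx = 2 * idx + 1 + go_right                 # branch replaced by arithmetic
    return leaf_value[idx - n_internal]              # map node id to leaf slot

X = rng.random((32, 8)).astype(np.float32)           # a micro-batch of 32 examples
print(predict_batch(X))
```

    Because every example performs exactly D iterations and the comparison result feeds straight into the child-index computation, there is nothing for a branch predictor to mispredict, which is the point of predication; processing 32 examples per array operation is the micro-batching the abstract calls vectorization.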