
    SWAG: A Wrapper Method for Sparse Learning

    The majority of machine learning methods and algorithms give high priority to prediction performance, which may not always correspond to the priority of the users. In many cases, practitioners and researchers in different fields, from engineering to genetics, require interpretability and replicability of the results, especially in settings where, for example, not all attributes may be available to them. As a consequence, there is a need to make the outputs of machine learning algorithms more interpretable and to deliver a library of "equivalent" learners (in terms of prediction performance) that users can select based on attribute availability, in order to test and/or make use of these learners for predictive/diagnostic purposes. To address these needs, we propose to study a procedure that combines screening and wrapper approaches and, based on a user-specified learning method, greedily explores the attribute space to find a library of sparse learners with consequently low data collection and storage costs. This new method (i) delivers a low-dimensional network of attributes that can be easily interpreted and (ii) increases the potential replicability of results based on the diversity of attribute combinations defining strong learners with equivalent predictive power. We call this algorithm the "Sparse Wrapper AlGorithm" (SWAG).
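
    The abstract describes the procedure only at a high level. The sketch below illustrates a screening-plus-wrapper greedy search in that spirit; the base learner (scikit-learn's LogisticRegression), the rule of retaining a fixed fraction of the lowest cross-validated-error models at each dimension, and all parameter names are illustrative assumptions, not the authors' exact algorithm.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def swag_like_search(X, y, learner=None, max_dim=4, keep_frac=0.2, cv=5):
    """Greedy screening + wrapper search over attribute subsets (illustrative sketch).

    Returns {dimension: [(attribute_tuple, cv_error), ...]} with the retained
    low-error ("strong") learners of each size.
    """
    if learner is None:
        learner = LogisticRegression(max_iter=1000)  # stand-in for any user-specified learner
    n_features = X.shape[1]

    def cv_error(cols):
        # Cross-validated error of the learner restricted to the chosen attributes.
        acc = cross_val_score(learner, X[:, list(cols)], y, cv=cv).mean()
        return 1.0 - acc

    def keep_best(scored):
        # Retain a fixed fraction of the lowest-error candidate models (at least one).
        scored = sorted(scored, key=lambda t: t[1])
        return scored[: max(1, int(np.ceil(keep_frac * len(scored))))]

    # Screening step: score every attribute on its own.
    library = {1: keep_best([((j,), cv_error((j,))) for j in range(n_features)])}

    # Wrapper steps: grow the retained learners one attribute at a time.
    for dim in range(2, max_dim + 1):
        pool = {j for cols, _ in library[dim - 1] for j in cols}
        candidates = {tuple(sorted(cols + (j,)))
                      for cols, _ in library[dim - 1]
                      for j in pool - set(cols)}
        if not candidates:
            break
        library[dim] = keep_best([(cols, cv_error(cols)) for cols in candidates])

    return library
```

    Under these assumptions, library[3] would contain the retained three-attribute learners together with their cross-validated errors, from which a user could pick a model based on which attributes are actually available to them.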

    Estimation and Inference with Trees and Forests in High Dimensions

    We analyze the finite sample mean squared error (MSE) performance of regression trees and forests in the high-dimensional regime with binary features, under a sparsity constraint. We prove that if only r of the d features are relevant for the mean outcome function, then shallow trees built greedily via the CART empirical MSE criterion achieve MSE rates that depend only logarithmically on the ambient dimension d. We prove upper bounds whose exact dependence on the number of relevant variables r is determined by the correlation among the features and by the degree of relevance. For strongly relevant features, we also show that fully grown honest forests achieve fast MSE rates and that their predictions are asymptotically normal, enabling asymptotically valid inference that adapts to the sparsity of the regression function.
    Comment: Accepted for presentation at the Conference on Learning Theory (COLT) 2020
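
    A small simulation can make the setting of this abstract concrete: binary features, of which only r out of d are relevant, fit with a shallow greedily grown tree. This is an illustrative sketch, not the paper's experiment; the data-generating process, noise level, sample sizes, and depth choice are all assumptions, and scikit-learn's DecisionTreeRegressor (whose default splitting criterion is CART's squared-error impurity) stands in for a greedily grown CART tree.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n, d, r = 2000, 500, 3                                # sample size, ambient dimension, relevant features (illustrative)
X = rng.integers(0, 2, size=(n, d)).astype(float)     # binary features
# The mean outcome depends on the first r features only; the noise level is an assumption.
y = X[:, :r].sum(axis=1) + rng.normal(scale=0.5, size=n)

X_train, X_test = X[:1000], X[1000:]
y_train, y_test = y[:1000], y[1000:]

# A shallow tree grown greedily with the squared-error (CART) criterion;
# depth r is enough to represent this additive function of r binary features exactly.
tree = DecisionTreeRegressor(max_depth=r).fit(X_train, y_train)
print("test MSE:", mean_squared_error(y_test, tree.predict(X_test)))
```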