SWAG: A Wrapper Method for Sparse Learning
The majority of machine learning methods and algorithms give high priority to
prediction performance, which may not always correspond to the priority of
their users. In many cases, practitioners and researchers in fields ranging
from engineering to genetics require interpretability and replicability of
their results, especially in settings where, for example, not all attributes
may be available to them. As a consequence, there is a need to make the
outputs of machine learning algorithms more interpretable and to deliver a
library of "equivalent" learners (in terms of prediction performance) from
which users can select, based on attribute availability, in order to test
and/or make use of these learners for predictive/diagnostic purposes. To
address these needs, we
propose to study a procedure that combines screening and wrapper approaches
which, based on a user-specified learning method, greedily explores the
attribute space to find a library of sparse learners, with consequently low
data collection and storage costs. This new method (i) delivers a
low-dimensional
network of attributes that can be easily interpreted and (ii) increases the
potential replicability of results based on the diversity of attribute
combinations defining strong learners with equivalent predictive power. We call
this algorithm the "Sparse Wrapper AlGorithm" (SWAG).
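To make the screen-and-wrap idea concrete, here is a minimal sketch of a
SWAG-style search. It is an illustration under assumptions, not the authors'
reference implementation: the function names (swag_search, cv_score), the
scikit-learn logistic-regression learner, the 5-fold cross-validation
criterion, and all default parameters (max_size, q_keep, n_models) are my own
choices.

```python
# A minimal sketch of a SWAG-style screen-and-wrap search (assumed design,
# not the authors' code): score attribute sets with a user-specified learner,
# keep the best fraction at each size, and greedily grow them by one attribute.
import random

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def cv_score(X, y, cols):
    """Cross-validated error of the learner restricted to one attribute set."""
    model = LogisticRegression(max_iter=1000)
    acc = cross_val_score(model, X[:, list(cols)], y, cv=5).mean()
    return 1.0 - acc  # error: lower is better


def swag_search(X, y, max_size=4, q_keep=0.2, n_models=100, seed=0):
    """Greedily grow a library of sparse, near-equivalent attribute sets."""
    rng = random.Random(seed)
    p = X.shape[1]
    # Screening step: score every attribute on its own.
    scored = sorted((cv_score(X, y, (j,)), (j,)) for j in range(p))
    library = {}
    for size in range(1, max_size + 1):
        if size > 1:
            # Wrapper step: extend each surviving set by one new attribute,
            # sampling at most n_models candidate sets.
            candidates = {tuple(sorted(set(s) | {j}))
                          for _, s in scored for j in range(p) if j not in s}
            candidates = rng.sample(sorted(candidates),
                                    min(n_models, len(candidates)))
            scored = sorted((cv_score(X, y, s), s) for s in candidates)
        # Keep only the best q_keep fraction as "strong" learners of this size.
        cutoff = scored[max(0, int(q_keep * len(scored)) - 1)][0]
        scored = [(err, s) for err, s in scored if err <= cutoff]
        library[size] = scored
    return library


if __name__ == "__main__":
    X, y = make_classification(n_samples=300, n_features=30,
                               n_informative=5, random_state=0)
    for size, sets in swag_search(X, y).items():
        best_err, best_set = sets[0]
        print(f"size {size}: {len(sets)} sets kept, best error "
              f"{best_err:.3f}, attributes {best_set}")
```

The returned library maps each attribute-set size to the surviving sets with
near-equivalent cross-validated performance, so a user can pick whichever set
matches the attributes available to them.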
Estimation and Inference with Trees and Forests in High Dimensions
We analyze the finite sample mean squared error (MSE) performance of
regression trees and forests in the high dimensional regime with binary
features, under a sparsity constraint. We prove that if only $r$ of the $d$
features are relevant for the mean outcome function, then shallow trees built
greedily via the CART empirical MSE criterion achieve MSE rates that depend
only logarithmically on the ambient dimension $d$. We prove upper bounds whose
exact dependence on the number of relevant variables $r$ depends on the
correlation among the features and on the degree of relevance. For strongly
relevant features, we also show that fully grown honest forests achieve fast
MSE rates and their predictions are also asymptotically normal, enabling
asymptotically valid inference that adapts to the sparsity of the regression
function.
Comment: Accepted for presentation at the Conference on Learning Theory (COLT) 2020.
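As a rough empirical companion to this setting, the toy sketch below fits a
shallow CART regression tree to binary features where only $r$ of the $d$
features affect the mean outcome, and checks how test error behaves as the
ambient dimension $d$ grows with $r$ fixed. The data-generating process,
parameter choices, and the use of scikit-learn's DecisionTreeRegressor are my
own assumptions, not part of the paper.

```python
# Toy illustration (assumed setup, not the paper's experiments): binary
# features, only the first r of d features are relevant, and a shallow tree
# grown greedily via the CART empirical MSE criterion.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n, r = 5000, 3  # sample size; number of relevant features

for d in (10, 100, 1000):  # ambient dimension grows, sparsity r stays fixed
    X = rng.integers(0, 2, size=(n, d)).astype(float)
    mu = X[:, :r].sum(axis=1)  # mean outcome depends only on r features
    y = mu + rng.normal(scale=0.5, size=n)
    tree = DecisionTreeRegressor(max_depth=r).fit(X, y)
    # Evaluate against the true mean function on fresh draws.
    X_te = rng.integers(0, 2, size=(2000, d)).astype(float)
    mu_te = X_te[:, :r].sum(axis=1)
    mse = np.mean((tree.predict(X_te) - mu_te) ** 2)
    print(f"d={d:5d}: test MSE against the true mean = {mse:.3f}")
```

In this simulation the error stays small as $d$ grows from 10 to 1000,
consistent with rates that depend on the ambient dimension only
logarithmically rather than polynomially.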