Dimension Reduction in Contextual Online Learning via Nonparametric Variable Selection
We consider a contextual online learning (multi-armed bandit) problem with
high-dimensional covariate $\mathbf{x}$ and decision $\mathbf{y}$. The reward
function to learn, $f(\mathbf{x}, \mathbf{y})$, does not have a particular
parametric form. The literature has shown that the optimal regret is
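$\tilde{O}(T^{(d_x+d_y+1)/(d_x+d_y+2)})$, where $d_x$ and $d_y$ are the
dimensions of the covariate $\mathbf{x}$ and the decision $\mathbf{y}$, and thus it suffers from the curse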
of dimensionality. In many applications, only a small subset of variables in
the covariate affect the value of the reward function $f$, which is referred to as
\textit{sparsity} in statistics. To take advantage of the sparsity structure of
the covariate, we propose a variable selection algorithm called
\textit{BV-LASSO}, which incorporates novel ideas such as binning and voting to
apply LASSO to nonparametric settings. Our algorithm achieves the regret
$\tilde{O}(T^{(d_x^*+d_y+1)/(d_x^*+d_y+2)})$, where $d_x^*$ is the effective
covariate dimension. The regret matches the optimal regret when the covariate
is $d_x^*$-dimensional and thus cannot be improved. Our algorithm may serve as
a general recipe to achieve dimension reduction via variable selection in
nonparametric settings.
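
To make the dimension dependence concrete, the short sketch below evaluates the exponent of a regret rate of the form $T^{(d+1)/(d+2)}$ with $d = d_x + d_y$, which is the form assumed here for the rates discussed above (logarithmic factors ignored). The function name `regret_exponent` and the dimensions used (20 covariates, 2 relevant covariates, 1 decision variable) are hypothetical and chosen only for illustration; they do not come from the paper.

```python
from fractions import Fraction


def regret_exponent(d_x: int, d_y: int) -> Fraction:
    """Exponent a of a regret rate T**a, under the assumed form
    a = (d + 1) / (d + 2) with d = d_x + d_y."""
    d = d_x + d_y
    return Fraction(d + 1, d + 2)


# Hypothetical dimensions, for illustration only.
full = regret_exponent(d_x=20, d_y=1)    # all 20 covariates treated as relevant
sparse = regret_exponent(d_x=2, d_y=1)   # only 2 covariates actually matter

print(f"full: T^{float(full):.3f}, sparse: T^{float(sparse):.3f}")
# prints "full: T^0.957, sparse: T^0.800"
```

Under these assumed numbers the exponent drops from 22/23 to 4/5, which is the kind of gain that restricting attention to the effective covariate dimension is meant to deliver.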