PopArt: Efficient Sparse Regression and Experimental Design for Optimal Sparse Linear Bandits
In sparse linear bandits, a learning agent sequentially selects an action and
receives reward feedback, where the reward function depends linearly on a few
coordinates of the covariates of the actions. This has applications in many
real-world sequential decision making problems. In this paper, we propose a
simple and computationally efficient sparse linear estimation method called
PopArt that enjoys a tighter recovery guarantee than Lasso
(Tibshirani, 1996) in many problems. Our bound naturally motivates an
experimental design criterion that is convex and thus computationally efficient
to solve. Based on our novel estimator and design criterion, we derive sparse
linear bandit algorithms whose regret upper bounds improve upon the state
of the art (Hao et al., 2020), especially with respect to the geometry of the given
action set. Finally, we prove a matching lower bound for sparse linear bandits
in the data-poor regime, which closes the gap between upper and lower bounds in
prior work.
Comment: 10 pages, 1 figure, to be published in the 2022 Conference on Neural
Information Processing Systems