Stochastic Linear Bandits with Hidden Low Rank Structure
High-dimensional representations often have a lower dimensional underlying
structure. This is particularly the case in many decision making settings. For
example, when the representation of actions is generated from a deep neural
network, it is reasonable to expect a low-rank structure whereas conventional
structures like sparsity are not valid anymore. Subspace recovery methods, such
as Principal Component Analysis (PCA), can find the underlying low-rank
structures in the feature space and reduce the complexity of the learning
tasks. In this work, we propose Projected Stochastic Linear Bandit (PSLB), an
algorithm for high dimensional stochastic linear bandits (SLB) when the
representation of actions has an underlying low-dimensional subspace structure.
PSLB deploys PCA based projection to iteratively find the low rank structure in
SLBs. We show that deploying projection methods assures dimensionality
reduction and results in a tighter regret upper bound that is in terms of the
dimensionality of the subspace and its properties, rather than the
dimensionality of the ambient space. We modify the image classification task
into the SLB setting and empirically show that, when a pre-trained DNN provides
the high dimensional feature representations, deploying PSLB results in
significant reduction of regret and faster convergence to an accurate model
compared to state-of-the-art algorithms.
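The projection idea in this abstract can be illustrated with a small sketch. The abstract does not specify PSLB's actual procedure, so the code below is only an assumption-laden illustration of PCA-based subspace recovery: it estimates a rank-1 subspace from noisy action features by power iteration and maps a feature vector to its low-dimensional coordinate. The function names, the rank-1 choice, the synthetic data, and the noise level are all hypothetical.

```python
import math
import random


def top_principal_direction(rows, iters=200):
    """Estimate the leading principal direction of `rows` by power iteration."""
    n, d = len(rows), len(rows[0])
    # Center the data so that variance is measured about the mean.
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    v = [1.0 / math.sqrt(d)] * d  # deterministic unit-norm start vector
    for _ in range(iters):
        # Apply the (unnormalized) covariance: w = X^T (X v).
        xv = [sum(row[j] * v[j] for j in range(d)) for row in centered]
        w = [sum(centered[i][j] * xv[i] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v


def project(x, mean, v):
    """Return the 1-D coordinate of x along v after centering."""
    return sum((x[j] - mean[j]) * v[j] for j in range(len(v)))


# Synthetic action features (an assumption for illustration): 5-dimensional
# vectors that lie close to a 1-dimensional subspace spanned by `u`.
rng = random.Random(0)
u = [0.6, 0.8, 0.0, 0.0, 0.0]
rows = []
for _ in range(50):
    c = rng.uniform(-1.0, 1.0)
    rows.append([c * u[j] + rng.gauss(0.0, 0.01) for j in range(5)])

mean, v = top_principal_direction(rows)
alignment = abs(sum(a * b for a, b in zip(v, u)))  # |cos| of angle between v and u
low_dim = [project(r, mean, v) for r in rows]      # 1-D representation of each action
```

A bandit learner could then fit its reward model on `low_dim` rather than on the 5-dimensional features, which is the kind of dimensionality reduction the abstract refers to; how PSLB combines this with exploration is not stated here.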
Linear Bandits with Feature Feedback
This paper explores a new form of the linear bandit problem in which the
algorithm receives the usual stochastic rewards as well as stochastic feedback
about which features are relevant to the rewards, the latter feedback being the
novel aspect. The focus of this paper is the development of new theory and
algorithms for linear bandits with feature feedback. We show that linear
bandits with feature feedback can achieve regret over time horizon $T$ that
scales like $k\sqrt{T}$, without prior knowledge of which features are relevant
nor the number $k$ of relevant features. In comparison, the regret of
traditional linear bandits is $d\sqrt{T}$, where $d$ is the total number of
(relevant and irrelevant) features, so the improvement can be dramatic if
$k \ll d$. The computational complexity of the new algorithm is proportional to
$k$ rather than $d$, making it much more suitable for real-world applications
compared to traditional linear bandits. We demonstrate the performance of the
new algorithm with synthetic and real human-labeled data.