Differential Good Arm Identification
This paper targets a variant of the stochastic multi-armed bandit problem
called good arm identification (GAI). GAI is a pure-exploration bandit problem
whose goal is to output as many good arms as possible using as few samples as possible,
where a good arm is defined as an arm whose expected reward is greater than a
given threshold. In this work, we propose DGAI, a differentiable good arm
identification algorithm that improves the sample complexity of the
state-of-the-art HDoC algorithm in a data-driven fashion. We also show that
DGAI can further improve performance on the general multi-armed bandit (MAB)
problem when a threshold is given as prior knowledge about the arm set. Extensive
experiments confirm that our algorithm significantly outperforms the baseline
algorithms on both synthetic and real-world datasets for both GAI and MAB
tasks.
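To make the good-arm definition concrete, here is a minimal Python sketch of a naive good arm identification loop. It is not HDoC or DGAI: it pulls the undecided arms round-robin and uses a plain Hoeffding confidence radius with a union bound. The function name `identify_good_arms`, the Bernoulli reward model, and the parameters `delta` and `max_rounds` are assumptions made for illustration and do not come from the paper.

```python
import math
import random


def identify_good_arms(means, threshold, delta=0.05, max_rounds=20000, seed=0):
    """Naive good-arm identification on simulated Bernoulli arms.

    Illustrative baseline only (not HDoC or DGAI): every undecided arm is
    pulled once per round, and an arm is classified as soon as its
    confidence interval lies entirely on one side of the threshold.
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k      # number of pulls per arm
    sums = [0.0] * k      # cumulative reward per arm
    good, bad = [], []
    active = list(range(k))
    total_samples = 0

    for _ in range(max_rounds):
        if not active:
            break
        for arm in list(active):  # iterate over a copy; we remove below
            reward = 1.0 if rng.random() < means[arm] else 0.0
            counts[arm] += 1
            sums[arm] += reward
            total_samples += 1

            n = counts[arm]
            mean_hat = sums[arm] / n
            # Hoeffding radius; the 4*k*n^2 term is a union bound over
            # arms and pull counts so all intervals hold with prob >= 1 - delta.
            radius = math.sqrt(math.log(4 * k * n * n / delta) / (2 * n))

            if mean_hat - radius >= threshold:
                good.append(arm)
                active.remove(arm)
            elif mean_hat + radius < threshold:
                bad.append(arm)
                active.remove(arm)

    return good, bad, total_samples


if __name__ == "__main__":
    good, bad, used = identify_good_arms([0.9, 0.8, 0.2, 0.1], threshold=0.5)
    print("good arms:", sorted(good), "bad arms:", sorted(bad), "samples:", used)
```

The total number of pulls returned by this sketch is the quantity that sample-complexity results for GAI algorithms aim to reduce; arms whose means sit close to the threshold dominate that cost.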
LinearAPT: An Adaptive Algorithm for the Fixed-Budget Thresholding Linear Bandit Problem
In this study, we delve into the Thresholding Linear Bandit (TLB) problem, a
nuanced domain within stochastic Multi-Armed Bandit (MAB) problems, focusing on
maximizing decision accuracy against a linearly defined threshold under
resource constraints. We present LinearAPT, a novel algorithm designed for the
fixed budget setting of TLB, providing an efficient solution to optimize
sequential decision-making. This algorithm not only offers a theoretical upper
bound for estimated loss but also showcases robust performance on both
synthetic and real-world datasets. Our contributions highlight the
adaptability, simplicity, and computational efficiency of LinearAPT, making it
a valuable addition to the toolkit for addressing complex sequential
decision-making challenges.
lil'HDoC: An Algorithm for Good Arm Identification under Small Threshold Gap
Good arm identification (GAI) is a pure-exploration bandit problem in which a
single learner outputs an arm as soon as it is identified as a good arm. A good
arm is defined as an arm with an expected reward greater than or equal to a
given threshold. This paper focuses on the GAI problem under a small threshold
gap, which refers to the distance between the expected rewards of arms and the
given threshold. We propose a new algorithm called lil'HDoC to significantly
improve the total sample complexity of the HDoC algorithm. We demonstrate that
the sample complexity of the first output arm in lil'HDoC is bounded
by that of the original HDoC algorithm, except for one negligible term, when the
distance between the expected reward and threshold is small. Extensive
experiments confirm that our algorithm outperforms the state-of-the-art
algorithms on both synthetic and real-world datasets.
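One way to write the good-arm condition and the threshold gap described above in symbols. The notation is an assumption and does not appear in the abstract: \mu_i is the expected reward of arm i and \xi is the given threshold.

```latex
\text{arm } i \text{ is good} \iff \mu_i \ge \xi,
\qquad
\Delta_i = \lvert \mu_i - \xi \rvert .
```

The small-gap regime the paper focuses on corresponds to arms with small \Delta_i.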
Exposing the Functionalities of Neurons for Gated Recurrent Unit Based Sequence-to-Sequence Model
The goal of this paper is to report certain scientific discoveries about a
Seq2Seq model. Analyzing the behavior of RNN-based models at the neuron level
is known to be more challenging than analyzing DNN or CNN models because of
their recursive mechanism. This paper aims to
provide neuron-level analysis to explain why a vanilla GRU-based Seq2Seq model
without attention can achieve token-positioning. We found four different types
of neurons (storing, counting, triggering, and outputting) and further uncovered
the mechanism by which these neurons work together to produce the right
token in the right position.
Comment: 9 pages (excluding references), 10 figures