Selective Sampling with Drift
Recently there has been much work on selective sampling, an online active
learning setting in which algorithms work in rounds. On each round the
algorithm receives an input and makes a prediction. It can then decide whether
to query a label and, if so, update its model; otherwise the input is
discarded. Most of this work focuses on the stationary case, where a fixed
target model is assumed and the performance of the algorithm is compared to
that of the fixed model. However, in many real-world applications, such as spam
prediction, the best target function may drift over time or shift abruptly from
time to time. We develop a novel selective sampling algorithm for the drifting
setting, analyze it under no assumptions on the mechanism generating the
sequence of instances, and derive new mistake bounds that depend on the amount
of drift in the problem. Simulations on synthetic and real-world datasets
demonstrate the superiority of our algorithm in the drifting setting.
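The round-based protocol described above can be sketched in code. This is a hypothetical illustration, not the paper's algorithm: a perceptron-style linear learner that queries a label with probability b / (b + |margin|), so uncertain rounds are queried more often, and discounts old updates (decay) so it can track a drifting target. The parameters b and decay, and the abrupt-shift stream, are assumptions for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

def selective_sampling(stream, d, b=1.0, decay=0.99):
    """Margin-based selective sampler with a decayed linear model (sketch)."""
    w = np.zeros(d)
    queries = mistakes = 0
    for x, y in stream:
        margin = float(w @ x)
        yhat = 1 if margin >= 0 else -1
        if yhat != y:
            mistakes += 1
        # Query (and possibly update) only on uncertain rounds;
        # otherwise the input is discarded, as in the protocol above.
        if rng.random() < b / (b + abs(margin)):
            queries += 1
            if yhat != y:
                w = decay * w + y * x   # decayed perceptron step
    return w, queries, mistakes

def drifting_stream(n=2000, d=5):
    """Separable stream whose target direction shifts halfway through."""
    u = np.zeros(d); u[0] = 1.0
    for t in range(n):
        if t == n // 2:
            u = np.zeros(d); u[1] = 1.0   # abrupt target shift
        x = rng.normal(size=d)
        yield x, (1 if u @ x >= 0 else -1)

w, queries, mistakes = selective_sampling(drifting_stream(), d=5)
```

The decay keeps the model's memory short, so after the shift the stale component of w fades and the query rate rises again while the learner re-acquires the new target.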
Thompson Sampling in Dynamic Systems for Contextual Bandit Problems
We consider multi-armed bandit problems in time-varying dynamic systems with
rich structural features. For the nonlinear dynamic model, we propose
approximate inference for the posterior distributions based on the Laplace
approximation. For the contextual bandit problem, Thompson Sampling is adopted
based on the underlying posterior distributions of the parameters. More
specifically, we introduce discount decays on the impact of previous samples
and analyze different decay rates against the underlying sample dynamics.
Consequently, exploration and exploitation are adaptively traded off according
to the dynamics of the system.
Comment: 22 pages, 10 figures
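The discount-decay idea can be sketched for the linear-Gaussian special case (the paper's nonlinear model would use a Laplace approximation instead). In this hypothetical sketch, each arm keeps discounted sufficient statistics for a Gaussian posterior, so older samples lose impact as the system drifts; the values of gamma and sigma2 are illustrative assumptions, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def discounted_thompson(contexts, reward_fn, n_arms, d, gamma=0.95, sigma2=0.25):
    """Thompson Sampling with discounted per-arm Gaussian posteriors (sketch)."""
    B = [np.eye(d) for _ in range(n_arms)]      # per-arm precision matrix
    f = [np.zeros(d) for _ in range(n_arms)]    # per-arm moment vector
    total = 0.0
    for x in contexts:
        # Draw one parameter sample per arm and act greedily on the samples.
        scores = []
        for a in range(n_arms):
            mu = np.linalg.solve(B[a], f[a])
            cov = sigma2 * np.linalg.inv(B[a])
            scores.append(float(rng.multivariate_normal(mu, cov) @ x))
        a = int(np.argmax(scores))
        r = reward_fn(a, x)
        total += r
        # Discount old evidence, then fold in the new observation.
        B[a] = gamma * B[a] + (1 - gamma) * np.eye(d) + np.outer(x, x)
        f[a] = gamma * f[a] + r * x
    return total

# Toy environment: two arms with opposite linear reward parameters.
d, T = 3, 500
theta = [np.ones(d), -np.ones(d)]
def reward(a, x):
    return float(theta[a] @ x) + float(rng.normal(scale=0.1))

contexts = [rng.normal(size=d) for _ in range(T)]
total = discounted_thompson(contexts, reward, n_arms=2, d=d)
```

A smaller gamma forgets faster and tracks fast drift at the cost of wider posteriors (more exploration); gamma close to 1 recovers the stationary setting.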
A consistent deterministic regression tree for non-parametric prediction of time series
We study online prediction of bounded stationary ergodic processes. To do so,
we consider the setting of prediction of individual sequences and build a
deterministic regression tree that performs asymptotically as well as the best
L-Lipschitz constant predictors. We then show why the obtained regret bound
entails asymptotic optimality with respect to the class of bounded stationary
ergodic processes.
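A much-simplified stand-in for such a tree predictor can illustrate the idea: partition the context space [0, 1] into dyadic cells (the leaves of a fixed-depth tree) and predict, in each cell, the running mean of past values whose preceding observation fell in that cell. The fixed partition, the depth, and the logistic-map test sequence are illustrative assumptions, not the paper's construction.

```python
import numpy as np

def tree_predict(seq, depth=4):
    """Piecewise-constant online predictor on a dyadic partition (sketch)."""
    k = 2 ** depth
    sums = np.zeros(k)
    counts = np.zeros(k)
    preds = []
    for t in range(1, len(seq)):
        ctx = seq[t - 1]                    # one-step context
        cell = min(int(ctx * k), k - 1)     # leaf containing the context
        pred = sums[cell] / counts[cell] if counts[cell] else 0.5
        preds.append(pred)
        sums[cell] += seq[t]                # update the leaf's running mean
        counts[cell] += 1
    return np.array(preds)

# Noisy logistic-map sequence in [0, 1]: the next value is (almost) a
# Lipschitz function of the previous one, so the leaf means should beat
# the constant 0.5 baseline once the cells fill up.
rng = np.random.default_rng(2)
x = [0.3]
for _ in range(2000):
    nxt = 3.8 * x[-1] * (1 - x[-1]) + rng.normal(scale=0.01)
    x.append(min(max(nxt, 0.0), 1.0))
x = np.array(x)

p = tree_predict(x)
mse_tree = float(np.mean((p - x[1:]) ** 2))
mse_const = float(np.mean((0.5 - x[1:]) ** 2))
```

Doubling the depth halves the cell width, shrinking the approximation error against Lipschitz predictors while each leaf sees fewer samples; the actual construction balances this trade-off to obtain the stated regret bound.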