Near-Optimal BRL using Optimistic Local Transitions
Model-based Bayesian Reinforcement Learning (BRL) allows a sound
formalization of the problem of acting optimally while facing an unknown
environment, i.e., avoiding the exploration-exploitation dilemma. However,
algorithms explicitly addressing BRL suffer from such a combinatorial explosion
that a large body of work relies on heuristic algorithms. This paper introduces
BOLT, a simple and (almost) deterministic heuristic algorithm for BRL which is
optimistic about the transition function. We analyze BOLT's sample complexity,
and show that under certain parameters, the algorithm is near-optimal in the
Bayesian sense with high probability. Then, experimental results highlight the
key differences of this method compared to previous work.
Comment: ICML201
Difference of Convex Functions Programming Applied to Control with Expert Data
This paper reports applications of Difference of Convex functions (DC)
programming to Learning from Demonstrations (LfD) and Reinforcement Learning
(RL) with expert data. This is made possible because the norm of the Optimal
Bellman Residual (OBR), which is at the heart of many RL and LfD algorithms, is
DC. Improvement in performance is demonstrated on two specific algorithms,
namely Reward-regularized Classification for Apprenticeship Learning (RCAL) and
Reinforcement Learning with Expert Demonstrations (RLED), through experiments
on generic Markov Decision Processes (MDPs), called Garnets.
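A brief sketch of why a norm of the OBR can be DC, shown for the empirical l1 case. The notation (Q, R, P, gamma, T^*, and the sample index i) is assumed here for illustration and is not taken from the abstract. With the usual optimal Bellman operator, each residual f_i is convex in Q, and the absolute value of a convex function splits into a difference of two convex functions:

\[
\begin{aligned}
T^*Q(s,a) &= R(s,a) + \gamma \sum_{s'} P(s' \mid s,a)\, \max_{a'} Q(s',a'), \\
f_i(Q) &= T^*Q(s_i,a_i) - Q(s_i,a_i) \quad \text{(convex in } Q\text{: a nonnegative sum of maxima of linear terms, minus a linear term)}, \\
|f_i(Q)| &= 2\max\bigl(f_i(Q),\,0\bigr) - f_i(Q), \\
\sum_i |f_i(Q)| &= \underbrace{\sum_i 2\max\bigl(f_i(Q),\,0\bigr)}_{\text{convex}} \;-\; \underbrace{\sum_i f_i(Q)}_{\text{convex}}.
\end{aligned}
\]

Under these assumptions, the objective is a difference of two convex functions of Q, which is the form that DC programming methods expect.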
- …