32,232 research outputs found
Reinforced Decision Trees
In order to speed-up classification models when facing a large number of
categories, one usual approach consists in organizing the categories in a
particular structure, this structure being then used as a way to speed-up the
prediction computation. This is for example the case when using
error-correcting codes or even hierarchies of categories. But in the majority
of approaches, this structure is chosen \textit{by hand}, or during a
preliminary step, and not integrated in the learning process. We propose a new
model called Reinforced Decision Tree which simultaneously learns how to
organize categories in a tree structure and how to classify any input based on
this structure. This approach keeps the advantages of existing techniques (low
inference complexity) but allows one to build efficient classifiers in one
learning step. The learning algorithm is inspired by reinforcement learning and
policy-gradient techniques which allows us to integrate the two steps (building
the tree, and learning the classifier) in one single algorithm
Read, Watch, and Move: Reinforcement Learning for Temporally Grounding Natural Language Descriptions in Videos
The task of video grounding, which temporally localizes a natural language
description in a video, plays an important role in understanding videos.
Existing studies have adopted strategies of sliding window over the entire
video or exhaustively ranking all possible clip-sentence pairs in a
pre-segmented video, which inevitably suffer from exhaustively enumerated
candidates. To alleviate this problem, we formulate this task as a problem of
sequential decision making by learning an agent which regulates the temporal
grounding boundaries progressively based on its policy. Specifically, we
propose a reinforcement learning based framework improved by multi-task
learning and it shows steady performance gains by considering additional
supervised boundary information during training. Our proposed framework
achieves state-of-the-art performance on ActivityNet'18 DenseCaption dataset
and Charades-STA dataset while observing only 10 or less clips per video.Comment: AAAI 201
- …