12,788 research outputs found
Action Recognition from Single Timestamp Supervision in Untrimmed Videos
Recognising actions in videos relies on labelled supervision during training,
typically the start and end times of each action instance. This supervision is
not only subjective, but also expensive to acquire. Weak video-level
supervision has been successfully exploited for recognition in untrimmed
videos, however it is challenged when the number of different actions in
training videos increases. We propose a method that is supervised by single
timestamps located around each action instance, in untrimmed videos. We replace
expensive action bounds with sampling distributions initialised from these
timestamps. We then use the classifier's response to iteratively update the
sampling distributions. We demonstrate that these distributions converge to the
location and extent of discriminative action segments. We evaluate our method
on three datasets for fine-grained recognition, with increasing number of
different actions per video, and show that single timestamps offer a reasonable
compromise between recognition performance and labelling effort, performing
comparably to full temporal supervision. Our update method improves top-1 test
accuracy by up to 5.4%. across the evaluated datasets.Comment: CVPR 201
ViZDoom Competitions: Playing Doom from Pixels
This paper presents the first two editions of Visual Doom AI Competition,
held in 2016 and 2017. The challenge was to create bots that compete in a
multi-player deathmatch in a first-person shooter (FPS) game, Doom. The bots
had to make their decisions based solely on visual information, i.e., a raw
screen buffer. To play well, the bots needed to understand their surroundings,
navigate, explore, and handle the opponents at the same time. These aspects,
together with the competitive multi-agent aspect of the game, make the
competition a unique platform for evaluating the state of the art reinforcement
learning algorithms. The paper discusses the rules, solutions, results, and
statistics that give insight into the agents' behaviors. Best-performing agents
are described in more detail. The results of the competition lead to the
conclusion that, although reinforcement learning can produce capable Doom bots,
they still are not yet able to successfully compete against humans in this
game. The paper also revisits the ViZDoom environment, which is a flexible,
easy to use, and efficient 3D platform for research for vision-based
reinforcement learning, based on a well-recognized first-person perspective
game Doom
- …