Read, Watch, and Move: Reinforcement Learning for Temporally Grounding Natural Language Descriptions in Videos

Author: He Dongliang
Huang Jizhou
Li Fu
Liu Xiao
Wen Shilei
Zhao Xiang
Publication date: 21/01/2019

The task of video grounding, which temporally localizes a natural language description in a video, plays an important role in understanding videos. Existing studies either slide a window over the entire video or exhaustively rank all possible clip-sentence pairs in a pre-segmented video, and both strategies inevitably suffer from the cost of enumerating every candidate. To alleviate this problem, we formulate the task as a sequential decision-making problem: an agent learns a policy that progressively adjusts the temporal grounding boundaries. Specifically, we propose a reinforcement-learning-based framework improved by multi-task learning, which shows steady performance gains when additional supervised boundary information is considered during training. Our proposed framework achieves state-of-the-art performance on the ActivityNet'18 DenseCaption dataset and the Charades-STA dataset while observing only 10 or fewer clips per video. Comment: AAAI 201
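The sequential formulation in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the action names, the step size, the starting window, and above all the policy, which is a greedy oracle that looks at the ground-truth interval. The paper instead learns its policy from video and sentence features. The sketch only shows how a window can be refined within a budget of 10 steps and scored by temporal IoU.

```python
# Hypothetical action set: each action moves the start and/or end boundary.
ACTIONS = {
    "start_forward": (1, 0),
    "start_backward": (-1, 0),
    "end_forward": (0, 1),
    "end_backward": (0, -1),
    "shift_forward": (1, 1),
    "shift_backward": (-1, -1),
}


def temporal_iou(pred, gt):
    """Intersection over union of two (start, end) intervals in seconds."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def apply_action(window, name, step, duration):
    """Move the boundaries, clamp to the video, reject empty windows."""
    d_start, d_end = ACTIONS[name]
    start = min(max(0.0, window[0] + d_start * step), duration)
    end = min(max(0.0, window[1] + d_end * step), duration)
    return (start, end) if end > start else window


def ground(gt, duration, max_steps=10, step_frac=0.1):
    """Refine a window for at most max_steps using a greedy oracle policy.

    The oracle stands in for a learned policy and uses the ground truth,
    so this is a demonstration of the control loop only.
    """
    window = (0.25 * duration, 0.75 * duration)  # assumed initial window
    step = step_frac * duration
    for _ in range(max_steps):
        best = max(
            ACTIONS,
            key=lambda a: temporal_iou(apply_action(window, a, step, duration), gt),
        )
        candidate = apply_action(window, best, step, duration)
        if temporal_iou(candidate, gt) <= temporal_iou(window, gt):
            break  # no action improves the window: treat as a STOP decision
        window = candidate
    return window


if __name__ == "__main__":
    target = (10.0, 40.0)
    result = ground(target, duration=100.0)
    print(result, temporal_iou(result, target))
```

Because the loop inspects only the current window at each step, the number of windows it visits is bounded by `max_steps` rather than by the number of candidate clips, which is the contrast with exhaustive enumeration that the abstract draws.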

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

How active perception and attractor dynamics shape perceptual categorization: A computational model

Author: Nicola Catenacci Volpi
Jean Charles Quinton
Giovanni Pezzulo
Publication venue: Elsevier BV
Publication date: 23/07/2014

We propose a computational model of perceptual categorization that fuses elements of grounded and sensorimotor theories of cognition with dynamic models of decision-making. We assume that category information consists in anticipated patterns of agent–environment interactions that can be elicited through overt or covert (simulated) eye movements, object manipulation, etc. This information is first encoded when category information is acquired, and then re-enacted during perceptual categorization. Perceptual categorization consists in a dynamic competition between attractors that encode the sensorimotor patterns typical of each category; action prediction success counts as “evidence” for a given category and contributes to falling into the corresponding attractor. The evidence accumulation process is guided by an active perception loop, and the active exploration of objects (e.g., visual exploration) aims at eliciting expected sensorimotor patterns that count as evidence for the object category. We present a computational model incorporating these elements and describing action prediction, active perception, and attractor dynamics as key elements of perceptual categorization. We test the model in three simulated perceptual categorization tasks, and we discuss its relevance for grounded and sensorimotor theories of cognition. Peer reviewed
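The competition described in this abstract, in which prediction success counts as evidence for a category, can be sketched with a simple leaky competing accumulator. This is an illustrative stand-in and not the authors' model: the update rule, the parameter values (`leak`, `inhibition`, `gain`, `threshold`), and the symbolic observations are all assumptions, and the active perception loop that chooses what to look at next is left out.

```python
def categorize(observations, predictions, leak=0.1, inhibition=0.2,
               gain=0.3, threshold=1.0):
    """Accumulate prediction-success evidence until one category wins.

    observations: sequence of observed sensory outcomes, one per step.
    predictions:  dict mapping category name to the outcome that category
                  predicts at each step (same length as observations).
    Returns (winning category or None, number of steps used).
    """
    activation = {category: 0.0 for category in predictions}
    for t, observed in enumerate(observations):
        total = sum(activation.values())
        updated = {}
        for category, predicted in predictions.items():
            # A correct prediction counts as evidence for this category.
            evidence = 1.0 if predicted[t] == observed else 0.0
            rivals = total - activation[category]
            value = (activation[category]
                     + gain * evidence
                     - leak * activation[category]
                     - inhibition * rivals)  # lateral inhibition from rivals
            updated[category] = max(0.0, value)
        activation = updated
        winner = max(activation, key=activation.get)
        if activation[winner] >= threshold:
            return winner, t + 1
    return None, len(observations)


if __name__ == "__main__":
    seen = ["edge", "curve", "edge", "edge", "curve", "edge"]
    expected = {
        "A": ["edge", "curve", "edge", "edge", "curve", "edge"],
        "B": ["edge", "edge", "curve", "curve", "edge", "curve"],
    }
    print(categorize(seen, expected))
```

Lateral inhibition makes the activations compete, so the category whose predictions keep succeeding suppresses its rival while rising toward threshold, a rough analogue of settling into one attractor.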

Crossref

HAL Clermont Université

University of Hertfordshire Research Archive