Automatic parsing of sports videos with grammars
Motivated by the analogies between languages and sports videos, we introduce a novel
approach for video parsing with grammars. It utilizes compiler techniques for integrating both semantic
annotation and syntactic analysis to generate a semantic index of events and a table of contents for a given
sports video. The video sequence is first segmented and annotated by event detection with domain
knowledge. A grammar-based parser is then used to identify the structure of the video content.
Meanwhile, facilities for error handling are introduced which are particularly useful when the results of
automatic parsing need to be adjusted. As a case study, we have developed a system for video parsing in
the particular domain of TV diving programs. Experimental results indicate the proposed approach is
effective.
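The two-stage idea in this abstract (annotate events first, then parse the event sequence with a grammar) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the event labels (`round_title`, `action`, `replay`, `score`), the toy grammar in the comments, and the function name `parse_video` do not come from the paper, whose actual grammar and event vocabulary are not given in the abstract.

```python
# Hypothetical event labels and grammar; illustrative only, not the paper's.
#   video -> round+
#   round -> "round_title" dive+
#   dive  -> "action" "replay"* "score"

def parse_video(events):
    """Parse (label, start_time) events into a nested table of contents.

    Events that do not fit the grammar are skipped and recorded in `errors`
    so that the automatic result can be adjusted by hand afterwards.
    """
    toc, errors = [], []
    i, n = 0, len(events)
    while i < n:
        label, start = events[i]
        if label != "round_title":
            errors.append((i, label))  # unexpected event outside any round
            i += 1
            continue
        round_node = {"start": start, "dives": []}
        i += 1
        while i < n and events[i][0] != "round_title":
            label, start = events[i]
            if label != "action":
                errors.append((i, label))  # unexpected event inside a round
                i += 1
                continue
            dive = {"start": start, "replays": 0, "scored": False}
            i += 1
            while i < n and events[i][0] == "replay":
                dive["replays"] += 1
                i += 1
            if i < n and events[i][0] == "score":
                dive["scored"] = True
                i += 1
            # A dive with no score shot is kept but flagged via scored=False.
            round_node["dives"].append(dive)
        toc.append(round_node)
    return toc, errors


events = [
    ("round_title", 0), ("action", 5), ("replay", 12), ("replay", 18),
    ("score", 25), ("crowd", 30), ("action", 40), ("score", 50),
    ("round_title", 60), ("action", 65),
]
toc, errors = parse_video(events)
```

In this sketch the semantic index is the list of detected events, and the nested `toc` plays the role of the table of contents. Skipping and logging unexpected events is one simple form of error handling; the paper's own facilities may work differently.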
On-line processing of English which-questions by children and adults: a visual world paradigm study
Previous research has shown that children demonstrate similar sentence processing reflexes to those observed in adults, but they have difficulties revising an erroneous initial interpretation when they process garden-path sentences, passives, and wh-questions. We used the visual-world paradigm to examine children's use of syntactic and non-syntactic information to resolve syntactic ambiguity by extending our understanding of number features as a cue for interpretation to which-subject and which-object questions. We compared children's and adults' eye-movements to understand how this information shapes children's commitment to and revision of possible interpretations of these questions. The results showed that English-speaking adults and children both exhibit an initial preference to interpret an object which-question as a subject question. While adults quickly override this preference, children take significantly longer, showing an overall processing difficulty for object questions. Crucially, their recovery from an initially erroneous interpretation is speeded when disambiguating number agreement features are present.
ReSeg: A Recurrent Neural Network-based Model for Semantic Segmentation
We propose a structured prediction architecture, which exploits the local
generic features extracted by Convolutional Neural Networks and the capacity of
Recurrent Neural Networks (RNN) to retrieve distant dependencies. The proposed
architecture, called ReSeg, is based on the recently introduced ReNet model for
image classification. We modify and extend it to perform the more challenging
task of semantic segmentation. Each ReNet layer is composed of four RNNs that
sweep the image horizontally and vertically in both directions, encoding
patches or activations, and providing relevant global information. Moreover,
ReNet layers are stacked on top of pre-trained convolutional layers, benefiting
from generic local features. Upsampling layers follow ReNet layers to recover
the original image resolution in the final predictions. The proposed ReSeg
architecture is efficient, flexible and suitable for a variety of semantic
segmentation tasks. We evaluate ReSeg on several widely used semantic
segmentation datasets (Weizmann Horse, Oxford Flower, and CamVid), achieving
state-of-the-art performance. Results show that ReSeg can act as a suitable
architecture for semantic segmentation tasks, and may have further applications
in other structured prediction problems. The source code and model
hyperparameters are available at https://github.com/fvisin/reseg.
Comment: In CVPR Deep Vision Workshop, 201
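The four-sweep structure of a ReNet layer described in this abstract can be sketched in a few lines of plain Python. This is not the authors' implementation (see the repository linked above for that). It makes several simplifying assumptions: a plain tanh recurrent unit stands in for whatever recurrent unit the model uses, each grid cell holds one feature vector rather than an image patch, the horizontal pass is assumed to run before the vertical one, and weights are random rather than trained. The names `rnn_sweep`, `bidirectional`, `init_weights` and `renet_like_layer` are hypothetical.

```python
import math
import random


def rnn_sweep(seq, w_in, w_rec):
    """Run a plain tanh RNN over a sequence of vectors; return all hidden states."""
    hidden = [0.0] * len(w_rec)
    states = []
    for x in seq:
        hidden = [
            math.tanh(
                sum(wi * xi for wi, xi in zip(w_in[j], x))
                + sum(wr * hi for wr, hi in zip(w_rec[j], hidden))
            )
            for j in range(len(w_rec))
        ]
        states.append(hidden)
    return states


def bidirectional(seq, fwd, bwd):
    """Sweep a sequence in both directions and concatenate the states per position."""
    forward = rnn_sweep(seq, *fwd)
    backward = rnn_sweep(seq[::-1], *bwd)[::-1]
    return [f + b for f, b in zip(forward, backward)]


def init_weights(n_in, n_hid, rng):
    """Small random input and recurrent weight matrices (untrained, for illustration)."""
    w_in = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_hid)]
    w_rec = [[rng.uniform(-0.1, 0.1) for _ in range(n_hid)] for _ in range(n_hid)]
    return w_in, w_rec


def renet_like_layer(grid, n_hid, seed=0):
    """grid: H x W list of feature vectors. Returns an H x W grid of 2*n_hid vectors."""
    rng = random.Random(seed)
    n_in = len(grid[0][0])
    h_fwd, h_bwd = init_weights(n_in, n_hid, rng), init_weights(n_in, n_hid, rng)
    v_fwd, v_bwd = init_weights(2 * n_hid, n_hid, rng), init_weights(2 * n_hid, n_hid, rng)
    # Horizontal pass: every row is swept left-to-right and right-to-left.
    rows = [bidirectional(row, h_fwd, h_bwd) for row in grid]
    # Vertical pass: every column of the horizontal output is swept both ways.
    height, width = len(rows), len(rows[0])
    cols = [
        bidirectional([rows[r][c] for r in range(height)], v_fwd, v_bwd)
        for c in range(width)
    ]
    return [[cols[c][r] for c in range(width)] for r in range(height)]


# A 3 x 4 grid of 2-dimensional features, e.g. activations from a conv layer.
grid = [[[float(r), float(c)] for c in range(4)] for r in range(3)]
out = renet_like_layer(grid, n_hid=3)
```

Because the vertical sweeps consume the output of the horizontal sweeps, each output position can in principle depend on every input position, which is the sense in which the layer provides global information. The output keeps the input's spatial size here; in ReSeg, upsampling layers restore the resolution lost when patches are encoded.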