Search CORE

5,879 research outputs found

Object Referring in Videos with Language and Human Gaze

Author: Dai Dengxin
Van Gool Luc
Vasudevan Arun Balajee
Publication venue
Publication date: 04/04/2018
Field of study

We investigate the problem of object referring (OR) i.e. to localize a target object in a visual scene coming with a language description. Humans perceive the world more as continued video snippets than as static images, and describe objects not only by their appearance, but also by their spatio-temporal context and motion features. Humans also gaze at the object when they issue a referring expression. Existing works for OR mostly focus on static images only, which fall short in providing many such cues. This paper addresses OR in videos with language and human gaze. To that end, we present a new video dataset for OR, with 30, 000 objects over 5, 000 stereo video sequences annotated for their descriptions and gaze. We further propose a novel network model for OR in videos, by integrating appearance, motion, gaze, and spatio-temporal context into one network. Experimental results show that our method effectively utilizes motion cues, human gaze, and spatio-temporal context. Our method outperforms previousOR methods. For dataset and code, please refer https://people.ee.ethz.ch/~arunv/ORGaze.html.Comment: Accepted to CVPR 2018, 10 pages, 6 figure

arXiv.org e-Print Archive

Repository for Publications and Research Data

Crossref

Analysis of prepositions: near and away from Frames of reference.

Author: Flor Fabregat Nuria
Publication venue: 'Universitat Jaume I'
Publication date: 01/01/2017
Field of study

XXII Jornades de Foment de la Investigació de la Facultat de Ciències Humanes i Socials (Any 2017)Traditional strategies and procedures to learn a foreign language include the study of rules of grammar and doing exercises such as filling the gaps, repetition of words, drills, memorization of irregular verbs and sentences which may express usual expressions of everyday life. Even if the array of exercises is adequate, polysemy in prepositions causes difficulties in choosing the proper preposition conveying the meaning required by different contexts. Two prepositions of the horizontal axis (near and away from) are taken into consideration in this paper. Approaching the problem from the theory of polysemy and understanding, the use of these prepositions is explored along the dimensions of function, topology – which is the study of physical space–, and force dynamics – introduced in studies such as Navarro (1998)–, as well as the notion of frame of reference (Levinson, 2004). Then, the different senses and uses of these prepositions of the horizontal axis are systematized, explained and examples are used to illustrate the difficulties in learning a language and the doubts which students may have in some situations

Repositori Institucional de la Universitat Jaume I

Facial Feature Tracking and Occlusion Recovery in American Sign Language

Author: Castelli Thomas J.
Betke Margrit
Neidle Carol
Publication venue: Boston University Computer Science Department
Publication date: 01/01/1997
Field of study

Facial features play an important role in expressing grammatical information in signed languages, including American Sign Language(ASL). Gestures such as raising or furrowing the eyebrows are key indicators of constructions such as yes-no questions. Periodic head movements (nods and shakes) are also an essential part of the expression of syntactic information, such as negation (associated with a side-to-side headshake). Therefore, identification of these facial gestures is essential to sign language recognition. One problem with detection of such grammatical indicators is occlusion recovery. If the signer's hand blocks his/her eyebrows during production of a sign, it becomes difficult to track the eyebrows. We have developed a system to detect such grammatical markers in ASL that recovers promptly from occlusion. Our system detects and tracks evolving templates of facial features, which are based on an anthropometric face model, and interprets the geometric relationships of these templates to identify grammatical markers. It was tested on a variety of ASL sentences signed by various Deaf native signers and detected facial gestures used to express grammatical information, such as raised and furrowed eyebrows as well as headshakes.National Science Foundation (IIS-0329009, IIS-0093367, IIS-9912573, EIA-0202067, EIA-9809340

Boston University Institutional Repository (OpenBU)

A Comprehensive Performance Evaluation of Deformable Face Tracking "In-the-Wild"

Author: Antonakos Epameinondas
Asthana Akshay
Chrysos Grigorios G.
Snape Patrick
Zafeiriou Stefanos
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 10/02/2017
Field of study

Recently, technologies such as face detection, facial landmark localisation and face recognition and verification have matured enough to provide effective and efficient solutions for imagery captured under arbitrary conditions (referred to as "in-the-wild"). This is partially attributed to the fact that comprehensive "in-the-wild" benchmarks have been developed for face detection, landmark localisation and recognition/verification. A very important technology that has not been thoroughly evaluated yet is deformable face tracking "in-the-wild". Until now, the performance has mainly been assessed qualitatively by visually assessing the result of a deformable face tracking technology on short videos. In this paper, we perform the first, to the best of our knowledge, thorough evaluation of state-of-the-art deformable face tracking pipelines using the recently introduced 300VW benchmark. We evaluate many different architectures focusing mainly on the task of on-line deformable face tracking. In particular, we compare the following general strategies: (a) generic face detection plus generic facial landmark localisation, (b) generic model free tracking plus generic facial landmark localisation, as well as (c) hybrid approaches using state-of-the-art face detection, model free tracking and facial landmark localisation technologies. Our evaluation reveals future avenues for further research on the topic.Comment: E. Antonakos and P. Snape contributed equally and have joint second authorshi

arXiv.org e-Print Archive

Springer - Publisher Connector

Spiral - Imperial College Digital Repository