Towards a unified framework for hand-based methods in First Person Vision

Author: Barakova Emilia
Betancourt Alejandro Arango
Marcenaro Lucio
Morerio Pietro
Rauterberg Matthias
Regazzoni Carlo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015

First Person Vision (egocentric) video analysis is currently one of the emerging fields in computer vision. The availability of wearable devices that record exactly what the user is looking at is inevitable, and the opportunities and challenges carried by this kind of device are broad. In particular, for the first time a device is close enough to the user to record the movements of the user's hands, making hand-based applications for First Person Vision one of the most explored areas in the field. This paper explores the more popular processing steps used to develop hand-based applications, and proposes a hierarchical structure that optimally switches between its levels to reduce the computational cost of the system and improve its performance.

Repository TU/e

Pure OAI Repository

Archivio istituzionale della ricerca - Università di Genova

Left/Right Hand Segmentation in Egocentric Videos

Author: Barakova Emilia
Betancourt Alejandro
Marcenaro Lucio
Morerio Pietro
Rauterberg Matthias
Regazzoni Carlo
Publication venue: 'Elsevier BV'
Publication date: 21/07/2016

Wearable cameras allow people to record their daily activities from a user-centered (First Person Vision) perspective. Due to their favorable location, wearable cameras frequently capture the hands of the user, and may thus represent a promising user-machine interaction tool for different applications. Existing First Person Vision methods handle hand segmentation as a background-foreground problem, ignoring two important facts: i) hands are not a single "skin-like" moving element, but a pair of interacting cooperative entities; ii) close hand interactions may lead to hand-to-hand occlusions and, as a consequence, create a single hand-like segment. These facts complicate a proper understanding of hand movements and interactions. Our approach extends traditional background-foreground strategies by including a hand-identification step (left-right) based on a Maxwell distribution of angle and position. Hand-to-hand occlusions are addressed by exploiting temporal superpixels. The experimental results show that, in addition to a reliable left/right hand segmentation, our approach considerably improves the traditional background-foreground hand segmentation.

arXiv.org e-Print Archive

Repository TU/e

Pure OAI Repository

Archivio istituzionale della ricerca - Università di Genova

Parallel Attention: A Unified Framework for Visual Object Discovery through Dialogs and Queries

Author: Hengel Anton van den
Reid Ian
Shen Chunhua
Wu Qi
Zhuang Bohan
Publication date: 16/11/2017

Recognising objects according to a pre-defined fixed set of class labels has been well studied in computer vision. However, there are a great many practical applications where the subjects that may be of interest are not known beforehand, or are not so easily delineated. In many of these cases natural language dialog is a natural way to specify the subject of interest, and the task of achieving this capability (a.k.a. Referring Expression Comprehension) has recently attracted attention. To this end we propose a unified framework, the ParalleL AttentioN (PLAN) network, to discover the object in an image that is being referred to in variable-length natural expression descriptions, from short phrase queries to long multi-round dialogs. The PLAN network has two attention mechanisms that relate parts of the expressions to both the global visual content and also directly to object candidates. Furthermore, the attention mechanisms are recurrent, making the referring process visualizable and explainable. The attended information from these dual sources is combined to reason about the referred object. These two attention mechanisms can be trained in parallel, and we find the combined system outperforms the state of the art on several benchmark datasets with language input of different lengths, such as RefCOCO, RefCOCO+ and GuessWhat?!. Comment: 11 pages

arXiv.org e-Print Archive

Adelaide Research & Scholarship