Unsupervised Segmentation of Action Segments in Egocentric Videos using Gaze
Unsupervised segmentation of action segments in egocentric videos is a
desirable feature in tasks such as activity recognition and content-based video
retrieval. Reducing the search space into a finite set of action segments
facilitates faster and less noisy matching. However, there exists a
substantial gap in machine understanding of natural temporal cuts during a
continuous human activity. This work reports on a novel gaze-based approach for
segmenting action segments in videos captured using an egocentric camera. Gaze
is used to locate the region-of-interest inside a frame. By tracking two simple
motion-based parameters inside successive regions-of-interest, we discover a
finite set of temporal cuts. We present several results using combinations of
the two parameters on the BRISGAZE-ACTIONS dataset, which contains egocentric
videos depicting several daily-living activities. The quality of the temporal
cuts is further improved by implementing two entropy measures.
Comment: To appear in 2017 IEEE International Conference on Signal and Image
Processing Applications
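The abstract's two motion-based parameters tracked inside successive gaze regions-of-interest can be illustrated with a minimal sketch. The parameter names, the z-scored frame-to-frame difference, and the threshold are all illustrative assumptions, not the paper's actual formulation:

```python
import numpy as np

def detect_temporal_cuts(param_a, param_b, threshold=1.5):
    """Flag frames where either motion parameter measured inside the
    gaze region-of-interest changes sharply between successive frames.

    param_a, param_b: per-frame motion statistics (hypothetical, e.g.
    mean optical-flow magnitude and flow-direction spread in the ROI).
    """
    a = np.asarray(param_a, dtype=float)
    b = np.asarray(param_b, dtype=float)
    # Absolute frame-to-frame differences, z-scored per parameter.
    da, db = np.abs(np.diff(a)), np.abs(np.diff(b))
    za = (da - da.mean()) / (da.std() + 1e-9)
    zb = (db - db.mean()) / (db.std() + 1e-9)
    # A temporal cut is declared when either parameter jumps abruptly.
    cuts = np.where((za > threshold) | (zb > threshold))[0] + 1
    return cuts.tolist()
```

On a signal with a single abrupt change, the detector returns the frame index where the jump occurs; the entropy-based refinement mentioned in the abstract is not modelled here.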
An investigation of visual cues used to create and support frames of reference and visual search tasks in desktop virtual environments
Visual depth cues are combined to produce the essential depth and dimensionality of Desktop Virtual Environments (DVEs). This study discusses DVEs in terms of the visual depth cues that create and support perception of frames of reference and the accomplishment of visual search tasks. This paper presents the results of an investigation that identifies the effects of the experimental stimuli positions and of the visual depth cues luminance, texture, relative height and motion parallax on precise depth judgements made within a DVE. Results indicate that the experimental stimuli positions significantly affect precise depth judgements, that texture is significantly effective only under certain conditions, and that motion parallax, in line with previous results, is inconclusive for determining depth-judgement accuracy in egocentrically viewed DVEs. Results also show that exocentric views, incorporating relative height and motion parallax cues, are effective for precise depth judgements made in DVEs. The results help us to understand the effects of certain visual depth cues in supporting the perception of frames of reference and precise depth judgements, suggesting that the visual depth cues employed to create frames of reference in DVEs may influence how effectively precise depth judgements are undertaken.
Referential precedents in spoken language comprehension: a review and meta-analysis
Listeners’ interpretations of referring expressions are influenced by referential
precedents—temporary conventions established in a discourse that associate linguistic
expressions with referents. A number of psycholinguistic studies have investigated how
much precedent effects depend on beliefs about the speaker’s perspective versus more
egocentric, domain-general processes. We review and provide a meta-analysis of
visual-world eyetracking studies of precedent use, focusing on three principal effects: (1) a
same speaker advantage for maintained precedents; (2) a different speaker advantage for
broken precedents; and (3) an overall main effect of precedents. Despite inconsistent claims
in the literature, our combined analysis reveals surprisingly consistent evidence supporting
the existence of all three effects, but with different temporal profiles. These findings carry
important implications for existing theoretical explanations of precedent use, and challenge
explanations based solely on the use of information about speakers’ perspectives.
EGO-TOPO: Environment Affordances from Egocentric Video
First-person video naturally brings the use of a physical environment to the
forefront, since it shows the camera wearer interacting fluidly in a space
based on their intentions. However, current methods largely separate the observed
actions from the persistent space itself. We introduce a model for environment
affordances that is learned directly from egocentric video. The main idea is to
gain a human-centric model of a physical space (such as a kitchen) that
captures (1) the primary spatial zones of interaction and (2) the likely
activities they support. Our approach decomposes a space into a topological map
derived from first-person activity, organizing an ego-video into a series of
visits to the different zones. Further, we show how to link zones across
multiple related environments (e.g., from videos of multiple kitchens) to
obtain a consolidated representation of environment functionality. On
EPIC-Kitchens and EGTEA+, we demonstrate our approach for learning scene
affordances and anticipating future actions in long-form video.
Comment: Published in CVPR 2020, project page:
http://vision.cs.utexas.edu/projects/ego-topo
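The idea of decomposing a space into zones and organising video into visits can be caricatured in a few lines. The cosine-similarity zone assignment, the running-mean zone descriptor, and the threshold below are illustrative assumptions, not EGO-TOPO's actual method:

```python
import numpy as np

def build_topological_map(frame_feats, sim_thresh=0.8):
    """Toy sketch: assign each frame's feature vector to an existing
    zone if it is similar enough (cosine similarity), else open a new
    zone; link consecutively visited zones with a graph edge."""
    zones, counts = [], []   # running mean feature and frame count per zone
    visits, edges = [], set()
    for f in frame_feats:
        f = np.asarray(f, dtype=float)
        f = f / (np.linalg.norm(f) + 1e-9)
        best, best_sim = None, sim_thresh
        for i, z in enumerate(zones):
            s = float(f @ (z / (np.linalg.norm(z) + 1e-9)))
            if s > best_sim:
                best, best_sim = i, s
        if best is None:                       # no zone is similar: new zone
            zones.append(f.copy()); counts.append(1)
            best = len(zones) - 1
        else:                                  # update the matched zone's mean
            counts[best] += 1
            zones[best] += (f - zones[best]) / counts[best]
        if visits and visits[-1] != best:      # transition between zones
            edges.add(tuple(sorted((visits[-1], best))))
        visits.append(best)
    return visits, sorted(edges)
```

A sequence that leaves a zone and later returns to it yields a repeated zone id in the visit list and a single edge in the map, which is the "series of visits" structure the abstract describes.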
Challenges for identifying the neural mechanisms that support spatial navigation: the impact of spatial scale.
Spatial navigation is a fascinating behavior that is essential for our everyday lives. It involves nearly all sensory systems, it requires numerous parallel computations, and it engages multiple memory systems. One of the key problems in this field pertains to the question of reference frames: spatial information such as direction or distance can be coded egocentrically (relative to an observer) or allocentrically (in a reference frame independent of the observer). While many studies have associated striatal and parietal circuits with egocentric coding and entorhinal/hippocampal circuits with allocentric coding, this strict dissociation is not in line with a growing body of experimental data. In this review, we discuss some of the problems that can arise when studying the neural mechanisms that are presumed to support different spatial reference frames. We argue that the scale of space in which a navigation task takes place plays a crucial role in determining the processes that are being recruited. This has important implications, particularly for the inferences that can be made from animal studies in small-scale space about the neural mechanisms supporting human spatial navigation in large (environmental) spaces. Furthermore, we argue that many of the commonly used tasks to study spatial navigation and the underlying neuronal mechanisms involve different types of reference frames, which can complicate the interpretation of neurophysiological data.
Graph learning in robotics: a survey
Deep neural networks for graphs have emerged as a powerful tool for learning
on complex non-Euclidean data, which is becoming increasingly common in a
variety of applications. Yet, although their potential has been widely
recognised in the machine learning community, graph learning remains largely
unexplored for downstream tasks such as robotics applications. Hence, to fully
unlock their potential, we propose a review of graph neural architectures from
a robotics perspective. The paper covers the fundamentals of graph-based
models, including their architecture, training procedures, and applications. It
also discusses recent advancements and challenges that arise in applied
settings, related for example to the integration of perception,
decision-making, and control. Finally, the paper provides an extensive review
of various robotic applications that benefit from learning on graph structures,
such as bodies and contacts modelling, robotic manipulation, action
recognition, fleet motion planning, and many more. This survey aims to provide
readers with a thorough understanding of the capabilities and limitations of
graph neural architectures in robotics, and to highlight potential avenues for
future research.
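The fundamental operation the survey's subject matter shares is neighbourhood aggregation. A minimal sketch of one generic message-passing round (mean aggregation with self-loops, then a linear map and ReLU; a generic GNN building block, not any specific architecture from the survey) looks like this:

```python
import numpy as np

def message_passing_layer(adj, feats, weight):
    """One round of message passing: each node averages its neighbours'
    features (including its own via a self-loop), applies a learned
    linear map `weight`, then a ReLU nonlinearity."""
    adj = np.asarray(adj, dtype=float)
    a_hat = adj + np.eye(adj.shape[0])            # add self-loops
    deg = a_hat.sum(axis=1, keepdims=True)        # per-node degree
    h = (a_hat / deg) @ np.asarray(feats, dtype=float)  # mean aggregation
    return np.maximum(h @ weight, 0.0)            # linear map + ReLU
```

Stacking such layers lets information propagate over multi-hop graph structure (bodies, contacts, fleets), which is what makes these models attractive for the robotics applications the survey catalogues.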