62 research outputs found

Predicting visual context for unsupervised event segmentation in continuous photo-streams

Author: Bolanos Marc
Dang-Nguyen Duc-Tien
del Molino Ana Garcia
Gygli Michael
Lee Yong Jae
Lin Jie
Lin Wei-Hao
Ng Hamg Wei
Srivastava Nitish
Yamamoto Shuhei
Publication venue: Association for Computing Machinery (ACM)
Publication date: 07/08/2018

Segmenting video content into events provides semantic structures for indexing, retrieval, and summarization. Since motion cues are not available in continuous photo-streams, and annotations in lifelogging are scarce and costly, the frames are usually clustered into events by comparing the visual features between them in an unsupervised way. However, such methodologies are ineffective at handling heterogeneous events, e.g. taking a walk, and temporary changes in the sight direction, e.g. at a meeting. To address these limitations, we propose Contextual Event Segmentation (CES), a novel segmentation paradigm that uses an LSTM-based generative network to model the photo-stream sequences, predict their visual context, and track their evolution. CES decides whether a frame is an event boundary by comparing the visual context generated from the frames in the past to the visual context predicted from the future. We implemented CES on a new and massive lifelogging dataset consisting of more than 1.5 million images spanning over 1,723 days. Experiments on the popular EDUB-Seg dataset show that our model outperforms the state-of-the-art by over 16% in F-measure. Furthermore, CES' performance is only 3 points below that of human annotators.

Comment: Accepted for publication at the 2018 ACM Multimedia Conference (MM '18
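The boundary decision described in this abstract (compare a context built from past frames with one built from future frames) can be illustrated with a minimal sketch. This is not the authors' CES model: the LSTM-based context predictor is replaced here by plain window means, and the function names, the `window` size, and the `threshold` value are all hypothetical choices made for illustration.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def cosine_distance(a, b):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def boundary_scores(features, window=3):
    """Score each frame t by the distance between past and future context.

    ASSUMPTION: the contexts are window means of per-frame feature vectors,
    standing in for the LSTM-predicted contexts of the paper.
    """
    scores = [0.0] * len(features)
    for t in range(window, len(features) - window + 1):
        past = mean_vector(features[t - window:t])
        future = mean_vector(features[t:t + window])
        scores[t] = cosine_distance(past, future)
    return scores


def detect_boundaries(features, window=3, threshold=0.5):
    """Return frame indices whose score is a local maximum above threshold."""
    scores = boundary_scores(features, window)
    boundaries = []
    for t in range(1, len(scores) - 1):
        is_peak = scores[t] > scores[t - 1] and scores[t] >= scores[t + 1]
        if is_peak and scores[t] >= threshold:
            boundaries.append(t)
    return boundaries


if __name__ == "__main__":
    # Two synthetic "events": six frames of one feature, six of another.
    frames = [[1.0, 0.0]] * 6 + [[0.0, 1.0]] * 6
    print(detect_boundaries(frames))
```

Taking local maxima rather than every frame above the threshold keeps a single boundary per transition, since frames adjacent to a change also score above zero.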

arXiv.org e-Print Archive

Crossref

Institutional Knowledge at Singapore Management University

Ego-Object Discovery in Lifelogging Datasets

Author: Bolaños Solà Marc
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2015

In this work we propose a semi-supervised method for discovering relevant objects in image sequences acquired with passive wearable cameras. Additionally, we present a new dataset of annotated egocentric images on which (among others) we compare the results

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

The Evolution of First Person Vision Methods: A Survey

Author: Betancourt Alejandro
Morerio Pietro
Rauterberg Matthias
Regazzoni Carlo S.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2015

The emergence of new wearable technologies such as action cameras and smart-glasses has increased the interest of computer vision scientists in the First Person perspective. Nowadays, this field is attracting attention and investment from companies aiming to develop commercial devices with First Person Vision recording capabilities. Due to this interest, an increasing demand for methods to process these videos, possibly in real-time, is expected. Current approaches present particular combinations of different image features and quantitative methods to accomplish specific objectives such as object detection, activity recognition, user-machine interaction and so on. This paper summarizes the evolution of the state of the art in First Person Vision video analysis between 1997 and 2014, highlighting, among others, the most commonly used features, methods, challenges and opportunities within the field.

Comment: First Person Vision, Egocentric Vision, Wearable Devices, Smart Glasses, Computer Vision, Video Analytics, Human-machine Interactio

arXiv.org e-Print Archive

CiteSeerX

Crossref

Pure OAI Repository

Archivio istituzionale della ricerca - Università di Genova

Learning and mining from personal digital archives

Author: Li Na
Publication venue: Dublin City University. Scientific Computing and Complex Systems Modelling (Sci-Sym)
Publication date: 01/03/2020

Given the explosion of new sensing technologies, data storage has become significantly cheaper and consequently, people increasingly rely on wearable devices to create personal digital archives. Lifelogging is the act of recording aspects of life in digital format for a variety of purposes such as aiding human memory, analysing human lifestyle and diet monitoring. In this dissertation we are concerned with Visual Lifelogging, a form of lifelogging based on the passive capture of photographs by a wearable camera. Cameras such as Microsoft's SenseCam can record up to 4,000 images per day as well as logging data from several incorporated sensors. Considering the volume, complexity and heterogeneous nature of such data collections, it is a significant challenge to interpret and extract knowledge for the practical use of lifeloggers and others. In this dissertation, time series analysis methods have been used to identify and extract useful information from temporal lifelogging image data, without the benefit of prior knowledge. We focus, in particular, on three fundamental topics: noise reduction, structure and characterization of the raw data; the detection of multi-scale patterns; and the mining of important, previously unknown repeated patterns in the time series of lifelog image data. Firstly, we show that Detrended Fluctuation Analysis (DFA) highlights the feature of very high correlation in lifelogging image collections. Secondly, we show that study of the equal-time Cross-Correlation Matrix demonstrates atypical or non-stationary characteristics in these images. Next, noise reduction in the Cross-Correlation Matrix is addressed by Random Matrix Theory (RMT) before Wavelet multiscaling is used to characterize the 'most important' or 'unusual' events through analysis of the associated dynamics of the eigenspectrum. A motif discovery technique is explored for detection of recurring and recognizable episodes in an individual's image data. Finally, we apply these motif discovery techniques to two known lifelog data collections, All I Have Seen (AIHS) and NTCIR-12 Lifelog, in order to examine multivariate recurrent patterns of multiple lifelogging users
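As a rough illustration of the Detrended Fluctuation Analysis mentioned in this abstract, the sketch below computes a first-order DFA scaling exponent for a one-dimensional series. It is a generic textbook-style version and not the dissertation's code: the function names, the default `scales`, and the idea of feeding it a scalar per-image signal (for example a mean intensity per frame) are assumptions made for illustration.

```python
import math
import random


def linear_fit_residual_ss(ys):
    """Sum of squared residuals after a least-squares line fit to ys."""
    n = len(ys)
    xs = range(n)
    mean_x = (n - 1) / 2.0
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


def slope_of(xs, ys):
    """Least-squares slope of ys against xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def dfa_exponent(series, scales=(8, 16, 32, 64, 128)):
    """First-order DFA: slope of log F(n) against log n.

    ASSUMPTION: non-overlapping windows and a linear detrend per window;
    the default scales are arbitrary illustrative values.
    """
    mean = sum(series) / len(series)
    # Profile: cumulative sum of the mean-subtracted series.
    profile, total = [], 0.0
    for value in series:
        total += value - mean
        profile.append(total)
    log_n, log_f = [], []
    for n in scales:
        n_windows = len(profile) // n
        if n_windows < 2:
            continue  # too few windows for a stable estimate at this scale
        ss = sum(
            linear_fit_residual_ss(profile[i * n:(i + 1) * n])
            for i in range(n_windows)
        )
        fluctuation = math.sqrt(ss / (n_windows * n))
        log_n.append(math.log(n))
        log_f.append(math.log(fluctuation))
    return slope_of(log_n, log_f)


if __name__ == "__main__":
    random.seed(0)
    noise = [random.gauss(0.0, 1.0) for _ in range(4096)]
    print(round(dfa_exponent(noise), 2))
```

An exponent near 0.5 indicates uncorrelated data, while values well above 0.5 indicate long-range correlation, which is the kind of behaviour the abstract reports for lifelog image collections.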

Irish Universities

DCU Online Research Access Service

Classification and Comparison of On-Line Video Summarisation Methods

Author: Kuncheva Ludmila I.
Matthews Clare E.
Yousefi Paria
Publication venue: Springer Science and Business Media LLC
Publication date: 01/04/2019

Bangor University Research Portal

An Outlook into the Future of Egocentric Vision

Author: Bansal Siddhant
Damen Dima
Farinella Giovanni Maria
Furnari Antonino
Goletto Gabriele
Plizzari Chiara
Ragusa Francesco
Tommasi Tatiana
Publication date: 14/08/2023

What will the future be? We wonder! In this survey, we explore the gap between current research in egocentric vision and the ever-anticipated future, where wearable computing, with outward-facing cameras and digital overlays, is expected to be integrated into our everyday lives. To understand this gap, the article starts by envisaging the future through character-based stories, showcasing through examples the limitations of current technology. We then provide a mapping between this future and previously defined research tasks. For each task, we survey its seminal works, current state-of-the-art methodologies and available datasets, then reflect on shortcomings that limit its applicability to future research. Note that this survey focuses on software models for egocentric vision, independent of any specific hardware. The paper concludes with recommendations for areas of immediate exploration so as to unlock our path to the future always-on, personalised and life-enhancing egocentric vision.

Comment: We invite comments, suggestions and corrections here: https://openreview.net/forum?id=V3974SUk1

arXiv.org e-Print Archive

Analysis of the hands in egocentric vision: A survey

Author: Bandini Andrea
Zariffa José
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2020

Egocentric vision (a.k.a. first-person vision - FPV) applications have thrived over the past few years, thanks to the availability of affordable wearable cameras and large annotated datasets. The position of the wearable camera (usually mounted on the head) allows recording exactly what the camera wearers have in front of them, in particular hands and manipulated objects. This intrinsic advantage enables the study of the hands from multiple perspectives: localizing hands and their parts within the images; understanding what actions and activities the hands are involved in; and developing human-computer interfaces that rely on hand gestures. In this survey, we review the literature that focuses on the hands using egocentric vision, categorizing the existing approaches into: localization (where are the hands or parts of them?); interpretation (what are the hands doing?); and application (e.g., systems that used egocentric hand cues for solving a specific problem). Moreover, a list of the most prominent datasets with hand-based annotations is provided

arXiv.org e-Print Archive

Archivio della ricerca della Scuola Superiore Sant'Anna