67 research outputs found
Visual Summary of Egocentric Photostreams by Representative Keyframes
Building a visual summary of an egocentric photostream captured by a lifelogging wearable camera is of high interest for different applications (e.g. memory reinforcement). In this paper, we propose a new summarization method based on keyframe selection that uses visual features extracted by means of a convolutional neural network. Our method applies unsupervised clustering to divide the photostream into events and then extracts the most relevant keyframe for each event. We assess the results through a blind test in which a group of 20 people rated the quality of the summaries.
Comment: Paper accepted at the IEEE First International Workshop on Wearable and Ego-vision Systems for Augmented Experience (WEsAX), Turin, Italy, July 3, 2015.
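A minimal sketch of this kind of keyframe-selection pipeline, assuming pre-extracted CNN features (one vector per photo), k-means as the unsupervised clustering step and distance to the event centroid as the relevance criterion; the paper's actual clustering and relevance measures may differ:

    import numpy as np
    from sklearn.cluster import KMeans

    def summarize(features, n_events):
        """Pick one representative keyframe index per event.

        features: (n_frames, d) array of CNN features, one row per photo.
        n_events: number of events (clusters) to split the photostream into.
        """
        km = KMeans(n_clusters=n_events, n_init=10, random_state=0).fit(features)
        keyframes = []
        for c in range(n_events):
            idx = np.where(km.labels_ == c)[0]
            # Keyframe = frame whose features are closest to the event centroid.
            dists = np.linalg.norm(features[idx] - km.cluster_centers_[c], axis=1)
            keyframes.append(int(idx[np.argmin(dists)]))
        return sorted(keyframes)

    # Toy usage with random vectors standing in for real CNN descriptors.
    rng = np.random.default_rng(0)
    print(summarize(rng.normal(size=(500, 128)), n_events=8))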
LEMoRe: A lifelog engine for moments retrieval at the NTCIR-lifelog LSAT task
Semantic image retrieval from large amounts of egocentric visual data requires powerful techniques for bridging the semantic gap. This paper introduces LEMoRe, a Lifelog Engine for Moments Retrieval, developed in the context of the Lifelog Semantic Access Task (LSAT) of the NTCIR-12 challenge, and discusses how its performance varies across different trials. LEMoRe integrates classical image descriptors with high-level semantic concepts extracted by Convolutional Neural Networks (CNN), exposed through a graphical user interface that uses natural language processing. Although this is only a first attempt at interactive image retrieval from large egocentric datasets and there is ample room for improvement in both the system components and the user interface, the structure of the system itself and the way its components cooperate are very promising.
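A rough sketch of how such a fusion could be scored, assuming per-image CNN concept probabilities and a visual-similarity score from classical descriptors are already available; the weighted-sum fusion, names and data layout are illustrative assumptions, not LEMoRe's actual design:

    import numpy as np

    def rank_moments(query_concepts, concept_names, concept_probs, visual_sims, alpha=0.7):
        """Rank lifelog images for a query.

        query_concepts: concepts parsed from the natural-language query, e.g. ["food", "shop"].
        concept_names:  list of all concept labels, length C.
        concept_probs:  (N, C) CNN concept scores, one row per image.
        visual_sims:    (N,) similarity of each image to the query under classical descriptors.
        alpha:          weight of the semantic term versus the visual term.
        """
        cols = [concept_names.index(c) for c in query_concepts if c in concept_names]
        semantic = concept_probs[:, cols].mean(axis=1) if cols else np.zeros(len(concept_probs))
        score = alpha * semantic + (1 - alpha) * visual_sims
        return np.argsort(-score)  # indices of the best-matching images first

    # Toy usage with random data standing in for real features.
    rng = np.random.default_rng(1)
    names = ["food", "shop", "street", "screen"]
    print(rank_moments(["food"], names, rng.random((10, 4)), rng.random(10))[:5])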
Automatic Labeling Application Applied to Food-Related Object Recognition
Bachelor's thesis in Computer Engineering (Treballs Finals de Grau d'Enginyeria Informàtica), Facultat de Matemàtiques, Universitat de Barcelona, 2013. Advisor: Petia Radeva.
It is clear that, with every day that goes by, technology becomes a little more present in our everyday life. Pervasive Computing (Ubiquitous Computing) is a reality that everyone must come to terms with, even though ethical and moral questions will always arise. There are countless aspects of daily life and routine tasks in which Pervasive Computing can improve our quality of life.
One of the ways in which this emerging field can do the most to improve people's quality of life concerns our eating habits and everything related to them: nutrition, physical activity, emotions and social interaction. These are the environments this project intends to improve.
One of the most evident problems for which logging every detail of a person's diet would be of interest is overweight. People who clearly need help with their nutrition and the habits related to it (such as physical activity, emotions and social interaction) could benefit enormously. A device, or a set of interconnected devices, able to monitor and record different kinds of information could help them change their habits and solve their weight problems. Ultimately, though, any person, even without an evident nutrition problem, could also take great advantage of such a device.
Ego-Object Discovery in Lifelogging Datasets
In this work, we propose a semi-supervised method for discovering relevant objects in image sequences acquired with passive wearable cameras. In addition, we present a new annotated egocentric image dataset on which (among other experiments) we compare our results.
Egocentric video description based on temporally-linked sequences
Egocentric vision consists of acquiring images throughout the day from a first-person point of view using wearable cameras. The automatic analysis of this information makes it possible to discover daily patterns that can improve the quality of life of the user. A natural topic that arises in egocentric vision is storytelling, that is, how to understand and tell the story lying behind the pictures. In this paper, we tackle storytelling as an egocentric sequence description problem. We propose a novel methodology that exploits information from temporally neighboring events, matching precisely the nature of egocentric sequences. Furthermore, we present a new method for multimodal data fusion consisting of a multi-input attention recurrent network. We also release the EDUB-SegDesc dataset, the first dataset for egocentric image sequence description, consisting of 1339 events with 3991 descriptions from 55 days acquired by 11 people. Finally, we show that our proposal outperforms classical attentional encoder-decoder methods for video description.
Published in Journal of Visual Communication and Image Representation, 50:205-216, 2018. https://doi.org/10.1016/j.jvcir.2017.11.022
- …