R-Clustering for Egocentric Video Segmentation
In this paper, we present a new method for egocentric video temporal
segmentation that integrates a statistical mean-change detector and
agglomerative clustering (AC) within an energy-minimization framework.
Since most AC methods tend to oversegment video sequences when clustering
their frames, we combine the clustering with a concept drift detection
technique (ADWIN) that offers rigorous performance guarantees. ADWIN
serves as a statistical upper bound for the clustering-based video
segmentation. We integrate both techniques in an energy-minimization
framework that disambiguates their decisions and completes the
segmentation by taking into account the temporal continuity of video
frame descriptors. We present experiments on egocentric sets of more than
13,000 images acquired with different wearable cameras, showing that our
method outperforms state-of-the-art clustering methods.
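The pipeline described above couples an adaptive-windowing change
detector (ADWIN) with clustering. As a rough illustrative sketch only,
not the authors' implementation, a simplified ADWIN-style mean-change
detector over a 1-D stream of frame descriptors could look like the
following; the `delta` confidence parameter and the Hoeffding-style cut
threshold follow the standard ADWIN idea, but this is a toy version:

```python
import math
from collections import deque

def adwin_detect(stream, delta=0.002):
    """Simplified ADWIN-style mean-change detector (illustrative sketch).

    Maintains a window of recent values; whenever the means of two
    adjacent sub-windows differ by more than a Hoeffding-style cut
    threshold, the older sub-window is dropped and a change point is
    reported at the current time step.
    """
    window = deque()
    changes = []
    for t, x in enumerate(stream):
        window.append(x)
        shrunk = True
        while shrunk and len(window) > 1:
            shrunk = False
            total = sum(window)
            n = len(window)
            s0 = 0.0
            # try every split of the window into old | recent parts
            for i in range(1, n):
                s0 += window[i - 1]
                n0, n1 = i, n - i
                mu0, mu1 = s0 / n0, (total - s0) / n1
                m = 1.0 / (1.0 / n0 + 1.0 / n1)  # harmonic mean of sizes
                eps = math.sqrt((1.0 / (2 * m)) * math.log(4.0 / delta))
                if abs(mu0 - mu1) > eps:
                    for _ in range(n0):  # drop the stale prefix
                        window.popleft()
                    changes.append(t)
                    shrunk = True
                    break
    return changes
```

On a stream whose mean jumps (e.g. 50 zeros followed by 50 ones), the
detector reports a single change a few samples after the jump; in the
paper this kind of statistical bound is what keeps the clustering from
oversegmenting.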
Multi-Face Tracking by Extended Bag-of-Tracklets in Egocentric Videos
Wearable cameras offer a hands-free way to record egocentric images of daily
experiences, where social events are of special interest. The first step
towards detecting social events is to track the appearance of the
multiple persons involved in them. In this paper, we propose a novel method to find
correspondences of multiple faces in low temporal resolution egocentric videos
acquired through a wearable camera. This kind of photo-stream imposes
additional challenges to the multi-tracking problem with respect to
conventional videos. Due to the free motion of the camera and to its low
temporal resolution, abrupt changes in the field of view, in illumination
condition and in the target location are highly frequent. To overcome such
difficulties, we propose a multi-face tracking method that generates a set of
tracklets through finding correspondences along the whole sequence for each
detected face and takes advantage of the tracklets redundancy to deal with
unreliable ones. Similar tracklets are grouped into a so-called extended
bag-of-tracklets (eBoT), each of which is intended to correspond to a
specific person. Finally, a prototype tracklet is extracted for each
eBoT, and its occlusions are estimated by relying on a new confidence
measure. We
validated our approach over an extensive dataset of egocentric photo-streams
and compared it to state-of-the-art methods, demonstrating its
effectiveness and robustness.
Comment: 27 pages, 18 figures, submitted to the Computer Vision and Image
Understanding journal
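A central idea in the abstract above is grouping redundant tracklets into
extended bags of tracklets (eBoTs). The sketch below illustrates one
plausible greedy grouping by mean bounding-box overlap; the similarity
measure, the 0.5 threshold, and the tracklet representation (a dict
mapping frame index to box) are simplifying assumptions for illustration,
not the paper's exact formulation:

```python
def bbox_iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def tracklet_similarity(t1, t2):
    """Mean IoU over the frames where both tracklets have a detection."""
    shared = set(t1) & set(t2)
    if not shared:
        return 0.0
    return sum(bbox_iou(t1[f], t2[f]) for f in shared) / len(shared)

def group_into_ebots(tracklets, threshold=0.5):
    """Greedy grouping: a tracklet joins the first bag it overlaps with."""
    ebots = []
    for tr in tracklets:
        for bag in ebots:
            if any(tracklet_similarity(tr, m) >= threshold for m in bag):
                bag.append(tr)
                break
        else:
            ebots.append([tr])  # start a new eBoT for an unmatched face
    return ebots
```

Two tracklets that overlap heavily in their shared frames end up in the
same eBoT, while a face detected in a distant image region forms its own
bag; the paper then extracts a prototype tracklet from each bag.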
Towards Storytelling from Visual Lifelogging: An Overview
Visual lifelogging consists of acquiring images that capture the daily
experiences of the user by wearing a camera over a long period of time. The
pictures taken offer considerable potential for knowledge mining concerning how
people live their lives and hence open up new opportunities for many
potential applications in fields including healthcare, security, leisure and
the quantified self. However, automatically building a story from a huge
collection of unstructured egocentric data presents major challenges. This
paper provides a thorough review of advances made so far in egocentric data
analysis, and in view of the current state of the art, indicates new lines of
research to move us towards storytelling from visual lifelogging.
Comment: 16 pages, 11 figures, submitted to IEEE Transactions on
Human-Machine Systems