Recognizing food places in egocentric photo-streams using multi-scale atrous convolutional networks and self-attention mechanism.
Wearable sensors (e.g., lifelogging cameras) are very useful tools for monitoring people's daily habits and lifestyle. Wearable cameras can continuously capture different moments of their wearers' day, their environment, and their interactions with objects, people, and places, reflecting their personal lifestyle. The food places where people eat, drink, and buy food, such as restaurants, bars, and supermarkets, can directly affect their daily dietary intake and behavior. Consequently, an automated monitoring system that analyzes a person's food habits from daily recorded egocentric photo-streams of food places can provide a valuable means for people to improve their eating habits. This can be done by classifying the captured food place images into different groups and generating a detailed report of the time spent in specific food places. In this paper, we propose a self-attention mechanism with multi-scale atrous convolutional networks to generate discriminative features from image streams to recognize a predetermined set of food place categories. We apply our model to an egocentric food place dataset called 'EgoFoodPlaces' that comprises 43,392 images captured by 16 individuals using a lifelogging camera. The proposed model achieved an overall classification accuracy of 80% on the 'EgoFoodPlaces' dataset, outperforming baseline methods such as VGG16, ResNet50, and InceptionV3.
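As a rough illustration of the "multi-scale atrous" idea mentioned in this abstract, the sketch below applies one small kernel at several dilation rates to a 1D signal. It is not the authors' network: the function names, the toy signal, the kernel, and the rates (1, 2, 4) are all assumptions chosen for the example, and the self-attention part is not shown.

```python
def atrous_conv1d(signal, kernel, rate):
    """Valid-mode 1D dilated (atrous) cross-correlation.

    Kernel taps are placed `rate` samples apart, so a larger rate
    widens the receptive field without adding parameters.
    """
    span = (len(kernel) - 1) * rate  # receptive field minus one
    return [
        sum(k * signal[i + j * rate] for j, k in enumerate(kernel))
        for i in range(len(signal) - span)
    ]


def multi_scale_features(signal, kernel, rates=(1, 2, 4)):
    """Apply the same kernel at several dilation rates (hypothetical helper)."""
    return {rate: atrous_conv1d(signal, kernel, rate) for rate in rates}


# Toy input: a linear ramp and a simple difference kernel (illustrative only).
signal = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
kernel = [1, 0, -1]
features = multi_scale_features(signal, kernel)
```

Each entry of `features` sees the same signal at a different scale; a real model would use learned 2D kernels and combine the per-rate outputs before classification.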
Eyewear Computing – Augmenting the Human with Head-Mounted Wearable Assistants
The seminar was composed of workshops and tutorials on head-mounted eye tracking, egocentric
vision, optics, and head-mounted displays. The seminar welcomed 30 academic and industry
researchers from Europe, the US, and Asia with a diverse background, including wearable and
ubiquitous computing, computer vision, developmental psychology, optics, and human-computer
interaction. In contrast to several previous Dagstuhl seminars, we used an ignite talk format to
reduce the time of talks to one half-day and to leave the rest of the week for hands-on sessions,
group work, general discussions, and socialising. The key results of this seminar are 1) the
identification of key research challenges and summaries of breakout groups on multimodal eyewear
computing, egocentric vision, security and privacy issues, skill augmentation and task guidance,
eyewear computing for gaming, as well as prototyping of VR applications, 2) a list of datasets and
research tools for eyewear computing, 3) three small-scale datasets recorded during the seminar, 4)
an article in ACM Interactions entitled “Eyewear Computers for Human-Computer Interaction”,
as well as 5) two follow-up workshops on “Egocentric Perception, Interaction, and Computing”
at the European Conference on Computer Vision (ECCV) as well as “Eyewear Computing” at
the ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp)
MECCANO: A Multimodal Egocentric Dataset for Humans Behavior Understanding in the Industrial-like Domain
Wearable cameras make it possible to acquire images and videos from the user's
perspective. These data can be processed to understand human behavior. Although
human behavior analysis has been thoroughly investigated in third person
vision, it is still understudied in egocentric settings and in particular in
industrial scenarios. To encourage research in this field, we present MECCANO,
a multimodal dataset of egocentric videos to study human behavior
understanding in industrial-like settings. The multimodality is characterized
by the presence of gaze signals, depth maps and RGB videos acquired
simultaneously with a custom headset. The dataset has been explicitly labeled
for fundamental tasks in the context of human behavior understanding from a
first person view, such as recognizing and anticipating human-object
interactions. With the MECCANO dataset, we explored five different tasks
including 1) Action Recognition, 2) Active Objects Detection and Recognition,
3) Egocentric Human-Objects Interaction Detection, 4) Action Anticipation and
5) Next-Active Objects Detection. We propose a benchmark aimed at studying human
behavior in the considered industrial-like scenario, which demonstrates that the
investigated tasks and the considered scenario are challenging for
state-of-the-art algorithms. To support research in this field, we publicly
release the dataset at https://iplab.dmi.unict.it/MECCANO/.
Topic modelling for routine discovery from egocentric photo-streams
Developing tools to understand and visualize lifestyle is of high interest when addressing the improvement of people's habits and well-being. Routine, defined as the usual things that a person does daily, helps describe an individual's lifestyle. With this paper, we are the first to address the development of novel tools for the automatic discovery of an individual's routine days from their egocentric images. In the proposed model, sequences of images are first characterized by semantic labels detected by pre-trained CNNs. Then, these features are organized into temporal-semantic documents that are later embedded into a topic-model space. Finally, Dynamic Time Warping and Spectral Clustering are used for the final routine/non-routine day discrimination. Moreover, we introduce a new EgoRoutine dataset, a collection of 104 egocentric days with more than 100,000 images recorded by 7 users. Results show that routine can be discovered and behavioural patterns can be observed.
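The Dynamic Time Warping step named in this abstract can be sketched in a few lines. This is a minimal textbook DTW, assuming each day has already been reduced to a sequence of numbers; the function name, the absolute-difference cost, and the toy sequences are assumptions for illustration, not the paper's actual features or distance.

```python
def dtw_distance(a, b, dist=lambda x, y: abs(x - y)):
    """Classic dynamic-programming DTW cost between two sequences.

    cost[i][j] holds the cheapest alignment of a[:i] with b[:j].
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Two hypothetical "days": the second lingers longer on one activity.
day_a = [1, 2, 3]
day_b = [1, 2, 2, 3]
same_routine = dtw_distance(day_a, day_b)      # warping absorbs the repeat
shifted = dtw_distance([1, 2, 3], [2, 3, 4])   # genuinely different values
```

Pairwise distances of this kind could then populate an affinity matrix for a clustering step such as spectral clustering, which is what the abstract describes at a high level.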
Proxemic Flow: Dynamic Peripheral Floor Visualizations for Revealing and Mediating Large Surface Interactions
Interactive large surfaces have recently become commonplace for interactions in public settings. The fact that people can engage with them and the spectrum of possible interactions, however, often remain invisible and can be confusing or ambiguous to passersby. In this paper, we explore the design of dynamic peripheral floor visualizations for revealing and mediating large surface interactions. Extending earlier work on interactive illuminated floors, we introduce a novel approach for leveraging floor displays in a secondary, assisting role to aid users in interacting with the primary display. We illustrate a series of visualizations with the illuminated floor of the Proxemic Flow system. In particular, we contribute a design space for peripheral floor visualizations that (a) provides peripheral information about tracking fidelity with personal halos, (b) makes interaction zones and borders explicit for easy opt-in and opt-out, and (c) gives cues inviting spatial movement or possible next interaction steps through wave, trail, and footstep animations. We demonstrate our proposed techniques in the context of a large surface application and discuss important design considerations for assistive floor visualizations.