3rd International Workshop on Multisensory Approaches to Human-Food Interaction
This is the introduction paper to the third edition of the workshop on 'Multisensory Approaches to Human-Food Interaction', organized at the 20th ACM International Conference on Multimodal Interaction in Boulder, Colorado, on October 16th, 2018. The workshop is a forum where the fast-growing research on Multisensory Human-Food Interaction is presented. Here we summarize the workshop's key objectives and contributions.
EmotiW 2018: Audio-Video, Student Engagement and Group-Level Affect Prediction
This paper details the sixth Emotion Recognition in the Wild (EmotiW) challenge. EmotiW 2018 is a grand challenge at the ACM International Conference on Multimodal Interaction 2018, Colorado, USA. The challenge aims to provide a common platform for researchers in the affective computing community to benchmark their algorithms on 'in the wild' data. This year EmotiW comprises three sub-challenges: a) audio-video based emotion recognition; b) student engagement prediction; and c) group-level emotion recognition. The databases, protocols, and baselines are discussed in detail.
Attention-based Audio-Visual Fusion for Robust Automatic Speech Recognition
Automatic speech recognition can potentially benefit from lip motion patterns, which complement acoustic speech to improve overall recognition performance, particularly in noise. In this paper we propose an audio-visual fusion strategy that goes beyond simple feature concatenation and learns to automatically align the two modalities, leading to enhanced representations that increase recognition accuracy in both clean and noisy conditions. We test our strategy on the TCD-TIMIT and LRS2 datasets, designed for large-vocabulary continuous speech recognition, applying three types of noise at different power ratios. We also exploit state-of-the-art sequence-to-sequence architectures, showing that our method can be easily integrated. Results show relative improvements from 7% up to 30% on TCD-TIMIT over the acoustic modality alone, depending on the acoustic noise level. We anticipate that the fusion strategy can easily generalise to many other multimodal tasks involving correlated modalities. Code available online on GitHub: https://github.com/georgesterpu/Sigmedia-AVSR
Comment: In ICMI'18, October 16-20, 2018, Boulder, CO, USA. Equation (2) corrected in this version.
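The alignment idea above can be illustrated with a minimal sketch: each audio frame attends over the (shorter, differently sampled) video sequence via scaled dot-product attention, and the attention-weighted video features are concatenated with the audio features. This is a hypothetical simplification for illustration, not the paper's exact architecture; the function names, dimensions, and the use of plain NumPy are all assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_fusion(audio, video):
    """Align video frames to audio frames with scaled dot-product
    attention, then concatenate the aligned video features with the
    audio features. Sketch only, not the authors' exact model."""
    d_k = audio.shape[-1]
    scores = audio @ video.T / np.sqrt(d_k)   # (T_audio, T_video) alignment scores
    weights = softmax(scores, axis=-1)        # each audio frame attends over video
    aligned_video = weights @ video           # (T_audio, D) video resampled to audio rate
    return np.concatenate([audio, aligned_video], axis=-1)

rng = np.random.default_rng(0)
audio = rng.standard_normal((50, 64))  # 50 audio frames, 64-dim features (assumed)
video = rng.standard_normal((20, 64))  # 20 video frames, same feature dim (assumed)
fused = attention_fusion(audio, video)
print(fused.shape)  # (50, 128)
```

The key point is that the learned alignment replaces naive frame-rate matching: the fused representation stays at the audio frame rate, so it can feed a standard sequence-to-sequence recognizer unchanged.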
Group Interaction Frontiers in Technology
Over the last decade, the study of group behavior for multimodal interaction technologies has increased. However, despite its potential benefits to society, we believe this area remains under-explored. The aim of this workshop is to create a forum for more interdisciplinary dialogue on this topic and thereby accelerate its growth. The workshop has been very successful in attracting submissions addressing important facets of technologies for analyzing and aiding groups. This paper provides a summary of the workshop's activities and the accepted papers.
I Smell Trouble: Using Multiple Scents To Convey Driving-Relevant Information
Cars provide drivers with task-related information (e.g. "Fill gas") mainly using visual and auditory stimuli. However, those stimuli may distract or overwhelm the driver, causing unnecessary stress. Here, we propose olfactory stimulation as a novel feedback modality to support the perception of visual notifications, reducing the visual demand on the driver. Based on previous research, we explore the application of the scents of lavender, peppermint, and lemon to convey three driving-relevant messages (i.e. "Slow down", "Short inter-vehicle distance", "Lane departure"). Our paper is the first to demonstrate the application of olfactory conditioning in the context of driving and to explore how multiple olfactory notifications change driving behaviour. Our findings demonstrate that olfactory notifications are perceived as less distracting, more comfortable, and more helpful than visual notifications. Drivers also make fewer driving mistakes when exposed to olfactory notifications. We discuss how these findings inform the design of future in-car user interfaces.
Do I Have Your Attention: A Large Scale Engagement Prediction Dataset and Baselines
The degree of concentration, enthusiasm, optimism, and passion displayed by individuals while interacting with a machine is referred to as 'user engagement'. Engagement comprises behavioral, cognitive, and affect-related cues. To create engagement prediction systems that can work in real-world conditions, it is essential to learn from rich, diverse datasets. To this end, we propose EngageNet, a large-scale, multi-faceted engagement-in-the-wild dataset. Thirty-one hours of data from 127 participants are recorded under different illumination conditions. Thorough experiments explore the applicability of different features: action units, eye gaze, head pose, and MARLIN. Data from user interactions (question-answer) are analyzed to understand the relationship between effective learning and user engagement. To further validate the rich nature of the dataset, evaluation is also performed on the EngageWild dataset. The experiments show the usefulness of the proposed dataset. The code, models, and dataset link are publicly available at https://github.com/engagenet/engagenet_baselines.