304 research outputs found
ConfLab: A Rich Multimodal Multisensor Dataset of Free-Standing Social Interactions in the Wild
Recording the dynamics of unscripted human interactions in the wild is
challenging due to the delicate trade-offs between several factors: participant
privacy, ecological validity, data fidelity, and logistical overheads. To
address these, following a 'datasets for the community by the community' ethos,
we propose the Conference Living Lab (ConfLab): a new concept for multimodal
multisensor data collection of in-the-wild free-standing social conversations.
For the first instantiation of ConfLab described here, we organized a real-life
professional networking event at a major international conference. Involving 48
conference attendees, the dataset captures a diverse mix of status,
acquaintance, and networking motivations. Our capture setup improves upon the
data fidelity of prior in-the-wild datasets while retaining privacy
sensitivity: 8 videos (1920x1080, 60 fps) from a non-invasive overhead view,
and custom wearable sensors with onboard recording of body motion (full 9-axis
IMU), privacy-preserving low-frequency audio (1250 Hz), and Bluetooth-based
proximity. Additionally, we developed custom solutions for distributed hardware
synchronization at acquisition, and time-efficient continuous annotation of
body keypoints and actions at high sampling rates. Our benchmarks showcase some
of the open research tasks related to in-the-wild privacy-preserving social
data analysis: keypoints detection from overhead camera views, skeleton-based
no-audio speaker detection, and F-formation detection.
Comment: v2 is the version submitted to the NeurIPS 2022 Datasets and Benchmarks Track.
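One of the benchmark tasks above, F-formation (conversational group) detection, can be illustrated with a naive greedy grouping over position and body orientation: two people join a group when they are close and mutually facing. This is only a toy sketch with hypothetical thresholds and function names (`facing`, `f_formations`), not the ConfLab benchmark method.

```python
import math

def facing(p, q, fov_deg=120.0):
    """True if person p's body orientation (degrees) points toward person q's position."""
    dx, dy = q["x"] - p["x"], q["y"] - p["y"]
    angle_to_q = math.degrees(math.atan2(dy, dx))
    # Smallest signed angular difference between p's heading and the direction to q
    diff = (angle_to_q - p["theta"] + 180.0) % 360.0 - 180.0
    return abs(diff) <= fov_deg / 2.0

def f_formations(people, max_dist=1.5):
    """Greedy grouping: person i joins a group only if close to AND mutually
    facing every current member; otherwise a new singleton group is opened."""
    groups = []
    for i, p in enumerate(people):
        placed = False
        for g in groups:
            if all(math.dist((p["x"], p["y"]), (people[j]["x"], people[j]["y"])) <= max_dist
                   and facing(p, people[j]) and facing(people[j], p)
                   for j in g):
                g.append(i)
                placed = True
                break
        if not placed:
            groups.append([i])
    return groups
```

For example, two people a metre apart and facing each other form one group, while a distant third person ends up alone.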
High reliability Android application for multidevice multimodal mobile data acquisition and annotation
We have completed the collection of one of the richest accurately annotated mobile datasets of modes of transportation and locomotion. To do this, we developed a highly reliable Android application called DataLogger, capable of recording multisensor data from multiple synchronized smartphones simultaneously. The application allows real-time data annotation. We explain how we designed the app to achieve high reliability and ease of use. We also present an evaluation of the application in a big-data collection (750 hours, 950 GB of data, 17 different sensor modalities), analysing the data loss (less than 0.4‰) and battery consumption (≈6% on average per hour). The application is available as open source.
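The two reported reliability metrics are simple rates; a minimal sketch of how they could be computed (function names are ours, not DataLogger's API):

```python
def data_loss_permille(expected_samples, received_samples):
    """Data loss rate in per-mille (the paper reports < 0.4 per-mille)."""
    return 1000.0 * (expected_samples - received_samples) / expected_samples

def battery_per_hour(start_pct, end_pct, duration_hours):
    """Average battery drain in percentage points per hour (the paper reports about 6%/h)."""
    return (start_pct - end_pct) / duration_hours
```

For instance, receiving 999,650 of 1,000,000 expected samples corresponds to 0.35‰ loss, within the reported bound.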
Protocol for PD SENSORS: Parkinson’s Disease Symptom Evaluation in a Naturalistic Setting producing Outcomes measuRes using SPHERE technology. An observational feasibility study of multi-modal multi-sensor technology to measure symptoms and activities of daily living in Parkinson’s disease
Introduction: The impact of disease-modifying agents on disease progression in Parkinson’s disease is largely assessed in clinical trials using clinical rating scales. These scales have drawbacks in terms of their ability to capture the fluctuating nature of symptoms while living in a naturalistic environment. The SPHERE (Sensor Platform for HEalthcare in a Residential Environment) project has designed a multi-sensor platform with multimodal devices to allow continuous, relatively inexpensive, unobtrusive sensing of motor, non-motor and activities of daily living metrics in a home or a home-like environment. The aim of this study is to evaluate how the SPHERE technology can measure aspects of Parkinson’s disease.

Methods and analysis: This is a small-scale feasibility and acceptability study during which 12 pairs of participants (comprising a person with Parkinson’s and a healthy control participant) will stay and live freely for 5 days in a home-like environment embedded with SPHERE technology, including environmental, appliance monitoring, wrist-worn accelerometry and camera sensors. These data will be collected alongside clinical rating scales, participant diary entries and expert clinician annotations of colour video images. Machine learning will be used to look for a signal to discriminate between Parkinson’s disease and control, and between Parkinson’s disease symptoms ‘on’ and ‘off’ medications. Additional outcome measures, including bradykinesia, activity level, sleep parameters and some activities of daily living, will be explored. Acceptability of the technology will be evaluated qualitatively using semi-structured interviews.

Ethics and dissemination: Ethical approval has been given to commence this study; the results will be disseminated as widely as appropriate.
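The ‘on’/‘off’ discrimination from wrist-worn accelerometry could, in the simplest case, start from movement variability per time window, since bradykinesia suppresses movement amplitude. The sketch below is a toy proxy with a hypothetical threshold, not the machine-learning pipeline the study proposes:

```python
import statistics

def windowed_activity(acc_magnitudes, window=100):
    """Per-window population standard deviation of accelerometer magnitude,
    a crude proxy for movement variability / bradykinesia."""
    return [statistics.pstdev(acc_magnitudes[i:i + window])
            for i in range(0, len(acc_magnitudes) - window + 1, window)]

def label_state(activity, threshold=0.2):
    """Hypothetical rule: very low variability flags a possible bradykinetic ('off') window."""
    return ["off" if a < threshold else "on" for a in activity]
```

A real system would feed such window features into a trained classifier rather than a fixed threshold.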
Real-Time Management of Multimodal Streaming Data for Monitoring of Epileptic Patients
This is the Accepted Manuscript version of the following article: I. Mporas, D. Triantafyllopoulos, V. Megalooikonomou, “Real-Time Management of Multimodal Streaming Data for Monitoring of Epileptic Patients”, Journal of Medical Systems, Vol. 40(45), December 2015. The final published version is available at: https://link.springer.com/article/10.1007%2Fs10916-015-0403-3 © Springer Science+Business Media New York 2015.

A new generation of healthcare is represented by wearable health monitoring systems, which provide real-time monitoring of a patient’s physiological parameters. It is expected that continuous ambulatory monitoring of vital signals will improve treatment of patients and enable proactive personal health management. In this paper, we present the implementation of a multimodal real-time system for epilepsy management. The proposed methodology is based on a data streaming architecture and efficient management of a big flow of physiological parameters. The performance of this architecture is examined for varying spatial resolution of the recorded data. Peer reviewed; Final Accepted Version.
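A core ingredient of any such streaming architecture is a bounded per-channel buffer that keeps latency in check when consumers fall behind. The class below is an illustrative sketch (names and the downsampling knob are ours, not the paper's implementation):

```python
from collections import deque

class StreamBuffer:
    """Bounded FIFO for one physiological channel: when the consumer falls
    behind, the oldest samples are silently dropped, bounding memory and latency."""

    def __init__(self, maxlen=1024):
        self.buf = deque(maxlen=maxlen)  # deque discards from the left at capacity

    def push(self, sample):
        self.buf.append(sample)

    def drain(self, downsample=1):
        """Return and clear buffered samples, optionally keeping every n-th one
        to trade resolution for throughput."""
        out = list(self.buf)[::downsample]
        self.buf.clear()
        return out
```

With `maxlen=4`, pushing six samples leaves only the last four; draining with `downsample=2` halves the resolution of what is handed to the analysis stage.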
The Multimodal Tutor: Adaptive Feedback from Multimodal Experiences
This doctoral thesis describes the journey of ideation, prototyping and empirical testing of the Multimodal Tutor, a system designed to provide digital feedback that supports psychomotor skill acquisition using learning and multimodal data capture. The feedback is given in real time with machine-driven assessment of the learner's task execution. The predictions are tailored by supervised machine learning models trained with human-annotated samples. The main contributions of this thesis are: a literature survey on multimodal data for learning, a conceptual model (the Multimodal Learning Analytics Model), a technological framework (the Multimodal Pipeline), a data annotation tool (the Visual Inspection Tool) and a case study in Cardiopulmonary Resuscitation training (CPR Tutor). The CPR Tutor generates real-time, adaptive feedback using kinematic and myographic data and neural networks.
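One concrete piece of CPR feedback from kinematic data is compression rate, for which 100-120 compressions per minute is the standard guideline band. The sketch below uses naive peak counting on a chest-depth signal; it is an illustration of the feedback idea, not the thesis's actual neural-network pipeline:

```python
import math

def compression_rate(depth_signal, fs):
    """Compressions per minute from a chest-depth signal via simple local-maximum counting."""
    peaks = sum(1 for i in range(1, len(depth_signal) - 1)
                if depth_signal[i] > depth_signal[i - 1]
                and depth_signal[i] >= depth_signal[i + 1])
    minutes = len(depth_signal) / fs / 60.0
    return peaks / minutes

def feedback(rate):
    """Map a measured rate onto the 100-120/min guideline band."""
    if rate < 100:
        return "push faster"
    if rate > 120:
        return "push slower"
    return "good rate"
```

A synthetic sinusoidal "compression" signal at 110 cycles per minute yields "good rate", while 90 or 130 trigger corrective prompts.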
End-to-end Audiovisual Speech Activity Detection with Bimodal Recurrent Neural Models
Speech activity detection (SAD) plays an important role in current speech
processing systems, including automatic speech recognition (ASR). SAD is
particularly difficult in environments with acoustic noise. A practical
solution is to incorporate visual information, increasing the robustness of the
SAD approach. An audiovisual system has the advantage of being robust to
different speech modes (e.g., whisper speech) or background noise. Recent
advances in audiovisual speech processing using deep learning have opened
opportunities to capture in a principled way the temporal relationships between
acoustic and visual features. This study explores this idea proposing a
\emph{bimodal recurrent neural network} (BRNN) framework for SAD. The approach
models the temporal dynamic of the sequential audiovisual data, improving the
accuracy and robustness of the proposed SAD system. Instead of estimating
hand-crafted features, the study investigates an end-to-end training approach,
where acoustic and visual features are directly learned from the raw data
during training. The experimental evaluation considers a large audiovisual
corpus with over 60.8 hours of recordings, collected from 105 speakers. The
results demonstrate that the proposed framework yields absolute improvements of
up to 1.2% under practical scenarios over an audio-only voice activity
detection (VAD) baseline implemented with a deep neural network (DNN). The
proposed approach achieves a 92.7% F1-score when evaluated using the sensors of
a portable tablet in a noisy acoustic environment, only 1.0% lower than the
performance obtained under ideal conditions (e.g., clean speech captured with a
high-definition camera and a close-talking microphone).
Comment: Submitted to Speech Communication.
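The bimodal fusion idea — one recurrent branch per modality, hidden states fused into a single speech probability — can be reduced to a scalar-feature toy. The weights below are hand-picked for illustration; the paper's BRNN learns features end-to-end from raw audio and video, which this sketch does not attempt:

```python
import math

def rnn_branch(seq, wx, wh):
    """Minimal Elman recurrence over a scalar feature sequence:
    h_t = tanh(wx * x_t + wh * h_{t-1}), starting from h_0 = 0."""
    h = 0.0
    for x in seq:
        h = math.tanh(wx * x + wh * h)
    return h

def bimodal_sad(audio, video, p):
    """Run one recurrent branch per modality, fuse the final hidden states
    linearly, and squash to a speech-activity probability."""
    ha = rnn_branch(audio, p["wxa"], p["wha"])
    hv = rnn_branch(video, p["wxv"], p["whv"])
    z = p["wa"] * ha + p["wv"] * hv + p["b"]
    return 1.0 / (1.0 + math.exp(-z))
```

With positive weights, a high-energy sequence in both modalities produces a higher speech probability than silence, which is the qualitative behaviour the trained model exploits.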
A transparent framework towards the context-sensitive recognition of conversational engagement
Modelling and recognising affective and mental user states is a pressing topic in multiple research fields. This work suggests an approach to the recognition of such states by combining state-of-the-art behaviour recognition classifiers in a transparent and explainable modelling framework that also makes it possible to consider contextual aspects in the inference process. More precisely, in this paper we exemplify the idea of our framework with the recognition of conversational engagement in bi-directional conversations. We introduce a multi-modal annotation scheme for conversational engagement. We further introduce our hybrid approach that combines the accuracy of state-of-the-art machine learning techniques, such as deep learning, with the capabilities of Bayesian networks, which are inherently interpretable and offer an important capability that modern approaches lack: causal inference. In an evaluation on a large multi-modal corpus of bi-directional conversations, we show that this hybrid approach can even outperform state-of-the-art black-box approaches by considering context information and causal relations.
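The hybrid idea — a black-box classifier's score reweighted by a context-dependent prior — can be reduced to a two-node toy via Bayes' rule. The numbers and the reading of the score as a likelihood are illustrative assumptions, not the authors' actual Bayesian network:

```python
def posterior_engaged(classifier_score, context_prior):
    """Bayes' rule over the binary variable 'engaged': the black-box score is
    read as the likelihood P(evidence | engaged), and the prior encodes
    context (e.g., whether the interlocutor just asked a question)."""
    num = classifier_score * context_prior
    den = num + (1.0 - classifier_score) * (1.0 - context_prior)
    return num / den
```

With a flat prior of 0.5 the classifier score passes through unchanged, while a skeptical context prior of 0.2 pulls the same 0.7 score below 0.5 — exactly the kind of context-driven correction the framework argues for, in miniature.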
Toward Emotion Recognition From Physiological Signals in the Wild: Approaching the Methodological Issues in Real-Life Data Collection
Emotion, mood, and stress recognition (EMSR) has been studied in laboratory settings for decades. In particular, physiological signals are widely used to detect and classify affective states in lab conditions. However, physiological reactions to emotional stimuli have been found to differ between laboratory and natural settings. Thanks to recent technological progress (e.g., in wearables), the creation of EMSR systems for a large number of consumers during their everyday activities is increasingly possible. Therefore, datasets created in the wild are needed to ensure the validity and the exploitability of EMSR models for real-life applications. In this paper, we initially present common techniques used in laboratory settings to induce emotions for the purpose of physiological dataset creation. Next, advantages and challenges of data collection in the wild are discussed. To assess the applicability of existing datasets to real-life applications, we propose a set of categories to guide and compare at a glance the different methodologies used by researchers to collect such data. For this purpose, we also introduce a visual tool called Graphical Assessment of Real-life Application-Focused Emotional Dataset (GARAFED). In the last part of the paper, we apply the proposed tool to compare existing physiological datasets for EMSR in the wild and to show possible improvements and future directions of research. We wish for this paper and GARAFED to be used as guidelines for researchers and developers who aim to collect affect-related data for real-life EMSR-based applications.