
3 research outputs found

World futures through RT’s eyes: multimodal dataset and interdisciplinary methodology

Authors: Burenko Ilya; Pavlova Irina; Payne Elinor; Uhrig Peter; Wilson Anna
Publication venue: Frontiers Media
Publication date: 24/04/2024

There is a need to develop new interdisciplinary approaches suitable for a more complete analysis of multimodal data. Such approaches need to go beyond case studies and leverage technology to allow for statistically valid analysis of the data. Our study addresses this need by engaging with the research question of how humans communicate about the future for persuasive and manipulative purposes, and how they do this multimodally. It introduces a new methodology for computer-assisted multimodal analysis of video data. The study also introduces the resulting dataset, featuring annotations for speech (textual and acoustic modalities) and for gesticulation and corporal behaviour (visual modality). To analyse and annotate the data and develop the methodology, the study engages with 23 episodes, each 26 minutes long, of the show ‘SophieCo Visionaries’, broadcast by RT (formerly ‘Russia Today’).

Oxford University Research Archive

Project Achoo: A Practical Model and Application for COVID-19 Detection from Recordings of Breath, Voice, and Cough

Authors: Avetisian Manvel; Burenko Ilya; Kokh Vladimir; Malkin Elian; Nazarov Ivan; Ponomarchuk Alexander; Zhukov Leonid
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 10/01/2022

The COVID-19 pandemic created significant interest in, and demand for, infection detection and monitoring solutions. In this paper we propose a machine learning method to quickly triage COVID-19 using recordings made on consumer devices. The approach combines signal processing methods with fine-tuned deep learning networks and provides methods for signal denoising, cough detection and classification. We have also developed and deployed a mobile application that uses a symptom checker together with voice, breath and cough signals to detect COVID-19 infection. The application showed robust performance both on open-source datasets and on the noisy data collected from end users during beta testing.

arXiv.org e-Print Archive

Co-Speech Gesture Detection through Multi-phase Sequence Labeling

Authors: Burenko Ilya; Fernández Raquel; Ghaleb Esam; Holler Judith; Pouw Wim; Rasenberg Marlou; Toni Ivan; Uhrig Peter; Özyürek Aslı
Publication date: 21/08/2023

Gestures are integral components of face-to-face communication. They unfold over time, often following predictable movement phases of preparation, stroke, and retraction. Yet, the prevalent approach to automatic gesture detection treats the problem as binary classification, classifying a segment as either containing a gesture or not, thus failing to capture its inherently sequential and contextual nature. To address this, we introduce a novel framework that reframes the task as a multi-phase sequence labeling problem rather than binary classification. Our model processes sequences of skeletal movements over time windows, uses Transformer encoders to learn contextual embeddings, and leverages Conditional Random Fields to perform sequence labeling. We evaluate our proposal on a large dataset of diverse co-speech gestures in task-oriented face-to-face dialogues. The results consistently demonstrate that our method significantly outperforms strong baseline models in detecting gesture strokes. Furthermore, applying Transformer encoders to learn contextual embeddings from movement sequences substantially improves gesture unit detection. These results highlight our framework's capacity to capture the fine-grained dynamics of co-speech gesture phases, paving the way for more nuanced and accurate gesture detection and analysis.

arXiv.org e-Print Archive
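
Illustrative note on the last entry: the abstract contrasts per-segment binary classification with sequence labeling over the phases preparation, stroke, and retraction. The sketch below is not the authors' implementation. It is a minimal Viterbi decoder of the kind a linear-chain Conditional Random Field uses at inference time. The label set, the allowed phase transitions, and all per-frame scores are assumptions made up for illustration; in the described model the per-frame scores would come from the Transformer encoder.

```python
from typing import Dict, List

# Hypothetical label set: the three phases named in the abstract plus "outside".
LABELS = ["outside", "preparation", "stroke", "retraction"]

# Assumed transition constraints (illustrative only, not taken from the paper).
ALLOWED = {
    ("outside", "outside"), ("outside", "preparation"),
    ("preparation", "preparation"), ("preparation", "stroke"),
    ("stroke", "stroke"), ("stroke", "retraction"),
    ("retraction", "retraction"), ("retraction", "outside"),
}
FORBIDDEN = -1e9  # large negative score standing in for "not allowed"


def transition_score(prev: str, cur: str) -> float:
    """Score 0 for an allowed phase transition, a large penalty otherwise."""
    return 0.0 if (prev, cur) in ALLOWED else FORBIDDEN


def viterbi_decode(emissions: List[Dict[str, float]]) -> List[str]:
    """Return the highest-scoring label sequence for per-frame label scores."""
    if not emissions:
        return []
    best = {lab: emissions[0][lab] for lab in LABELS}
    backpointers: List[Dict[str, str]] = []
    for frame_scores in emissions[1:]:
        new_best: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in LABELS:
            # Best previous label for reaching `cur` at this frame.
            prev = max(LABELS, key=lambda p: best[p] + transition_score(p, cur))
            new_best[cur] = best[prev] + transition_score(prev, cur) + frame_scores[cur]
            pointers[cur] = prev
        best = new_best
        backpointers.append(pointers)
    # Trace the best path backwards from the best final label.
    path = [max(LABELS, key=lambda lab: best[lab])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def frame(o: float, p: float, s: float, r: float) -> Dict[str, float]:
    """Build one frame of made-up scores in LABELS order."""
    return dict(zip(LABELS, (o, p, s, r)))


# Made-up scores; frame 2 is noisy and slightly prefers "retraction".
EMISSIONS = [
    frame(2.0, 0.5, 0.1, 0.1),
    frame(0.3, 2.0, 0.4, 0.1),
    frame(0.1, 0.2, 1.0, 1.2),
    frame(0.1, 0.1, 2.0, 0.3),
    frame(0.2, 0.1, 0.3, 2.0),
    frame(2.0, 0.1, 0.1, 0.1),
]

framewise = [max(LABELS, key=lambda lab: f[lab]) for f in EMISSIONS]
decoded = viterbi_decode(EMISSIONS)
print("frame-wise:", framewise)
print("sequence:  ", decoded)
```

With these made-up scores, picking the best label per frame yields a retraction directly after a preparation, which the assumed constraints rule out. Decoding the whole sequence jointly instead labels the noisy frame as a stroke, which illustrates why a sequence model can capture phase structure that independent per-segment decisions miss.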