117 research outputs found

    A pipeline for the creation of multimodal corpora from YouTube videos

    This paper introduces an open-source pipeline for the creation of multimodal corpora from YouTube videos. It minimizes storage and bandwidth requirements because the videos themselves need not be downloaded and can remain on YouTube’s servers, and it minimizes processing requirements by using YouTube’s automatically generated subtitles, thus avoiding a computationally expensive automatic speech recognition step. The pipeline combines standard tools and outputs a corpus file in the industry-standard vertical format used by many corpus managers. It is straightforwardly extensible with further levels of annotation and can be adapted to languages other than English.
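    As a rough illustration of the idea, the sketch below pulls a video's automatically generated captions and emits a vertical-format file (one token per line, XML-like structural tags). The youtube-transcript-api package and the naive whitespace tokenisation are assumptions of the sketch, not the paper's documented tooling.

```python
# Sketch: fetch YouTube's auto-generated captions and write them out in
# vertical format. youtube-transcript-api and get_transcript() are
# assumptions for illustration, not necessarily the paper's actual tools.
from youtube_transcript_api import YouTubeTranscriptApi

def video_to_vertical(video_id: str) -> str:
    # Only caption text and timestamps are downloaded; the video itself
    # stays on YouTube's servers.
    segments = YouTubeTranscriptApi.get_transcript(video_id, languages=["en"])
    lines = [f'<doc id="{video_id}">']
    for seg in segments:
        lines.append(f'<s start="{seg["start"]:.2f}">')  # one <s> per caption
        for token in seg["text"].split():                # naive tokenisation
            lines.append(token)
        lines.append("</s>")
    lines.append("</doc>")
    return "\n".join(lines)

if __name__ == "__main__":
    print(video_to_vertical("dQw4w9WgXcQ"))  # any public video ID with captions
```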

    Evaluation of Efficiency-Enhancing Measures Using Optimization Algorithms for Fuel Cell Vehicles

    Efficiency-enhancing measures are evaluated for a series hybrid fuel cell vehicle over a drive cycle. The powertrain under consideration consists of a fuel cell system, battery, DC-DC converter, inverter, and electrical machine. Within the fuel cell system, the air supply is the largest parasitic load; to minimize this dissipation, different air-compression architectures are optimized with a scaling algorithm and compared. Phase switching reduces DC-DC converter losses, and a variable DC-link voltage increases the efficiency of the electrical machine and inverter. Dynamic Programming (DP) is used to evaluate these measures, with the DP formulation extended by the start-up and shut-down energy of the fuel cell system to model realistic cycle consumption. Taken together, these efficiency-enhancing measures reduce the energy consumption of the series hybrid fuel cell vehicle by 6.4% over the drive cycle.
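    To make the DP evaluation concrete, here is a minimal backward-recursion sketch over a battery state-of-charge grid, including a start-up energy penalty of the kind described above. Every number (cycle demand, grid sizes, efficiencies, penalties) is an illustrative assumption, not a value from the paper.

```python
# Minimal DP sketch for a series hybrid energy split with a fuel cell
# start-up penalty. All parameters are illustrative assumptions.
import numpy as np

P_DEMAND = np.array([5.0, 20.0, 35.0, 10.0, 0.0, 25.0])  # kW demand per step
FC_LEVELS = np.array([0.0, 10.0, 20.0, 30.0])            # kW fuel cell setpoints
SOC_GRID = np.linspace(0.3, 0.8, 51)                     # state-of-charge grid
DT = 0.05          # step length in hours (3 min, coarse for readability)
CAP_KWH = 5.0      # battery capacity
FC_EFF = 0.5       # hydrogen-to-electric efficiency
STARTUP_KWH = 0.3  # penalty when the fuel cell switches on

T = len(P_DEMAND)
STEP = SOC_GRID[1] - SOC_GRID[0]
# cost[t, i, on] = minimal hydrogen energy from step t onward, given
# SOC_GRID[i] and whether the fuel cell is currently on.
cost = np.zeros((T + 1, len(SOC_GRID), 2))
for t in range(T - 1, -1, -1):
    for i, soc in enumerate(SOC_GRID):
        for was_on in (0, 1):
            best = np.inf
            for p_fc in FC_LEVELS:
                on = int(p_fc > 0)
                p_batt = P_DEMAND[t] - p_fc           # battery covers the rest
                soc_next = soc - p_batt * DT / CAP_KWH
                if not (SOC_GRID[0] <= soc_next <= SOC_GRID[-1]):
                    continue                          # infeasible transition
                j = int(round((soc_next - SOC_GRID[0]) / STEP))
                h2 = p_fc * DT / FC_EFF               # hydrogen energy spent
                if on and not was_on:
                    h2 += STARTUP_KWH                 # start-up extension
                best = min(best, h2 + cost[t + 1, j, on])
            cost[t, i, was_on] = best

print(f"Minimal H2 energy from SOC 0.55: {cost[0, 25, 0]:.2f} kWh")
```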

    The development and implementation of a coding scheme to analyse interview dynamics in the British Household Panel Survey

    The study of interviewer-respondent interaction during an interview can give very useful insights into the cognitive process of answering questions, the social dynamics that develop in an interview context, and the way these dynamics ultimately affect data quality. Behaviour coding is a technique used to code such interactions. Despite its long-standing use, little has been written about the procedures to follow when developing a coding scheme. This paper provides a practical background on the development and implementation of the behaviour coding scheme adopted to explore interview dynamics in the framework of dependent interviewing. The scheme was used to code approximately 150 previously transcribed interviews from the British Household Panel Survey Wave 16 pilot. Coding strategies and procedures, coder recruitment and training, reliability assessments, as well as timetable and costs, are documented and discussed.
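    For the reliability assessments mentioned above, a standard check is Cohen's kappa between two coders. The sketch below computes it; the behaviour codes are invented purely for illustration and are not the paper's actual coding scheme.

```python
# Sketch: Cohen's kappa, a chance-corrected agreement measure used in
# coder reliability assessments. Codes below are hypothetical.
from collections import Counter

def cohens_kappa(codes_a, codes_b):
    """Agreement between two coders, corrected for chance agreement."""
    assert len(codes_a) == len(codes_b)
    n = len(codes_a)
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n**2
    return (observed - expected) / (1 - expected)

coder1 = ["AQ", "AA", "PR", "AA", "DI", "AA"]  # hypothetical behaviour codes
coder2 = ["AQ", "AA", "PR", "DI", "DI", "AA"]
print(f"kappa = {cohens_kappa(coder1, coder2):.2f}")
```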

    World futures through RT’s eyes: multimodal dataset and interdisciplinary methodology

    There is a need to develop new interdisciplinary approaches suitable for a more complete analysis of multimodal data. Such approaches need to go beyond case studies and leverage technology to allow for statistically valid analysis of the data. Our study addresses this need by engaging with the research question of how humans communicate about the future for persuasive and manipulative purposes, and how they do so multimodally. It introduces a new methodology for computer-assisted multimodal analysis of video data, along with the resulting dataset, featuring annotations for speech (textual and acoustic modalities) and for gesticulation and corporeal behaviour (visual modality). To analyse and annotate the data and develop the methodology, the study draws on 23 twenty-six-minute episodes of the show ‘SophieCo Visionaries’, broadcast by RT (formerly ‘Russia Today’).
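    A dataset of this kind implies time-aligned annotation tiers per modality. As a loose sketch (tier names and fields are assumptions, not the published schema), one might represent and query the tiers like this:

```python
# Sketch of a tiered, time-aligned annotation structure and a query for
# cross-modal co-occurrence. Field and tier names are assumptions.
from dataclasses import dataclass

@dataclass
class Annotation:
    tier: str      # e.g. "text", "prosody", "gesture"
    start: float   # seconds into the episode
    end: float
    label: str

def co_occurring(annotations, tier_a, tier_b):
    """Pairs of annotations from two tiers whose time spans overlap."""
    a_items = [a for a in annotations if a.tier == tier_a]
    b_items = [b for b in annotations if b.tier == tier_b]
    return [(a, b) for a in a_items for b in b_items
            if a.start < b.end and b.start < a.end]

anns = [Annotation("text", 12.0, 13.1, "will increase"),
        Annotation("gesture", 12.3, 13.0, "right-hand upward stroke")]
print(co_occurring(anns, "text", "gesture"))
```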

    Gesture retrieval and its application to the study of multimodal communication

    Comprehending communication depends on analyzing the different modalities of conversation, including audio, visual, and others. This is a natural process for humans, but in digital libraries, where preservation and dissemination of digital information are crucial, it is a complex task. A rich conversational model, encompassing all modalities and their co-occurrences, is required to effectively analyze and interact with digital information. Currently, the analysis of co-speech gestures in videos is done through manual annotation by linguistic experts based on textual searches, an approach that is limited and does not fully exploit the visual modality of gestures. This paper proposes a visual gesture retrieval method using a deep learning architecture to extend current research in this area. The method is based on body keypoints and uses an attention mechanism to focus on specific keypoint groups. Experiments were conducted on a subset of the NewsScape dataset, which presents challenges such as multiple people in frame, changes of camera perspective, and occlusions. A user study assessed the usability of the results, establishing a baseline for future gesture retrieval methods on real-world video collections. The results demonstrate the high potential of the proposed method for multimodal communication research and highlight the significance of visual gesture retrieval in enhancing interaction with video content. Integrating visual similarity search for gestures into the open-source multimedia retrieval stack vitrivr can contribute substantially to computational linguistics. This research advances the understanding of the role of the visual modality in co-speech gestures and highlights the need for further development in this area.
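    As a rough sketch of the retrieval idea (keypoint-based embeddings with attention, ranked by similarity), the following PyTorch fragment shows one plausible shape. The layer sizes, keypoint count, and mean pooling are assumptions, not the paper's actual architecture.

```python
# Sketch: embed pose keypoints with self-attention and rank stored
# gestures by cosine similarity. Dimensions are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GestureEncoder(nn.Module):
    def __init__(self, n_keypoints=17, d_model=64):
        super().__init__()
        self.proj = nn.Linear(2, d_model)             # (x, y) per keypoint
        self.attn = nn.MultiheadAttention(d_model, num_heads=4,
                                          batch_first=True)

    def forward(self, pose):                          # pose: (B, K, 2)
        x = self.proj(pose)                           # (B, K, d_model)
        x, _ = self.attn(x, x, x)                     # attend across keypoints
        return F.normalize(x.mean(dim=1), dim=-1)     # one unit vector/pose

encoder = GestureEncoder()
query = encoder(torch.randn(1, 17, 2))                # query gesture pose
index = encoder(torch.randn(100, 17, 2))              # stored gesture poses
scores = query @ index.T                              # cosine similarities
print("best match:", scores.argmax().item())
```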

    The Role of Email Communications in Determining Response Rates and Mode of Participation in a Mixed-mode Design

    This article is concerned with the extent to which the propensity to participate in a web/face-to-face sequential mixed-mode survey is influenced by the ability to communicate with sample members by email in addition to mail. Researchers may be able to collect email addresses for sample members and subsequently use them to send survey invitations and reminders, but there is little evidence regarding the value of doing so. This makes it difficult to decide what effort should be made to collect such information and how to use it efficiently. Using evidence from a randomized experiment within a large mixed-mode national survey, we find that using a respondent-supplied email address to send additional survey invitations and reminders does not affect the survey response rate but is associated with an increased proportion of responses by web rather than face to face and, hence, with lower survey costs.
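    The headline finding (no effect on response rate) rests on comparing response proportions between experimental arms. A textbook two-proportion z-test of the kind involved is sketched below, with invented counts; it is not the paper's analysis code.

```python
# Sketch: two-proportion z-test comparing response rates between two
# experimental arms. Counts are hypothetical.
from math import sqrt
from statistics import NormalDist

def two_prop_ztest(x1, n1, x2, n2):
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)                       # pooled proportion
    se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, 2 * (1 - NormalDist().cdf(abs(z)))    # two-sided p-value

# email-invitation arm vs mail-only arm (hypothetical counts)
z, p = two_prop_ztest(x1=612, n1=1000, x2=598, n2=1000)
print(f"z = {z:.2f}, p = {p:.3f}")
```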

    Co-Speech Gesture Detection through Multi-phase Sequence Labeling

    Gestures are integral components of face-to-face communication. They unfold over time, often following predictable movement phases of preparation, stroke, and retraction. Yet the prevalent approach to automatic gesture detection treats the problem as binary classification, labeling a segment as either containing a gesture or not, and thus fails to capture gestures' inherently sequential and contextual nature. To address this, we introduce a novel framework that reframes the task as a multi-phase sequence labeling problem rather than binary classification. Our model processes sequences of skeletal movements over time windows, uses Transformer encoders to learn contextual embeddings, and leverages Conditional Random Fields to perform sequence labeling. We evaluate the proposal on a large dataset of diverse co-speech gestures in task-oriented face-to-face dialogues. The results consistently demonstrate that our method significantly outperforms strong baseline models in detecting gesture strokes, and that applying Transformer encoders to learn contextual embeddings from movement sequences substantially improves gesture-unit detection. These results highlight the framework's capacity to capture the fine-grained dynamics of co-speech gesture phases, paving the way for more nuanced and accurate gesture detection and analysis.
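    To make the pipeline shape concrete, here is a minimal sketch of the encoder-plus-sequence-decoder idea: a Transformer encoder emits per-frame scores over phase labels, and Viterbi decoding with a learned transition matrix stands in for the CRF layer. The feature dimension, layer sizes, and the exact label inventory are assumptions, not the paper's model.

```python
# Sketch: multi-phase sequence labeling of skeletal movement frames.
# Transformer encoder -> per-frame emission scores -> Viterbi decoding
# with a transition matrix (a stand-in for the CRF). Sizes are assumed.
import torch
import torch.nn as nn

LABELS = ["outside", "preparation", "stroke", "retraction"]

class PhaseLabeler(nn.Module):
    def __init__(self, n_features=54, d_model=64, n_labels=len(LABELS)):
        super().__init__()
        self.proj = nn.Linear(n_features, d_model)    # skeletal features/frame
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.emit = nn.Linear(d_model, n_labels)      # per-frame label scores
        self.trans = nn.Parameter(torch.zeros(n_labels, n_labels))

    def viterbi(self, emissions):                     # emissions: (T, L)
        score = emissions[0]
        back = []
        for t in range(1, emissions.shape[0]):
            # total[i, j] = score of being in i at t-1 and moving to j at t
            total = score.unsqueeze(1) + self.trans + emissions[t]
            score, idx = total.max(dim=0)
            back.append(idx)
        path = [int(score.argmax())]
        for idx in reversed(back):                    # trace best path back
            path.append(int(idx[path[-1]]))
        return [LABELS[i] for i in reversed(path)]

    def forward(self, frames):                        # frames: (1, T, F)
        h = self.encoder(self.proj(frames))
        return self.viterbi(self.emit(h)[0])

model = PhaseLabeler()
print(model(torch.randn(1, 30, 54)))                  # 30 frames -> 30 labels
```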

    Studying time conceptualisation via speech, prosody, and hand gesture: interweaving manual and computational methods of analysis

    This paper presents a new interdisciplinary methodology for the analysis of future conceptualisations in big, messy media data. More specifically, it focuses on depictions of post-Covid futures by RT during the pandemic, i.e. on data which are of interest not only from the perspective of academic research but also from that of policy engagement. The methodology has been developed to support the scaling up of fine-grained, data-driven analysis of discourse utterances larger than individual lexical units and centred around ‘will’ + the infinitive. It relies on the genuine integration of manual analytical and computational methods and tools in researching three modalities: textual, prosodic, and gestural. The paper describes the process of building a computational infrastructure for the collection and processing of video data, which aims to empower the manual analysis, and shows how manual analysis can in turn motivate the development of computational tools. Individual computational tools are presented to demonstrate how the combination of human and machine approaches to analysis can reveal new manifestations of cohesion between gesture and prosody. To illustrate the latter, the paper shows how the boundaries of prosodic units can help determine the boundaries of gestural units for future conceptualisations.
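    The final point, using prosodic boundaries to help delimit gestural units, can be illustrated with a toy alignment step: snap each candidate gesture boundary to the nearest prosodic boundary if one lies within a tolerance. The times and the 0.2 s tolerance below are invented for illustration.

```python
# Sketch: snap candidate gesture-unit boundaries to nearby prosodic
# boundaries. All times and the tolerance are illustrative assumptions.
def snap_to_prosody(gesture_bounds, prosodic_bounds, tol=0.2):
    snapped = []
    for g in gesture_bounds:
        nearest = min(prosodic_bounds, key=lambda p: abs(p - g))
        snapped.append(nearest if abs(nearest - g) <= tol else g)
    return snapped

prosodic = [1.20, 2.85, 4.10, 5.60]         # prosodic unit boundaries (s)
gestural = [1.33, 2.70, 4.55]               # candidate gesture boundaries (s)
print(snap_to_prosody(gestural, prosodic))  # -> [1.2, 2.85, 4.55]
```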

    Analysis of continuous neuronal activity evoked by natural speech with computational corpus linguistics methods

    In the field of the neurobiology of language, neuroimaging studies are generally based on stimulation paradigms consisting of at least two different conditions. Designing such paradigms can be very time-consuming, and this traditional approach is necessarily data-limited. In contrast, analyses in computational and corpus linguistics are often based on large text corpora, which allow a vast variety of hypotheses to be tested by repeatedly re-evaluating the same data set, and which also permit exploratory data analysis for generating new hypotheses. Drawing on the advantages of both fields, neuroimaging and computational corpus linguistics, we here present a unified approach that combines continuous natural speech and MEG to generate a corpus of speech-evoked neuronal activity.
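    The core alignment step in such a design is cutting the continuous recording into word-evoked epochs at onsets taken from the time-aligned corpus. A synthetic-data sketch of that step (signal, sampling rate, and onsets are placeholders, not the study's data):

```python
# Sketch: epoch a continuous neural signal at word onsets drawn from a
# time-aligned corpus. All data here are synthetic placeholders.
import numpy as np

FS = 1000                                  # sampling rate (Hz)
signal = np.random.randn(2, 60 * FS)       # 2 channels, 60 s of "MEG"
word_onsets = [1.52, 2.10, 3.47, 5.03]     # seconds, from the aligned corpus

def epochs_at(signal, onsets, fs, pre=0.2, post=0.8):
    """Cut fixed windows around onsets: (n_words, channels, samples)."""
    out = []
    for t in onsets:
        a, b = int((t - pre) * fs), int((t + post) * fs)
        if 0 <= a and b <= signal.shape[1]:
            out.append(signal[:, a:b])
    return np.stack(out)

erp = epochs_at(signal, word_onsets, FS).mean(axis=0)  # word-evoked average
print(erp.shape)                                       # (2, 1000)
```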