Correlating Visual Speaker Gestures with Measures of Audience Engagement to Aid Video Browsing

Author: Zhang John
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2013

In this thesis, we argue that in the domains of educational lectures and political debates, speaker gestures can be a source of semantic cues for video browsing. We hypothesize that certain human gestures, which can be automatically identified through computer vision techniques, convey significant information that is correlated with audience engagement. We present a joint-angle descriptor derived from an automatic upper-body pose estimation framework to train an SVM that identifies point and spread poses in extracted video frames of an instructor giving a lecture. Ground truth is collected in the form of 2500 manually annotated frames covering 20 minutes of a video lecture. Cross-validation on the ground-truth data showed classifier F-scores of 0.54 and 0.39 for point and spread poses, respectively. We also derive an attribute for gestures which measures the angular variance of the arm movements from this system (analogous to arm waving). We present a method for tracking hands which succeeds even when left and right hands are clasping and occluding each other. We evaluate on a ground-truth dataset of 698 images with 1301 annotated left and right hands, mostly clasped. Our method performs better than the baseline on recall (0.66 vs. 0.53) without sacrificing precision (0.65 for both) toward the goal of recognizing clasped hands. For tracking, it results in an improvement over a baseline method with an F-score of 0.59 vs. 0.48. From this, we are able to derive hand motion-based gesture attributes such as velocity, direction change and extremal pose. In ground-truth studies, we manually annotate and analyze the gestures of two instructors, each in a 75-minute computer science lecture, using a 14-bit pose vector. We observe "pedagogical" gestures of punctuation and encouragement in addition to traditional classes of gestures such as deictic and metaphoric.
We also introduce a tool to facilitate the manual annotations of gestures in video and present results on their frequencies and co-occurrences. In particular, we find that 5 poses represent 80% of the variation in the annotated ground truth. We demonstrate a correlation between the angular variance of arm movements and the presence of those conjunctions that are used to contrast connected clauses ("but", "neither", etc.) in the accompanying speech. We do this by training an AdaBoost-based binary classifier using decision trees as weak learners. On a ground-truth database of 4243 video clips totaling 3.83 hours, each with subtitles, training on sets of conjunctions indicating contrast produces classifiers capable of achieving 55% accuracy on a balanced test set. We study two different presentation methods: an attribute graph which shows a normalized measure of the visual attributes across an entire video, as well as emphasized subtitles, where individual words are emphasized (resized) based on their accompanying gestures. Results from 12 subjects show supportive ratings given for the browsing aids in the task of providing keywords for video under time constraints. Subjects' keywords are also compared to independent ground-truth, resulting in precisions from 0.50-0.55, even when given less than half real time to view the video. We demonstrate a correlation between gesture attributes and a rigorous method of measuring audience engagement: electroencephalography (EEG). Our 20 subjects watch 61 minutes of video of the 2012 U.S. Presidential Debates while under observation through EEG. After discarding corrupted recordings, we retain 47 minutes worth of EEG data for each subject. The subjects are examined in aggregate and in subgroups according to gender and political affiliation. We find statistically significant correlations between gesture attributes (particularly extremal pose) and our feature of engagement derived from EEG. 
For all subjects watching all videos, we see a statistically significant correlation between gesture and engagement, with a Spearman rank correlation of rho = 0.098 with p < 0.05, Bonferroni corrected. For some stratifications, correlations reach as high as rho = 0.297. From these results, we conclude that gestures can be used to measure engagement.

Columbia University Academic Commons

Multimodal engagement strategies in science dissemination: A case study of TED talks and YouTube science videos

Author: Bernad-Mechó Edgar
Valeiras-Jurado Julia
Publication venue: 'SAGE Publications'
Publication date: 18/03/2023

The growing interest in science dissemination offers new opportunities to communicate science openly to various audiences, but also brings the challenge of adapting to an audience that does not share the same academic background. This adaptation has been referred to as recontextualization. In the case of the formats that concern this study, that is, TEDx Talks and YouTube science dissemination videos, their multimodal nature suggests that recontextualization, and therefore engagement as a crucial aspect of this process, is likely to go well beyond purely linguistic aspects. The aim of this study is to unveil how engagement strategies in two science dissemination formats (a face-to-face talk and an online video) are realized through complex multimodal ensembles, and to highlight differences between them. To fulfill this aim, two talks by the same presenter and dealing with similar content were selected for analysis: a TEDx talk and a YouTube science dissemination video from the channel PBS Space Time. The recordings were annotated using the software Multimodal Video Analysis. The annotation included engagement strategies; embodied modes, that is, modes carried out using the body; and, in the case of the YouTube video, filmic modes, that is, modes triggered by the editing process of the recorded video. Our results show that the role of both embodied and filmic modes is paramount in the realization of engagement strategies. Our findings also bring to the fore significant differences in the ways in which the two distinct audiences are engaged, concerning the frequency and use of both semiotic modes and engagement strategies.

Repositori Institucional de la Universitat Jaume I

A Review on Recent Advances in Video-based Learning Research: Video Features, Interaction, Tools, and Technologies

Author: Ewerth Ralph
Hoppe Anett
Navarrete Evelyn
Publication venue: Aachen, Germany : RWTH Aachen
Publication date: 01/01/2021

Human learning is shifting more strongly than ever towards online settings, and especially towards video platforms. There is an abundance of tutorials and lectures covering diverse topics, from fixing a bike to particle physics. While it is advantageous that learning resources are freely available on the Web, the quality of the resources varies a lot. Given the number of available videos, users need algorithmic support in finding helpful and entertaining learning resources. In this paper, we present a review of the recent research literature (2020-2021) on video-based learning. We focus on publications that examine the characteristics of video content, analyze frequently used features and technologies, and, finally, derive conclusions on trends and possible future research directions.

Repositorium für Naturwissenschaften und Technik

A Multimodal Approach to Metadiscourse as an Organizational Tool in Lectures

Author: Bernad-Mechó Edgar
Publication venue: 'Universitat Jaume I'
Publication date: 01/01/2018

This thesis explores the uses of organizational metadiscourse in lectures from a multimodal perspective, thus providing a holistic view of its use. Moreover, it examines how the use of organizational metadiscourse, at both the linguistic and the non-verbal level, is influenced by the lecturing style chosen by the lecturers (conversational, rhetorical or reading styles). Programa de Doctorat en Llengües Aplicades, Literatura i Traducci

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Tesis Doctorals en Xarxa

Repositori Institucional de la Universitat Jaume I

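Conceitos e métodos para apoio ao desenvolvimento e avaliação de colaboração remota utilizando realidade aumentada [Concepts and methods to support the development and evaluation of remote collaboration using augmented reality]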

Author: Marques Bernardo José Santos
Publication date: 17/12/2021

Remote Collaboration using Augmented Reality (AR) shows great potential to establish a common ground in physically distributed scenarios where team members need to achieve a shared goal. However, most research efforts in this field have been devoted to experimenting with the enabling technology and proposing methods to support its development. As the field evolves, evaluation and characterization of the collaborative process become an essential but difficult endeavor for better understanding the contributions of AR. In this thesis, we conducted a critical analysis to identify the main limitations and opportunities of the field, while situating its maturity and proposing a roadmap of important research actions. Next, a human-centered design methodology was adopted, involving industrial partners to probe how AR could support their needs during remote maintenance. These outcomes were combined with methods from the literature into an AR prototype, which was evaluated through a user study. From this, the need became clear for a deeper reflection to better understand the dimensions that influence, and should be considered in, Collaborative AR. Hence, a conceptual model and a human-centered taxonomy were proposed to foster systematization of perspectives. Based on the proposed model, an evaluation framework for contextualized data gathering and analysis was developed, supporting the design and performance of distributed evaluations in a more informed and complete manner. To instantiate this vision, the CAPTURE toolkit was created, providing an additional perspective based on selected dimensions of collaboration and pre-defined measurements to obtain “in situ” data about them, which can be analyzed using an integrated visualization dashboard. The toolkit successfully supported evaluations of several team members during remote maintenance tasks mediated by AR, showing its versatility and potential in eliciting a comprehensive characterization of the added value of AR in real-life situations and establishing itself as a general-purpose solution, potentially applicable to a wider range of collaborative scenarios. Programa Doutoral em Engenharia Informátic

Repositório Institucional da Universidade de Aveiro

Metadiscourse analysis of digital interpersonal interactions in academic settings in Turkey

Author: Hatipoğlu Çiler
Publication date: 20/08/2019

Rapid technological advances, efficiency and easy access have firmly established emailing as a vital medium of communication in the last decades. Nowadays, all around the world, particularly in educational settings, the medium is one of the most widely used modes of interaction between students and university lecturers. Despite their important role in academic life, very little is known about the metadiscursive characteristics of these e-messages and, as far as the author is aware, no study has examined metadiscourse in request emails in Turkish. This study aims to contribute to filling this gap by focusing on the following two research questions: (i) How many and what type of interpersonal metadiscourse markers are used in request emails sent by students to their lecturers? (ii) Where are they placed and how are they combined with other elements in the text? In order to answer these questions, a corpus of unsolicited request emails in Turkish was compiled. The data collection started in January 2010 and continued until March 2018. A total of 353 request emails sent from university students to their lecturers were collected. The data were first transcribed in CLAN CHILDES format and analysed using the interpersonal model. The metadiscourse categories that aimed to involve readers in the email were identified and classified. Next, their places in the text were determined and described in detail. Findings of the study show that request emails include a wide array of multifunctional interpersonal metadiscourse markers which are intricately combined and employed by the writers to reach their aims. The results also showed that there is a close relation between the “weight of the request” and the number of interpersonal metadiscourse markers in request emails.

OpenMETU (Middle East Technical University)

Action Recognition in Videos: from Motion Capture Labs to the Web

Author: Ana Paula Br
Arnaldo Albuquerque De Araújo
De Almeida
Eduardo Alves
Jussara Marques
Publication date: 17/06/2010

This paper presents a survey of human action recognition approaches based on visual data recorded from a single video camera. We propose an organizing framework which highlights the evolution of the area, with techniques moving from heavily constrained motion capture scenarios towards more challenging, realistic, "in the wild" videos. The proposed organization is based on the representation used as input for the recognition task, emphasizing the hypothesis assumed and thus the constraints imposed on the type of video that each technique is able to address. Making the hypothesis and constraints explicit makes the framework particularly useful for selecting a method for a given application. Another advantage of the proposed organization is that it allows the newest approaches to be categorized seamlessly alongside traditional ones, while providing an insightful perspective on the evolution of the action recognition task up to now. That perspective is the basis for the discussion at the end of the paper, where we also present the main open issues in the area. Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4 tables

arXiv.org e-Print Archive

CiteSeerX

Communication and Automatic Interpretation of Affect from Facial Expressions

Author: Gevers T.
Salah A.A.
Sebe N.
Publication venue: Information Science Reference
Publication date: 01/01/2010

International Migration, Integration and Social Cohesion online publications

Switching Partners: Dancing with the Ontological Engineers

Author: Ceusters Werner
Smith Barry
Publication date: 01/01/2011

Ontologies are today being applied in almost every field to support the alignment and retrieval of data of distributed provenance. Here we focus on new ontological work on dance and on related cultural phenomena belonging to what UNESCO calls the “intangible heritage.” Currently data and information about dance, including video data, are stored in an uncontrolled variety of ad hoc ways. This serves not only to prevent retrieval, comparison and analysis of the data, but may also impinge on our ability to preserve the data that already exist. Here we explore recent technological developments that are designed to counteract such problems by allowing information to be retrieved across disciplinary, cultural, linguistic and technological boundaries. Software applications such as the ones envisaged here will enable speedier recovery of data and facilitate its analysis in ways that will assist both archiving of and research on dance.

PhilPapers