1,981 research outputs found

    SpeechMirror: A Multimodal Visual Analytics System for Personalized Reflection of Online Public Speaking Effectiveness

    As communications are increasingly taking place virtually, the ability to present well online is becoming an indispensable skill. Online speakers face unique challenges in engaging remote audiences. However, there has been a lack of evidence-based analytical systems for people to comprehensively evaluate online speeches and discover possibilities for improvement. This paper introduces SpeechMirror, a visual analytics system facilitating reflection on a speech based on insights from a collection of online speeches. The system estimates the impact of different speech techniques on effectiveness and applies these estimates to a speech to make users aware of how their speech techniques perform. A similarity recommendation approach based on speech factors or script content supports guided exploration to expand knowledge of presentation evidence and accelerate the discovery of speech delivery possibilities. SpeechMirror provides intuitive visualizations and interactions for users to understand speech factors. Among them, SpeechTwin, a novel multimodal visual summary of speech, supports rapid understanding of critical speech factors and comparison of different speech samples, and SpeechPlayer augments the speech video by integrating visualization of the speaker's body language with interaction, for focused analysis. The system utilizes visualizations suited to the distinct nature of different speech factors for user comprehension. The proposed system and visualization techniques were evaluated with domain experts and amateurs, demonstrating usability for users with low visualization literacy and efficacy in helping users develop insights for potential improvement.
    Comment: Main paper (11 pages, 6 figures) and supplemental document (11 pages, 11 figures). Accepted by VIS 2023.
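    The abstract does not detail the similarity recommendation, but retrieving reference speeches by similarity of speech factors can be sketched as a cosine-similarity ranking over factor vectors. The function and the example factor names below are illustrative assumptions, not SpeechMirror's actual implementation.

        import numpy as np

        def recommend_similar(query, corpus, k=5):
            """Indices of the k reference speeches most similar to the query.

            query  : 1-D array of normalized speech factors for the user's
                     speech (e.g., pace, pitch variation, gesture frequency;
                     the factor names are assumptions).
            corpus : 2-D array, one row of the same factors per speech in
                     the collection.
            """
            q = query / (np.linalg.norm(query) + 1e-12)
            c = corpus / (np.linalg.norm(corpus, axis=1, keepdims=True) + 1e-12)
            sims = c @ q                      # cosine similarity per speech
            return np.argsort(-sims)[:k]      # most similar first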

    A Closer Look into Recent Video-based Learning Research: A Comprehensive Review of Video Characteristics, Tools, Technologies, and Learning Effectiveness

    People increasingly use videos on the Web as a source for learning. To support this way of learning, researchers and developers are continuously developing tools, proposing guidelines, analyzing data, and conducting experiments. However, it is still not clear what characteristics a video should have to be an effective learning medium. In this paper, we present a comprehensive review of 257 articles on video-based learning for the period from 2016 to 2021. One of the aims of the review is to identify the video characteristics that have been explored by previous work. Based on our analysis, we suggest a taxonomy which organizes the video characteristics and contextual aspects into eight categories: (1) audio features, (2) visual features, (3) textual features, (4) instructor behavior, (5) learner activities, (6) interactive features (quizzes, etc.), (7) production style, and (8) instructional design. Also, we identify four representative research directions: (1) proposals of tools to support video-based learning, (2) studies with controlled experiments, (3) data analysis studies, and (4) proposals of design guidelines for learning videos. We find that the most explored characteristics are textual features, followed by visual features, learner activities, and interactive features. Text of transcripts, video frames, and images (figures and illustrations) are most frequently used by tools that support learning through videos. Learner activity is heavily explored through log files in data analysis studies, and interactive features have been frequently scrutinized in controlled experiments. We complement our review by contrasting research findings on the impact of video characteristics on learning effectiveness, reporting on tasks and technologies used to develop tools that support learning, and summarizing trends in design guidelines for producing learning videos.
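    The eight-category taxonomy can be restated as a simple lookup structure. The category names come from the review itself; the example features under each are assumptions for illustration, except those the abstract names directly (transcripts, video frames, images, quizzes, log files).

        # Category names are from the review's taxonomy; example features are
        # illustrative except where the abstract names them.
        VIDEO_TAXONOMY = {
            "audio features":       ["narration speech rate", "background music"],
            "visual features":      ["video frames", "images (figures, illustrations)"],
            "textual features":     ["transcripts", "captions"],
            "instructor behavior":  ["gaze", "gestures"],
            "learner activities":   ["pause/rewind events from log files"],
            "interactive features": ["quizzes", "in-video questions"],
            "production style":     ["talking head", "screencast"],
            "instructional design": ["segmenting", "signaling"],
        }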

    Advanced Speaking


    Utilization of multimodal interaction signals for automatic summarisation of academic presentations

    Multimedia archives are expanding rapidly. For these, there exists a shortage of retrieval and summarisation techniques for accessing and browsing content where the main information exists in the audio stream. This thesis describes an investigation into the development of novel feature extraction and summarisation techniques for audio-visual recordings of academic presentations. We report on the development of a multimodal dataset of academic presentations. This dataset is labelled by human annotators for presentation ratings, audience engagement levels, speaker emphasis, and audience comprehension. We investigate the automatic classification of speaker ratings and audience engagement by extracting audio-visual features from video of the presenter and audience and training classifiers to predict speaker ratings and engagement levels. Following this, we investigate automatic identification of areas of emphasised speech. By analysing all human-annotated areas of emphasised speech, minimum speech pitch and gesticulation are identified as indicating emphasised speech when occurring together. Investigations are conducted into the speaker's potential to be comprehended by the audience. Following crowdsourced annotation of comprehension levels during academic presentations, a set of audio-visual features considered most likely to affect comprehension levels are extracted. Classifiers trained on these features predicted comprehension levels over a 7-class scale with an accuracy of 49%, and over a binary distribution with an accuracy of 85%. Presentation summaries are built by segmenting speech transcripts into phrases and using keywords extracted from the transcripts in conjunction with extracted paralinguistic features. The highest-ranking segments are then extracted to build presentation summaries. Summaries are evaluated by performing eye-tracking experiments as participants watch presentation videos. Participants were found to be consistently more engaged with presentation summaries than with full presentations. Summaries were also found to contain a higher concentration of new information than full presentations.
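    The summarisation step (segment the transcript into phrases, score each segment with keywords and paralinguistic cues, keep the top-ranked ones) can be sketched as below. The field names and the weighting scheme are assumptions; the thesis abstract does not state how the two scores are combined.

        from dataclasses import dataclass

        @dataclass
        class Segment:
            text: str              # one transcript phrase
            keyword_score: float   # relevance of extracted keywords in the phrase
            emphasis_score: float  # paralinguistic cue, e.g. pitch minima
                                   # co-occurring with gesticulation

        def build_summary(segments, n=10, alpha=0.5):
            """Keep the n highest-ranking segments; alpha (an assumed value)
            weights keyword relevance against paralinguistic emphasis."""
            ranked = sorted(
                segments,
                key=lambda s: alpha * s.keyword_score + (1 - alpha) * s.emphasis_score,
                reverse=True,
            )
            return ranked[:n]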

    Assistive technologies for severe and profound hearing loss: beyond hearing aids and implants

    Assistive technologies offer capabilities that were previously inaccessible to individuals with severe and profound hearing loss who have no or limited access to hearing aids and implants. This literature review aims to explore existing assistive technologies and identify what still needs to be done. It is found that there is a lack of focus on the overall objectives of assistive technologies. Several other issues are also identified: only a very small number of assistive technologies developed within a research context have led to commercial devices, there is a predisposition to use the latest expensive technologies, and there is a tendency to avoid designing products universally. Finally, further development of plug-ins that translate the text content of a website into various sign languages is needed to make information on the internet more accessible.

    Tutor In-sight: Guiding and Visualizing Students' Attention with Mixed Reality Avatar Presentation Tools

    Remote conferencing systems are increasingly used to supplement or even replace in-person teaching. However, prevailing conferencing systems restrict the teacher's representation to a webcam live-stream, hamper the teacher's use of body language, and result in students' decreased sense of co-presence and participation. While Virtual Reality (VR) systems may increase student engagement, the teacher may not have the time or expertise to conduct the lecture in VR. To address this issue and bridge the requirements between students and teachers, we have developed Tutor In-sight, a Mixed Reality (MR) avatar augmented into the student's workspace based on four design requirements derived from the existing literature, namely: integrated virtual and physical space, improved teacher co-presence through the avatar, directed attention via auto-generated body language, and a usable workflow for teachers. Two user studies were conducted from the perspectives of students and teachers to determine the advantages of Tutor In-sight in comparison to two existing conferencing systems, Zoom (video-based) and Mozilla Hubs (VR-based). The participants of both studies favoured Tutor In-sight. Among others, this main finding indicates that Tutor In-sight satisfied the needs of both teachers and students. In addition, the participants' feedback was used to empirically determine the four main teacher requirements and the four main student requirements in order to improve the future design of MR educational tools.
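    One building block of directing attention with auto-generated body language is orienting the avatar toward a target object in the student's physical workspace. A minimal sketch of that computation follows; the coordinate convention and the function itself are assumptions for illustration, not Tutor In-sight's actual implementation.

        import numpy as np

        def look_at_yaw_pitch(avatar_pos, target_pos):
            """Yaw and pitch (radians) that turn the avatar's head toward a
            target point, assuming +x right, +y up, -z forward (a common
            graphics convention; the paper does not specify one)."""
            d = target_pos - avatar_pos
            d = d / np.linalg.norm(d)
            yaw = np.arctan2(d[0], -d[2])   # rotation about the up axis
            pitch = np.arcsin(d[1])         # elevation above the horizontal
            return float(yaw), float(pitch)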

    Psychophysiological analysis of a pedagogical agent and robotic peer for individuals with autism spectrum disorders.

    Autism spectrum disorder (ASD) is a neurodevelopmental disorder characterized by ongoing problems in social interaction and communication, and engagement in repetitive behaviors. According to the Centers for Disease Control and Prevention, an estimated 1 in 68 children in the United States has ASD. Mounting evidence shows that many of these individuals display an interest in social interaction with computers and robots and, in general, feel comfortable spending time in such environments. It is known that the subtlety and unpredictability of people's social behavior are intimidating and confusing for many individuals with ASD. Computerized learning environments and robots, however, provide a predictable, dependable, and less complicated environment, where the interaction complexity can be adjusted to account for these individuals' needs. The first phase of this dissertation presents an artificial-intelligence-based tutoring system which uses an interactive computer character as a pedagogical agent (PA) that simulates a human tutor teaching sight word reading to individuals with ASD. This phase examines the efficacy of an instructional package comprised of an autonomous pedagogical agent, automatic speech recognition, and an evidence-based instructional procedure referred to as constant time delay (CTD). A concurrent multiple-baseline across-participants design is used to evaluate the efficacy of the intervention. Additionally, post-treatment probes are conducted to assess maintenance and generalization. The results suggest that all three participants acquired and maintained new sight words and demonstrated generalized responding. The second phase of this dissertation describes the augmentation of the tutoring system developed in the first phase with an autonomous humanoid robot which serves the instructional role of a peer for the student. In this tutoring paradigm, the robot adopts a peer metaphor, acting as a fellow learner rather than a second tutor. With the introduction of the robotic peer (RP), the traditional dyadic interaction in tutoring systems is augmented to a novel triadic interaction in order to enhance the social richness of the tutoring system and to facilitate learning through peer observation. This phase evaluates the feasibility and effects of using PA-delivered sight word instruction, based on a CTD procedure, within a small-group arrangement including a student with ASD and the robotic peer. A multiple-probe design across word sets, replicated across three participants, is used to evaluate the efficacy of the intervention. The findings illustrate that all three participants acquired, maintained, and generalized all the words targeted for instruction. Furthermore, they learned a high percentage (94.44% on average) of the non-target words exclusively instructed to the RP. The data show that not only did the participants learn non-targeted words by observing the instruction to the RP, but they also acquired their target words more efficiently and with fewer errors through the addition of an observational component to the direct instruction. The third and fourth phases of this dissertation focus on physiology-based modeling of the participants' affective experiences during naturalistic interaction with the developed tutoring system. While computers and robots have begun to co-exist with humans and cooperatively share various tasks, they are still deficient in interpreting and responding to humans as emotional beings. Wearable biosensors that can be used for computerized emotion recognition offer great potential for addressing this issue. The third phase presents a Bluetooth-enabled eyewear, EmotiGO, for unobtrusive acquisition of a set of physiological signals, i.e., skin conductivity, photoplethysmography, and skin temperature, which can be used as autonomic readouts of emotions. EmotiGO is unobtrusive and sufficiently lightweight to be worn comfortably without interfering with the users' usual activities. This phase presents the architecture of the device and results from testing that verify its effectiveness against an FDA-approved system for physiological measurement. The fourth and final phase attempts to model the students' engagement levels using their physiological signals collected with EmotiGO during naturalistic interaction with the tutoring system developed in the second phase. Several physiological indices are extracted from each of the signals. The students' engagement levels during the interaction with the tutoring system are rated by two trained coders using the video recordings of the instructional sessions. Supervised pattern recognition algorithms are subsequently used to map the physiological indices to the engagement scores. The results indicate that the trained models are successful at classifying participants' engagement levels with a mean classification accuracy of 86.50%. These models are an important step toward an intelligent tutoring system that can dynamically adapt its pedagogical strategies to the affective needs of learners with ASD.
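    The fourth phase's pipeline (physiological indices in, coder-rated engagement labels out, a supervised classifier in between) can be sketched as below. The feature names and the random-forest choice are assumptions for illustration; the dissertation states only that supervised pattern recognition algorithms were used.

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import cross_val_score

        # X: one row per session window, columns are indices extracted from
        # EmotiGO's signals (e.g., mean skin conductance, heart-rate
        # variability from the PPG, skin-temperature slope; names assumed).
        # y: engagement labels from the two trained coders.
        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 8))         # placeholder feature matrix
        y = rng.integers(0, 2, size=200)      # placeholder engagement labels

        clf = RandomForestClassifier(n_estimators=100, random_state=0)
        scores = cross_val_score(clf, X, y, cv=5)
        print(f"mean cross-validated accuracy: {scores.mean():.2%}")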

    Human-Computer Interaction

    In this book the reader will find a collection of 31 papers presenting different facets of Human-Computer Interaction: the results of research projects and experiments, as well as new approaches to designing user interfaces. The book is organized according to the following main topics, in sequential order: new interaction paradigms, multimodality, usability studies on several interaction mechanisms, human factors, universal design, and development methodologies and tools.