
    Multimodal information presentation for high-load human computer interaction

    This dissertation addresses the question: given an application and an interaction context, how can interfaces present information to users in a way that improves the quality of interaction (e.g. better user performance, lower cognitive demand and greater user satisfaction)? Information presentation is critical to the quality of interaction because it guides, constrains and even determines cognitive behavior. A good presentation is particularly desirable in high-load human computer interaction, such as when users are under time pressure, under stress, or multi-tasking. Under a high mental workload, users may not have the spare cognitive capacity to cope with the unnecessary workload induced by a bad presentation. In this dissertation work, the presentation factor of primary interest is modality. We conducted theoretical studies in the cognitive psychology domain in order to understand the role of presentation modality in different stages of human information processing. Guided by this theory, we conducted a series of user studies investigating the effect of information presentation (modality and other factors) in several high-load task settings. The two task domains are crisis management and driving. Using crisis scenarios, we investigated how to present information to facilitate time-limited visual search and time-limited decision making. In the driving domain, we investigated how to present highly urgent danger warnings and how to present informative cues that help drivers manage their attention across multiple tasks. The outcomes of this dissertation work have useful implications for the design of cognitively compatible user interfaces, and are not limited to high-load applications.

    MOG 2010:3rd Workshop on Multimodal Output Generation: Proceedings


    Multimodal Speech Emotion Recognition Using Audio and Text

    Speech emotion recognition is a challenging task, and extensive reliance has been placed on models that use audio features in building well-performing classifiers. In this paper, we propose a novel deep dual recurrent encoder model that utilizes text data and audio signals simultaneously to obtain a better understanding of speech data. As emotional dialogue is composed of sound and spoken content, our model encodes the information from audio and text sequences using dual recurrent neural networks (RNNs) and then combines the information from these sources to predict the emotion class. This architecture analyzes speech data from the signal level to the language level, and it thus utilizes the information within the data more comprehensively than models that focus on audio features. Extensive experiments are conducted to investigate the efficacy and properties of the proposed model. Our proposed model outperforms previous state-of-the-art methods in assigning data to one of four emotion categories (i.e., angry, happy, sad and neutral) when the model is applied to the IEMOCAP dataset, as reflected by accuracies ranging from 68.8% to 71.8%. Comment: 7 pages, accepted as a conference paper at IEEE SLT 201
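    The dual recurrent encoder described above can be illustrated with a minimal sketch: one RNN encodes the audio feature sequence, a second RNN encodes the text (token-embedding) sequence, and the two final hidden states are concatenated and passed to a softmax over the four emotion classes. All dimensions, weight initializations and names below are illustrative assumptions, not the authors' implementation.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def rnn_encode(seq, W_x, W_h):
        """Run a plain tanh RNN over seq (T x d_in); return the final hidden state."""
        h = np.zeros(W_h.shape[0])
        for x in seq:
            h = np.tanh(W_x @ x + W_h @ h)
        return h

    # Assumed sizes: 40-dim audio frames, 64-dim token embeddings, 32-dim hidden states.
    d_audio, d_text, d_hid, n_classes = 40, 64, 32, 4
    W_xa, W_ha = rng.normal(size=(d_hid, d_audio)) * 0.1, rng.normal(size=(d_hid, d_hid)) * 0.1
    W_xt, W_ht = rng.normal(size=(d_hid, d_text)) * 0.1, rng.normal(size=(d_hid, d_hid)) * 0.1
    W_out = rng.normal(size=(n_classes, 2 * d_hid)) * 0.1

    audio_seq = rng.normal(size=(100, d_audio))  # e.g. 100 frames of acoustic features
    text_seq = rng.normal(size=(12, d_text))     # e.g. 12 token embeddings of the transcript

    h_audio = rnn_encode(audio_seq, W_xa, W_ha)  # signal-level encoder
    h_text = rnn_encode(text_seq, W_xt, W_ht)    # language-level encoder
    fused = np.concatenate([h_audio, h_text])    # combine both information sources

    logits = W_out @ fused
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()  # softmax over {angry, happy, sad, neutral}
    ```

    The key design point is late fusion: each modality is summarized independently before the classifier sees either, so the text encoder can compensate when the acoustic signal is ambiguous and vice versa.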

    Natural interaction with a virtual guide in a virtual environment: A multimodal dialogue system

    This paper describes the Virtual Guide, a multimodal dialogue system represented by an embodied conversational agent that can help users to find their way in a virtual environment, while adapting its affective linguistic style to that of the user. We discuss the modular architecture of the system, and describe the entire loop from multimodal input analysis to multimodal output generation. We also describe how the Virtual Guide detects the level of politeness of the user’s utterances in real time during the dialogue and aligns its own language to that of the user, using different politeness strategies. Finally, we report on our first user tests, and discuss some potential extensions to improve the system.
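    The politeness-alignment loop described above can be sketched as follows: estimate a politeness level from surface markers in the user's utterance, then select a reply formulation with matching politeness. The marker lists, levels and phrasings here are invented for illustration; the actual Virtual Guide uses a more elaborate linguistic analysis.

    ```python
    # Hypothetical marker sets; the real system's politeness model is richer.
    POLITE_MARKERS = {"please", "could", "would", "thank", "sorry"}
    BLUNT_MARKERS = {"now", "hurry", "just"}

    def politeness_level(utterance: str) -> int:
        """Return -1 (blunt), 0 (neutral), or +1 (polite) from surface markers."""
        words = set(utterance.lower().replace("?", "").replace(".", "").split())
        score = len(words & POLITE_MARKERS) - len(words & BLUNT_MARKERS)
        return max(-1, min(1, score))

    # Same directions, realized with different politeness strategies.
    REPLIES = {
        -1: "Go left, then straight ahead.",
         0: "Take the corridor on your left and walk straight ahead.",
         1: "Certainly! If you take the corridor on your left and walk straight ahead, you will find it.",
    }

    def aligned_reply(utterance: str) -> str:
        """Align the guide's reply to the politeness level of the user's utterance."""
        return REPLIES[politeness_level(utterance)]
    ```

    For example, `aligned_reply("could you please help")` selects the polite formulation, while a blunt request such as "just tell me now" receives the terse one.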

    Healing conversations: Developing a practical framework for clinical communication between Aboriginal communities and healthcare practitioners

    In recognition of the ongoing health disparities experienced by Aboriginal and Torres Strait Islander peoples (hereafter Aboriginal), this scoping review explores the role and impact of the clinical communication process on Aboriginal healthcare provision. A medical education lens is applied, looking at the utility of a tailored clinical communication framework to help health practitioners work more effectively with Aboriginal peoples and communities. The initial framework, building on existing communication guides, proposes four domains: content, process, relational and environmental. It places emphasis on critical self-reflection of the health practitioner’s own cultural identity and will be guided by collective Aboriginal world-views in select Australian settings. Using a two-eyed seeing approach, the framework will be developed and tested in health professional education. The aim of this research journey is to enable health practitioners to have more effective healthcare conversations with Aboriginal peoples, working toward more socially just and equitable healthcare interactions and outcomes.

    What does semantic tiling of the cortex tell us about semantics?

    Recent use of voxel-wise modeling in cognitive neuroscience suggests that semantic maps tile the cortex. Although this impressive research establishes distributed cortical areas active during the conceptual processing that underlies semantics, it tells us little about the nature of this processing. While mapping concepts between Marr's computational and implementation levels to support neural encoding and decoding, this approach ignores Marr's algorithmic level, which is central for understanding the mechanisms that implement cognition in general and conceptual processing in particular. Following decades of research in cognitive science and neuroscience, what do we know so far about the representation and processing mechanisms that implement conceptual abilities? Most basically, much is known about the mechanisms associated with: (1) feature and frame representations, (2) grounded, abstract, and linguistic representations, (3) knowledge-based inference, (4) concept composition, and (5) conceptual flexibility. Rather than explaining these fundamental representation and processing mechanisms, semantic tiles simply provide a trace of their activity over a relatively short time period within a specific learning context. Establishing the mechanisms that implement conceptual processing in the brain will require more than mapping it to cortical (and sub-cortical) activity, with process models from cognitive science likely to play central roles in specifying the intervening mechanisms. More generally, neuroscience will not achieve its basic goals until it establishes algorithmic-level mechanisms that contribute essential explanations of how the brain works, going beyond simply establishing the brain areas that respond to various task conditions.

    Compositional Meaning Represented in Indonesian EFL Learners PowerPoint Slides: A Multimodal Viewpoint

    A proper layout is one of the essential elements of PowerPoint slides, since slides are an instrument for displaying graphs, data, legends and other visual content. Therefore, to achieve an ideal PowerPoint layout, an analysis of compositional meaning is necessary. Compositional meaning comprises information value, salience and framing; within this theoretical framework, choices of color, layout, size and positioning must all be relevant. This research uses a qualitative descriptive analysis technique based on the concepts of Kress and van Leeuwen (2006), focusing on the compositional meaning represented in Indonesian EFL learners' PowerPoint slides. The results reveal that the designers of the PowerPoint slides used bright colors that clearly distinguished the subheadings in the layout. Additionally, the text was positioned properly, with the priority text placed at the top of each slide and the less important text at the bottom. Unfortunately, the designers made the text tone almost the same as the background of the slides. The analysis of compositional meaning can thus serve as a set of guidelines for students making PowerPoint slides.