    The State of Speech in HCI: Trends, Themes and Challenges

    It's Good to Talk: A Comparison of Using Voice Versus Screen-Based Interactions for Agent-Assisted Tasks

    Voice assistants have become hugely popular in the home as domestic and entertainment devices. Recently, there has been a move towards developing them for work settings. For example, Alexa for Business and IBM Watson for Business were designed to improve productivity by assisting with various tasks, such as scheduling meetings and taking minutes. However, this kind of assistance is largely limited to planning and managing users' work. How might voice assistants be developed to do more by way of empowering people at work? Our research is concerned with achieving this by developing an agent that acts as a facilitator, assisting users during an ongoing task. Specifically, we were interested in whether the modality in which the agent interacts with users makes a difference: how does a voice versus a screen-based agent interaction affect user behavior? We hypothesized that voice would be more immediate and emotive, resulting in more fluid conversations and interactions. Here, we describe a user study that compared the benefits of voice versus screen-based interactions with an agent-based system, in which pairs of participants carried out an exploratory data analysis task that required them to make sense of a series of data visualizations. The findings show marked differences between the two conditions: voice resulted in more turn-taking in discussions, more questions asked, more interactions with the system, and a tendency towards more immediate, faster-paced discussions following agent prompts. We discuss possible reasons why talking to and being prompted by a voice assistant may be preferable and more effective at mediating human-human conversations, and we translate some of the key insights of this research into design implications.

    The Role of Spoken Feedback in Experiencing Multimodal Interfaces as Human-like

    Whether user interfaces should be made human-like or tool-like has long been debated in the HCI field, and this debate affects the development of multimodal interfaces. However, little empirical work has been done so far to support either view. Even if there is evidence that humans respond to media as they do to other humans, this does not mean that humans experience interfaces as human-like. We studied how people experience a multimodal timetable system with varying degrees of human-like spoken feedback in a Wizard-of-Oz study. The results showed that users' views and preferences leaned significantly towards anthropomorphism after actually experiencing the multimodal timetable system. The more human-like the spoken feedback was, the more participants preferred the system to be human-like. The results also showed that users' experience matched their preferences. This indicates that in order to appreciate a human-like interface, users first have to experience it.

    Answering questions about archived, annotated meetings

    Retrieving information from archived meetings is a new domain of information retrieval that has received increasing attention in the past few years. Search in spontaneous spoken conversations has been recognized as more difficult than text-based document retrieval because meeting discussions contain two levels of information: the content itself, i.e. what topics are discussed, but also the argumentation process, i.e. what conflicts are resolved and what decisions are made. To capture the richness of information in meetings, current research focuses on recording meetings in Smart-Rooms, transcribing meeting discussions into text, and annotating discussions with higher-level semantic structures to allow for efficient access to the data. However, it is not yet clear what type of user interface is best suited for searching and browsing such archived, annotated meetings. Content-based retrieval with keyword search is too naive and does not take the semantic annotations on the data into account. The objective of this thesis is to assess the feasibility and usefulness of a natural language interface to meeting archives that allows users to ask complex questions about meetings and retrieve episodes of meeting discussions based on semantic annotations. The particular issues that we address are: the need for argumentative annotation to answer questions about meetings; the linguistic and domain-specific natural language understanding techniques required to interpret such questions; and the use of visual overviews of meeting annotations to guide users in formulating questions. To meet these objectives, we have annotated meetings with argumentative structure and built a prototype natural language understanding engine that interprets questions based on those annotations. Further, we have performed two sets of user experiments to study what questions users ask when faced with a natural language interface to annotated meeting archives. For this, we used the Wizard-of-Oz simulation method to enable users to express questions in their own terms without being influenced by the limitations of speech recognition technology. Our experimental results show that it is technically feasible to annotate meetings and implement a deep-linguistic NLU engine for questions about meetings, but in practice users do not consistently take advantage of these features; instead, they often search for keywords in meetings. When visual overviews of the available annotations are provided, users refer to those annotations in their questions, but the questions remain simple. Users search with a breadth-first approach, asking questions in sequence rather than posing a single complex question. We conclude that natural language interfaces to meeting archives are useful, but that more experimental work is needed to find ways to encourage users to take advantage of the expressive power of natural language when asking questions about meetings.

    Designing Embodied Interactive Software Agents for E-Learning: Principles, Components, and Roles

    Embodied interactive software agents are complex autonomous, adaptive, and social software systems with a digital embodiment that enables them to act on and react to other entities (users, objects, and other agents) in their environment through bodily actions, which include the use of verbal and non-verbal communicative behaviors in face-to-face interactions with the user. These agents have been developed for various roles in different application domains, in which they perform tasks that have been assigned to them by their developers or delegated to them by their users or by other agents. In computer-assisted learning, embodied interactive pedagogical software agents have the general task of promoting human learning by working with students (and other agents) in computer-based learning environments, among them e-learning platforms based on Internet technologies, such as the Virtual Linguistics Campus (www.linguistics-online.com). In these environments, pedagogical agents provide contextualized, qualified, personalized, and timely assistance, cooperation, instruction, motivation, and services for both individual learners and groups of learners. This thesis develops a comprehensive, multidisciplinary, and user-oriented view of the design of embodied interactive pedagogical software agents, integrating theoretical and practical insights from various academic and other fields. The research intends to contribute to the scientific understanding of the issues, methods, theories, and technologies involved in the design, implementation, and evaluation of embodied interactive software agents for different roles in e-learning and other areas. For developers, the thesis provides sixteen basic principles (Added Value, Perceptible Qualities, Balanced Design, Coherence, Consistency, Completeness, Comprehensibility, Individuality, Variability, Communicative Ability, Modularity, Teamwork, Participatory Design, Role Awareness, Cultural Awareness, and Relationship Building) plus a large number of specific guidelines for the design of embodied interactive software agents and their components. Furthermore, it offers critical reviews of theories, concepts, approaches, and technologies from different areas and disciplines that are relevant to agent design. Finally, it discusses three pedagogical agent roles (virtual native speaker, coach, and peer) in the scenario of the linguistic fieldwork classes on the Virtual Linguistics Campus and presents detailed considerations for the design of an agent for one of these roles (the virtual native speaker).