It's Good to Talk: A Comparison of Using Voice Versus Screen-Based Interactions for Agent-Assisted Tasks

Authors: Capra L; Duong TD; Reicherts L; Rogers Y; Sebire N; Wood E
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/06/2022

Voice assistants have become hugely popular in the home as domestic and entertainment devices. Recently, there has been a move towards developing them for work settings. For example, Alexa for Business and IBM Watson for Business were designed to improve productivity by assisting with various tasks, such as scheduling meetings and taking minutes. However, this kind of assistance is largely limited to planning and managing users' work. How might they be developed to do more by way of empowering people at work? Our research is concerned with achieving this by developing an agent with the role of a facilitator that assists users during an ongoing task. Specifically, we were interested in whether the modality in which the agent interacts with users makes a difference: How does a voice versus screen-based agent interaction affect user behavior? We hypothesized that voice would be more immediate and emotive, resulting in more fluid conversations and interactions. Here, we describe a user study that compared the benefits of using voice versus screen-based interactions when interacting with a system incorporating an agent, involving pairs of participants doing an exploratory data analysis task that required them to make sense of a series of data visualizations. The findings from the study show marked differences between the two conditions, with voice resulting in more turn-taking in discussions, more questions asked, more interactions with the system, and a tendency towards more immediate, faster-paced discussions following agent prompts. We discuss the possible reasons why talking and being prompted by a voice assistant may be preferable and more effective at mediating human-human conversations, and we translate some of the key insights of this research into design implications.

UCL Discovery

Visualizations for an Explainable Planning Agent

Authors: Bellamy Rachel K. E.; Chakraborti Tathagata; Dholakia Mishal; Fadnis Kshitij P.; Kephart Jeffrey O.; Srivastava Biplav; Talamadupula Kartik
Publication date: 08/02/2018

In this paper, we report on the visualization capabilities of an Explainable AI Planning (XAIP) agent that can support human-in-the-loop decision making. Imposing transparency and explainability requirements on such agents is especially important in order to establish trust and common ground with the end-to-end automated planning system. Visualizing the agent's internal decision-making processes is a crucial step towards achieving this. This may include externalizing the "brain" of the agent -- starting from its sensory inputs, to progressively higher order decisions made by it in order to drive its planning components. We also show how the planner can bootstrap on the latest techniques in explainable planning to cast plan visualization as a plan explanation problem, and thus provide concise model-based visualization of its plans. We demonstrate these functionalities in the context of the automated planning components of a smart assistant in an instrumented meeting space.

Comment: Previously "Mr. Jones -- Towards a Proactive Smart Room Orchestrator" (appeared in AAAI 2017 Fall Symposium on Human-Agent Groups)

arXiv.org e-Print Archive

Crossref

Customization of IBM Intu’s Voice by Connecting Text-to-Speech Services and a Voice Conversion Network

Authors: Choi Euishin; Kim Hyunjae; Kim Minseok; Lee Jaekoo; Song Jongyoon; Yoon Sungroh
Publication venue: AIS Electronic Library (AISeL)
Publication date: 01/01/2018

IBM has recently launched Project Intu, which extends the existing web-based cognitive service Watson with the Internet of Things to provide an intelligent personal assistant service. We propose a voice customization service that allows a user to directly customize the voice of Intu. The method for voice customization is based on IBM Watson’s text-to-speech service and a voice conversion model. A user can train the voice conversion model by providing a minimum of approximately 100 speech samples in the preferred voice (target voice). The output voice of Intu (source voice) is then converted into the target voice. Furthermore, the user does not need to offer parallel data for the target voice since the transcriptions of the source speech and target speech are the same. We also suggest methods to maximize the efficiency of voice conversion and determine the proper amount of target speech based on several experiments. When we measured the elapsed time for each process, we observed that feature extraction accounts for 59.7% of voice conversion time, which implies that fixing inefficiencies in feature extraction should be prioritized. We used the mel-cepstral distortion between the target speech and reconstructed speech as an index for conversion accuracy and found that, when the number of target speech samples for training is fewer than 100, the general performance of the model degrades.

Crossref

ScholarSpace at University of Hawai'i at Manoa

AIS Electronic Library (AISeL)

Ensemble decision making in real-time games

Authors: Anderson Damien; Levine John; Rodgers Philip
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 15/10/2018

Crossref

ResearchOnline@GCU

CORK: A COnversational agent framewoRK exploiting both rational and emotional intelligence

Authors: Davide Fisicaro; Fabio Catania; Franca Garzotto; Micol Spitale
Publication date: 01/01/2019

Archivio istituzionale della ricerca - Politecnico di Milano