    MUG: Interactive Multimodal Grounding on User Interfaces

    Full text link
    We present MUG, a novel interactive task for multimodal grounding where a user and an agent work collaboratively on an interface screen. Prior work modeled multimodal UI grounding in one round: the user gives a command and the agent responds to it. Yet in a realistic scenario, a user command can be ambiguous when the target action is inherently difficult to articulate in natural language. MUG allows multiple rounds of interaction, so that upon seeing the agent's responses, the user can give further commands for the agent to refine or even correct its actions. Such interaction is critical for improving grounding performance in real-world use cases. To investigate the problem, we create a new dataset consisting of 77,820 sequences of human user-agent interaction on mobile interfaces, of which 20% involve multiple rounds of interaction. To establish our benchmark, we experiment with a range of modeling variants and evaluation strategies, both offline and online: the online strategy comprises both human evaluation and automatic evaluation with simulators. Our experiments show that allowing iterative interaction significantly improves absolute task completion by 18% over the entire test dataset and 31% over the challenging subset. Our results lay the foundation for further investigation of the problem.
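
    The abstract describes the protocol at a high level only. As a rough sketch of the multi-round grounding loop it proposes, the following Python fragment shows how an agent action could be iteratively refined from follow-up commands; all names (interactive_grounding, ground, get_user_command) are hypothetical and not from the MUG paper or its dataset.

        # Hypothetical sketch of MUG-style iterative grounding; names are
        # illustrative, not from the paper. Each round, the agent grounds the
        # accumulated command history to a UI action, and the user (or a
        # simulator) may issue a correction or accept the result.
        from typing import Callable, List, Optional

        def interactive_grounding(
            screen: dict,                                   # UI elements on the screen
            get_user_command: Callable[[Optional[str]], Optional[str]],
            ground: Callable[[dict, List[str]], str],       # (screen, history) -> action
            max_rounds: int = 5,
        ) -> str:
            history: List[str] = []
            action = ""
            for _ in range(max_rounds):
                command = get_user_command(action or None)  # None means "accepted"
                if command is None:
                    break
                history.append(command)                     # keep all rounds as context
                action = ground(screen, history)            # refine or correct the action
            return action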

    CAMMD: Context Aware Mobile Medical Devices

    Get PDF
    Telemedicine applications on a medical practitioner's mobile device should be context-aware. This can vastly improve the effectiveness of mobile applications and is a step towards realising the vision of a ubiquitous telemedicine environment. The nomadic nature of a medical practitioner emphasises location, activity and time as key context-aware elements. An intelligent middleware is needed to effectively interpret and exploit these contextual elements. This paper proposes an agent-based architectural solution called Context-Aware Mobile Medical Devices (CAMMD). This framework can proactively communicate patient records to a portable device based upon the active context of its medical practitioner. An expert system is utilised to cross-reference the context-aware data of location and time against a practitioner's work schedule. This proactive distribution of medical data enhances the usability and portability of mobile medical devices. The proposed methodology alleviates constraints on memory storage and enhances user interaction with the handheld device. The framework also improves utilisation of network bandwidth resources. An experimental prototype is presented highlighting the potential of this approach.
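
    No implementation details are given in the abstract; the Python sketch below merely illustrates the kind of rule such an expert system could apply, cross-referencing the practitioner's location and the current time against a work schedule to decide which patient records to push ahead of need. The schedule format and all identifiers are invented for illustration.

        # Illustrative sketch (not the paper's code): match the practitioner's
        # active context (location, time) against a work schedule and return
        # the patient records worth prefetching to the mobile device.
        from datetime import datetime, time

        SCHEDULE = [
            # (ward, start, end, patient_ids) - hypothetical schedule entries
            ("Ward A", time(9, 0),  time(11, 0), ["p101", "p102"]),
            ("Ward B", time(11, 0), time(13, 0), ["p205"]),
        ]

        def records_to_prefetch(location: str, now: datetime) -> list[str]:
            """Return patient records relevant to the active context."""
            current = now.time()
            for ward, start, end, patients in SCHEDULE:
                if ward == location and start <= current < end:
                    return patients      # push these before they are requested
            return []                    # no scheduled duty matches this context

        print(records_to_prefetch("Ward A", datetime(2024, 1, 8, 9, 30)))  # ['p101', 'p102']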

    A fully-distributed, multiagent approach to negotiation in mobile ad-hoc networks

    Get PDF
    This paper presents an interaction protocol intended to be used in distributed negotiation problems using software agents, which could be applied to multi-agent systems deployed over Personal Digital Assistants (PDAs) connected via wireless networks. We are especially interested in semi-competitive scenarios, where each agent in the system acts on behalf of a user, trying to maximize its user's preferences while pursuing a common agreement. In these conditions, and especially if we are dealing with open and dynamic environments like mobile ad-hoc networks, the goals and attitudes of software agents cannot be guaranteed. Taking this into account, we propose a protocol where interaction among agents is done in a fully-distributed manner, so that no user can have negotiation privileges over the others.
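
    The abstract does not detail the protocol itself. As an illustration of the fully-distributed idea, the Python sketch below shows one simplified way a set of agents could reach agreement without negotiation privileges: each agent rates every proposal against its own user's preferences, ratings are exchanged, and every agent applies the same deterministic rule to the pooled ratings. This is a simplification invented here, not the paper's protocol.

        # Hypothetical one-round sketch of fully-distributed negotiation: every
        # agent rates every proposal by its own user's preferences; the agreement
        # is the proposal with the highest total utility. No agent has special
        # privileges - each can recompute the same outcome from the broadcasts.
        AGENTS = {
            # agent -> utility per candidate meeting slot (stand-in preferences)
            "alice": {"mon": 0.9, "tue": 0.4, "wed": 0.7},
            "bob":   {"mon": 0.2, "tue": 0.8, "wed": 0.6},
            "carol": {"mon": 0.5, "tue": 0.3, "wed": 0.9},
        }

        def agree(ratings: dict[str, dict[str, float]]) -> str:
            proposals = next(iter(ratings.values())).keys()
            # deterministic rule applied identically by every agent
            return max(proposals, key=lambda p: sum(r[p] for r in ratings.values()))

        print(agree(AGENTS))  # 'wed' - highest joint utility (2.2)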

    Elckerlyc goes mobile - Enabling natural interaction in mobile user interfaces

    Get PDF
    The fast growth of computational resources and speech technology available on mobile devices makes it possible for users of these devices to have a natural dialogue with service systems. These systems are sometimes perceived as social agents, and this can be supported by presenting them on the interface by means of an animated embodied conversational agent. To take full advantage of the power of embodied conversational agents in service systems, it is important to support real-time, online and responsive interaction with the system through the embodied conversational agent. The design of responsive animated conversational agents is a daunting task. Elckerlyc is a model-based platform for the specification and animation of synchronised multi-modal responsive animated agents. This paper presents a new light-weight PictureEngine that allows this platform to run in mobile applications. We describe the integration of the PictureEngine into the user interface of two different coaching applications and discuss the findings from user evaluations. We also conducted a study to evaluate an editing tool for the specification of the agent's communicative behaviour. Twenty-one participants had to specify the behaviour of an embodied conversational agent using the PictureEngine. We may conclude that this new lightweight back-end engine for the Elckerlyc platform makes it easier to build embodied conversational interfaces for mobile devices.

    Landmarks: Navigating Spacetime and Digital Mobility

    Get PDF
    In this essay we examine how we can conceptualize digital mobility as spatial navigation. Digital mobility occurs in media where the user navigates through space and simultaneously becomes creator, performer, and navigator of a spatial story. In this sense, the on-screen navigator simultaneously makes and reads space. We argue that in digital mobilities the user/player becomes simultaneously I-narrator, actor and agent of narrative. The user navigates through space and becomes, in fact, a digital pedestrian. Digital mobility differs from the (virtual) mobility of analogue moving-image media in that the interaction between user and space is much more fluid and the user becomes both actor and navigator; it is clearly central to the use of mobile screens, such as mobile phones, navigation devices, or portable game consoles, where one carries the screen and interacts with it while on the move. Moreover, we also believe that digital mobility can be a central quality of certain digital practices during which users are not literally on the move but still have to navigate through, and control, digital environments through spatial interaction. This can, for example, be the case when playing certain games or consulting Google Earth on a desktop computer.

    Elckerlyc goes mobile: enabling technology for ECAs in mobile applications

    Get PDF
    The fast growth of computational resources and speech technology available on mobile devices makes it possible for users of these devices to interact with service systems through natural dialogue. These systems are sometimes perceived as social agents and presented by means of an animated embodied conversational agent (ECA). To take full advantage of the power of ECAs in service systems, it is important to support real-time, online and responsive interaction with the system through the ECA. The design of responsive animated conversational agents is a daunting task. Elckerlyc is a model-based platform for the specification and animation of synchronised multimodal responsive animated agents. This paper presents a new light-weight PictureEngine that allows this platform to embed an ECA in the user interface of mobile applications. The ECA can be specified using the Behavior Markup Language (BML). An application and user evaluations of Elckerlyc and the PictureEngine in a mobile embedded digital coach are presented.
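
    Neither abstract shows what a behaviour specification looks like. Below is a minimal, illustrative BML fragment (wrapped in a Python string for consistency with the other sketches) of the general kind a BML realizer such as Elckerlyc consumes: an utterance plus a head nod synchronised to it. Which BML behaviours the mobile PictureEngine actually supports is not stated in the abstracts, and the realizer call is hypothetical.

        # Minimal illustrative BML block of the general kind an Elckerlyc-style
        # realizer consumes: speech plus a nod timed to the end of the utterance.
        # (Illustrative only; the abstracts do not specify the supported set.)
        BML_BLOCK = """
        <bml id="greet" xmlns="http://www.bml-initiative.org/bml/bml-1.0">
          <speech id="s1"><text>Good morning! Ready for today's exercises?</text></speech>
          <head id="h1" lexeme="NOD" start="s1:end"/>
        </bml>
        """

        # realizer.perform(BML_BLOCK)  # hypothetical call; the actual API differs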

    Enabling Conversational Interaction with Mobile UI using Large Language Models

    Full text link
    Conversational agents show promise in allowing users to interact with mobile devices using language. However, to perform diverse UI tasks with natural language, developers typically need to create separate datasets and models for each specific task, which is expensive and effort-consuming. Recently, pre-trained large language models (LLMs) have been shown capable of generalizing to various downstream tasks when prompted with a handful of examples from the target task. This paper investigates the feasibility of enabling versatile conversational interactions with mobile UIs using a single LLM. We propose a design space to categorize conversations between the user and the agent when collaboratively accomplishing mobile tasks. We design prompting techniques to adapt an LLM to conversational tasks on mobile UIs. Our experiments show that the approach enables various conversational interactions with decent performance, demonstrating its feasibility. We discuss the use cases of our work and its implications for language-based mobile interaction.
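
    The paper's actual prompt designs are not reproduced in the abstract; the Python sketch below merely illustrates the general few-shot pattern it builds on, using screen summarization as an example task. The screen serialization, the exemplar, and the complete callable are all assumptions, not the paper's formats.

        # Hypothetical sketch of few-shot prompting for a UI task. The screen is
        # flattened to text, a handful of exemplars are prepended, and any LLM
        # completion function can stand in for `complete`.
        from typing import Callable

        def serialize_screen(elements: list[dict]) -> str:
            # Flatten UI elements into an HTML-like text the LLM can read.
            return " ".join(f'<{e["type"]} text="{e["text"]}">' for e in elements)

        FEW_SHOT = (
            'Screen: <button text="Sign in"> <input text="Email">\n'
            "Summary: A login screen asking for an email address.\n\n"
        )

        def summarize_screen(elements: list[dict], complete: Callable[[str], str]) -> str:
            prompt = FEW_SHOT + f"Screen: {serialize_screen(elements)}\nSummary:"
            return complete(prompt)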

    DESIGN FOR FAST REQUEST FULFILLMENT OR NATURAL INTERACTION? INSIGHTS FROM AN EXPERIMENT WITH A CONVERSATIONAL AGENT

    Get PDF
    Conversational agents continue to permeate our lives in different forms, such as virtual assistants on mobile devices or chatbots on websites and social media. The interaction with users through natural language offers various aspects for researchers to study as well as application domains for practitioners to explore. In particular, their design represents an interesting phenomenon to investigate, as humans show social responses to these agents and successful design remains a challenge in practice. Compared to digital human-to-human communication, text-based conversational agents can provide complementary, preset answer options with which users can conveniently and quickly respond in the interaction. However, their use might also decrease the perceived humanness and social presence of the agent, as the user does not respond naturally by thinking of and formulating a reply. In this study, we conducted an experiment with N=80 participants in a customer service context to explore the impact of such elements on agent anthropomorphism and user satisfaction. The results show that their use reduces perceived humanness and social presence yet does not significantly increase service satisfaction. On the contrary, our findings indicate that preset answer options might even be detrimental to service satisfaction as they diminish the natural feel of human-CA interaction.