MUG: Interactive Multimodal Grounding on User Interfaces
We present MUG, a novel interactive task for multimodal grounding where a
user and an agent work collaboratively on an interface screen. Prior works
modeled multimodal UI grounding in one round: the user gives a command and the
agent responds to the command. Yet, in a realistic scenario, a user command can
be ambiguous when the target action is inherently difficult to articulate in
natural language. MUG allows multiple rounds of interactions such that upon
seeing the agent responses, the user can give further commands for the agent to
refine or even correct its actions. Such interaction is critical for improving
grounding performances in real-world use cases. To investigate the problem, we
create a new dataset that consists of 77,820 sequences of human user-agent
interaction on mobile interfaces in which 20% involves multiple rounds of
interactions. To establish our benchmark, we experiment with a range of
modeling variants and evaluation strategies, including both offline and online
evaluation; the online strategy consists of both human evaluation and automatic
evaluation with simulators. Our experiments show that allowing iterative interaction
significantly improves the absolute task completion by 18% over the entire test
dataset and 31% over the challenging subset. Our results lay the foundation for
further investigation of the problem
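The iterative command-action-feedback loop described in the abstract can be sketched as follows; the `agent` and `user` callables and the button-matching logic are hypothetical stand-ins, not the authors' models or dataset:

```python
# Hypothetical sketch of a multi-round grounding loop (not the MUG authors' code).
# The agent proposes a UI action for a command; the user either accepts it
# (returns None) or issues a refining follow-up command.

def interactive_grounding(agent, user, command, max_rounds=5):
    """Run up to `max_rounds` of command -> action -> feedback."""
    history = []
    for _ in range(max_rounds):
        action = agent(command, history)  # ground the command to a UI action
        history.append((command, action))
        feedback = user(action)           # None means the user is satisfied
        if feedback is None:
            return action, history
        command = feedback                # refine/correct in the next round
    return action, history

# Toy example: the agent picks a button by exact label match; the user corrects once.
buttons = ["Save", "Save As", "Cancel"]

def agent(cmd, hist):
    return next((b for b in buttons if b.lower() == cmd.lower()), buttons[0])

def user(action):
    return "Save As" if action == "Save" else None

final, hist = interactive_grounding(agent, user, "save")
print(final)  # "Save As" after one corrective round
```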
CAMMD: Context Aware Mobile Medical Devices
Telemedicine applications on a medical practitioner's mobile device should be context-aware. This can vastly improve the effectiveness of mobile applications and is a step towards realising the vision of a ubiquitous telemedicine environment. The nomadic nature of a medical practitioner emphasises location, activity and time as key context-aware elements. An intelligent middleware is needed to effectively interpret and exploit these contextual elements. This paper proposes an agent-based architectural solution called Context-Aware Mobile Medical Devices (CAMMD). This framework can proactively communicate patient records to a portable device based upon the active context of its medical practitioner. An expert system is utilised to cross-reference the context-aware data of location and time against a practitioner's work schedule. This proactive distribution of medical data enhances the usability and portability of mobile medical devices. The proposed methodology alleviates constraints on memory storage and enhances user interaction with the handheld device. The framework also improves utilisation of network bandwidth resources. An experimental prototype is presented highlighting the potential of this approach
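A minimal sketch of the kind of cross-referencing the expert system performs; the schedule format and field names are assumptions for illustration, not the paper's implementation:

```python
# Hypothetical sketch of CAMMD-style proactive record selection: match the
# practitioner's active context (location, time) against a work schedule to
# decide which patient records to push to the handheld device.
from datetime import time

schedule = [
    {"location": "Ward A", "start": time(9, 0),  "end": time(11, 0), "patients": ["P001", "P002"]},
    {"location": "Clinic", "start": time(11, 0), "end": time(13, 0), "patients": ["P003"]},
]

def records_for_context(location, now, schedule):
    """Return patient IDs whose scheduled slot matches the active context."""
    return [p for entry in schedule
            if entry["location"] == location and entry["start"] <= now < entry["end"]
            for p in entry["patients"]]

print(records_for_context("Ward A", time(10, 30), schedule))  # ['P001', 'P002']
```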
A fully-distributed, multiagent approach to negotiation in mobile ad-hoc networks
This paper presents an interaction protocol intended to be used in distributed negotiation problems using software agents, which could be applied to multi-agent systems deployed over Personal Digital Assistants (PDAs) connected via wireless networks. We are especially interested in semi-competitive scenarios, where each agent in the system acts on behalf of a user, trying to maximize its user's preferences while pursuing a common agreement. In these conditions, and especially if we are dealing with open and dynamic environments like mobile ad-hoc networks, the goals and attitudes of software agents cannot be guaranteed. Taking this into account, we propose a protocol where interaction among agents is done in a fully-distributed manner, so that no user can have negotiation privileges over the others
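One way to picture a protocol with no negotiation privileges is to have every agent apply the same deterministic decision rule to the same broadcast data; this sketch is illustrative only and far simpler than the paper's protocol:

```python
# Hypothetical sketch of one fully-distributed agreement step. Every agent
# broadcasts a proposal and its utilities; each agent then applies the same
# deterministic rule locally, so no agent can bias the outcome.

proposals = {"alice": "meet_9am", "bob": "meet_11am", "carol": "meet_10am"}
utilities = {
    "alice": {"meet_9am": 1.0, "meet_10am": 0.7, "meet_11am": 0.2},
    "bob":   {"meet_9am": 0.1, "meet_10am": 0.6, "meet_11am": 1.0},
    "carol": {"meet_9am": 0.4, "meet_10am": 1.0, "meet_11am": 0.5},
}

def agree(proposals, utilities):
    """Pick the proposal with the highest total utility; ties break
    alphabetically so every agent computes the same winner independently."""
    options = sorted(set(proposals.values()))
    return max(options, key=lambda o: (sum(u[o] for u in utilities.values()), o))

print(agree(proposals, utilities))  # 'meet_10am' (highest total utility, 2.3)
```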
Elckerlyc goes mobile - Enabling natural interaction in mobile user interfaces
The fast growth of computational resources and speech technology available on mobile devices makes it possible to entertain users of these devices in having a natural dialogue with service systems. These systems are sometimes perceived as social agents and this can be supported by presenting them on the interface by means of an animated embodied conversational agent. To take full advantage of the power of embodied conversational agents in service systems it is important to support real-time, online and responsive interaction with the system through the embodied conversational agent. The design of responsive animated conversational agents is a daunting task. Elckerlyc is a model-based platform for the specification and animation of synchronised multi-modal responsive animated agents. This paper presents a new light-weight PictureEngine that allows this platform to run in mobile applications. We describe the integration of the PictureEngine in the user interface of two different coaching applications and discuss the findings from user evaluations. We also conducted a study to evaluate an editing tool for the specification of the agent’s communicative behaviour. Twenty-one participants had to specify the behaviour of an embodied conversational agent using the PictureEngine. We may conclude that this new lightweight back-end engine for the Elckerlyc platform makes it easier to build embodied conversational interfaces for mobile devices
Landmarks: Navigating Spacetime and Digital Mobility
In this essay we will examine how we can conceptualize digital mobility as spatial navigation. Digital mobility occurs in media where the user navigates through space and simultaneously becomes creator, performer, and navigator of a spatial story. In this sense, the on-screen navigator simultaneously makes and reads space. We argue that in digital mobilities the user/player becomes at once I-narrator, actor and agent of narrative. The user navigates through space and becomes, in fact, a digital pedestrian. Digital mobility differs from the (virtual) mobility of analogue moving-image media in that the interaction between user and space is much more fluid and the user becomes both actor and navigator; it is clearly central to the use of mobile screens, such as mobile phones, navigation devices, or portable game consoles, where one carries the screen and interacts with it while on the move. Moreover, we also believe that digital mobility can be a central quality of certain digital practices during which users are not literally on the move but still have to navigate through, and control, digital environments through spatial interaction. This can for example be the case when playing certain games or consulting Google Earth on a desktop computer
Elckerlyc goes mobile: enabling technology for ECAs in mobile applications
The fast growth of computational resources and speech technology available on mobile devices makes it possible for users of these devices to interact with service systems through natural dialogue. These systems are sometimes perceived as social agents and presented by means of an animated embodied conversational agent (ECA). To take full advantage of the power of ECAs in service systems, it is important to support real-time, online and responsive interaction with the system through the ECA. The design of responsive animated conversational agents is a daunting task. Elckerlyc is a model-based platform for the specification and animation of synchronised multimodal responsive animated agents. This paper presents a new light-weight PictureEngine that allows this platform to embed an ECA in the user interface of mobile applications. The ECA can be specified by using the behavior markup language (BML). An application and user evaluations of Elckerlyc and the PictureEngine in a mobile embedded digital coach are presented
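Since the ECA is specified with BML, a minimal fragment of the kind of document involved can be built as below; the element and attribute names follow the public BML 1.0 draft and may differ from Elckerlyc's exact dialect:

```python
# Illustrative only: build a tiny BML document (speech plus a beat gesture
# synchronised to the speech start). Not Elckerlyc's own tooling.
import xml.etree.ElementTree as ET

NS = "http://www.bml-initiative.org/bml/bml-1.0"
ET.register_namespace("", NS)

bml = ET.Element(f"{{{NS}}}bml", id="bml1")
speech = ET.SubElement(bml, f"{{{NS}}}speech", id="s1")
ET.SubElement(speech, f"{{{NS}}}text").text = "Hello, how can I help?"
# A beat gesture whose start is aligned to the start of the speech behaviour.
ET.SubElement(bml, f"{{{NS}}}gesture", id="g1", lexeme="BEAT", start="s1:start")

print(ET.tostring(bml, encoding="unicode"))
```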
Enabling Conversational Interaction with Mobile UI using Large Language Models
Conversational agents show promise in allowing users to interact with mobile
devices using language. However, to perform diverse UI tasks with natural
language, developers typically need to create separate datasets and models for
each specific task, which is expensive and effort-intensive. Recently,
pre-trained large language models (LLMs) have been shown capable of
generalizing to various downstream tasks when prompted with a handful of
examples from the target task. This paper investigates the feasibility of
enabling versatile conversational interactions with mobile UIs using a single
LLM. We propose a design space to categorize conversations between the user and
the agent when collaboratively accomplishing mobile tasks. We design prompting
techniques to adapt an LLM to conversational tasks on mobile UIs. The
experiments show that our approach enables various conversational interactions
with decent performance, demonstrating its feasibility. We discuss the use cases
of our work and its implications for language-based mobile interaction
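A sketch of the few-shot prompting idea, using screen summarisation as the task; the prompt format, exemplars, and the `call_llm` placeholder are assumptions for illustration rather than the paper's actual prompts or models:

```python
# Hypothetical sketch of few-shot prompting an LLM for a mobile UI task:
# serialise a few (screen, summary) exemplars, then append the target screen.

def build_prompt(examples, screen):
    """Build a few-shot prompt from exemplars plus the target screen."""
    parts = [f"Screen: {ex_screen}\nSummary: {ex_summary}"
             for ex_screen, ex_summary in examples]
    parts.append(f"Screen: {screen}\nSummary:")
    return "\n\n".join(parts)

examples = [
    ("[Button 'Sign in'] [TextField 'Email'] [TextField 'Password']",
     "A login screen asking for email and password."),
    ("[Image 'cover'] [Button 'Play'] [Button 'Shuffle']",
     "A music player screen with play and shuffle controls."),
]

prompt = build_prompt(examples, "[Button 'Checkout'] [List '3 items']")
print(prompt)
# The prompt would then be sent to a model, e.g. response = call_llm(prompt)
```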
DESIGN FOR FAST REQUEST FULFILLMENT OR NATURAL INTERACTION? INSIGHTS FROM AN EXPERIMENT WITH A CONVERSATIONAL AGENT
Conversational agents continue to permeate our lives in different forms, such as virtual assistants on mobile devices or chatbots on websites and social media. The interaction with users through natural language offers various aspects for researchers to study as well as application domains for practitioners to explore. In particular, their design represents an interesting phenomenon to investigate as humans show social responses to these agents and successful design remains a challenge in practice. Compared to digital human-to-human communication, text-based conversational agents can provide complementary, preset answer options with which users can conveniently and quickly respond in the interaction. However, their use might also decrease the perceived humanness and social presence of the agent as the user does not respond naturally by thinking of and formulating a reply. In this study, we conducted an experiment with N=80 participants in a customer service context to explore the impact of such elements on agent anthropomorphism and user satisfaction. The results show that their use reduces perceived humanness and social presence yet does not significantly increase service satisfaction. On the contrary, our findings indicate that preset answer options might even be detrimental to service satisfaction as they diminish the natural feel of human-agent interaction
mPower: A component-based development framework for multi-agent systems to support business processes
One of the obstacles preventing the widespread adoption of multi-agent systems in industry is the difficulty of implementing heterogeneous interactions among participating agents via asynchronous messages. This difficulty arises from the need to understand how to combine elements of various content languages, ontologies, and interaction protocols in order to construct meaningful and appropriate messages. In this paper mPower, a component-based layered framework for easing the development of multi-agent systems, is described, and the facility for customising the components for reuse in similar domains is explained. The framework builds on the JADE-LEAP platform, which provides a homogeneous layer over diverse operating systems and hardware devices, and allows ubiquitous deployment of applications built on multi-agent systems both in wired and wireless environments. The use of the framework to develop mPowermobile, a multi-agent system to support mobile workforces, is reported
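The message-assembly burden described here, combining an interaction protocol, an ontology and a content language into one well-formed agent message, can be illustrated with a small sketch; the field names follow FIPA ACL conventions and are not mPower's actual API:

```python
# Hypothetical sketch of agent-message assembly (FIPA-ACL-style fields, for
# illustration only; mPower's component interfaces are not shown here).

def make_message(performative, sender, receiver, ontology, language, content):
    """Combine protocol, ontology and content-language elements into one message."""
    return {
        "performative": performative,  # speech act, e.g. FIPA "request"
        "sender": sender,
        "receiver": receiver,
        "ontology": ontology,          # shared vocabulary both agents understand
        "language": language,          # content language, e.g. "fipa-sl"
        "content": content,
    }

msg = make_message("request", "dispatcher", "field-worker-7",
                   "workforce-ontology", "fipa-sl", "(assign task-42)")
print(msg["performative"])  # request
```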