
    Multimodal agent interfaces and system architectures for health and fitness companions

    Multimodal conversational spoken dialogues using physical and virtual agents provide a potential interface to motivate and support users in the domain of health and fitness. In this paper we present how such multimodal conversational Companions can be implemented to support their owners in various pervasive and mobile settings. In particular, we focus on different forms of multimodality and system architectures for such interfaces.
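    The abstract leaves the system architecture at a high level. As a purely illustrative sketch (component and method names below are assumptions, not the paper's design), one common arrangement routes recognised events from several input modalities into a single dialogue manager, which fans its decision out to speech and an embodied agent:

```python
# Hypothetical sketch of a multimodal Companion pipeline; all names are
# illustrative assumptions, not the architecture described in the paper.
from dataclasses import dataclass

@dataclass
class InputEvent:
    modality: str     # e.g. "speech", "gesture", "sensor"
    payload: str

class DialogueManager:
    """Fuses events from several modalities and decides on a response."""
    def respond(self, events: list[InputEvent]) -> dict:
        fused = ", ".join(f"{e.modality}: {e.payload}" for e in events)
        # A real Companion would consult user goals, fitness history, etc.
        return {"speech": f"Nice work ({fused}).",
                "avatar": "encouraging_nod"}

manager = DialogueManager()
reply = manager.respond([InputEvent("speech", "finished my run"),
                         InputEvent("sensor", "heart rate 142")])
print(reply["speech"])   # rendered via TTS and agent animation in a full system
```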

    Tools for modelling support and construction of optimization applications

    We argue the case for an open systems approach towards modelling and application support. We discuss how the 'usability' and 'skills' analysis naturally leads to a viable strategy for integrating application construction with modelling tools and optimizers. The role of the implementation environment is also seen to be critical, in that it is retained as a building block within the resulting system.
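    The abstract does not specify an API, but the integration it argues for can be pictured as a modelling layer that hands a declarative model to an interchangeable optimizer. The sketch below is a minimal illustration under that assumption; every class and method name is hypothetical:

```python
# Hypothetical sketch: a modelling layer kept separate from the optimizer,
# so either can be swapped out within the resulting system.
class Model:
    def __init__(self):
        self.variables = {}       # name -> lower bound
        self.constraints = []     # kept as opaque records in this toy
        self.objective = None

    def var(self, name, lower=0.0):
        self.variables[name] = lower
        return name

    def add(self, constraint):
        self.constraints.append(constraint)

    def minimize(self, objective):
        self.objective = objective

class TrivialSolver:
    """Stand-in for any optimizer plugged in behind the modelling tool."""
    def solve(self, model):
        # A real solver would run e.g. simplex; for this pure lower-bound
        # minimisation toy, the bounds themselves are the optimum.
        return dict(model.variables)

m = Model()
x, y = m.var("x"), m.var("y", lower=1.0)
m.add((x, "+", y, "<=", 10.0))
m.minimize((x, "+", y))
print(TrivialSolver().solve(m))   # {'x': 0.0, 'y': 1.0}
```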

    Pictorial Socratic dialogue and conceptual change

    Counter-examples used in a Socratic dialogue aim to provoke reflection and thereby effect conceptual change. However, natural language forms of Socratic dialogue have their limitations. To address this problem, we propose an alternative form of Socratic dialogue called the pictorial Socratic dialogue. A Spring Balance System has been designed to provide a platform for investigating the effects of this pedagogy on conceptual change. This system allows learners to run and observe an experiment. Qualitative Cartesian graphs are employed for learners to represent their solutions. Indirect and intelligent feedback is prescribed through two approaches in the pictorial Socratic dialogue, which aim to provoke learners to probe through the perceptual, structural features of the problem and solution into the deeper level of the simulation, where Archimedes’ Principle governs.
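    For readers unfamiliar with the physics the Spring Balance System simulates: Archimedes' Principle says the buoyant force on a submerged object equals the weight of the displaced fluid, so the spring reads the object's weight minus that force. The numbers below are illustrative values, not taken from the paper:

```python
# Worked example of Archimedes' Principle behind a spring-balance reading.
# All quantities are illustrative, not values from the paper's system.
rho_fluid = 1000.0   # water density, kg/m^3
g = 9.81             # gravitational acceleration, m/s^2
mass = 0.5           # object mass, kg
volume = 2e-4        # submerged volume, m^3

weight = mass * g                         # 4.905 N
buoyant_force = rho_fluid * g * volume    # F_b = rho * g * V = 1.962 N
spring_reading = weight - buoyant_force   # apparent weight: 2.943 N
print(f"spring reads {spring_reading:.3f} N instead of {weight:.3f} N")
```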

    Exploiting visual salience for the generation of referring expressions

    In this paper we present a novel approach to generating referring expressions (GRE) that is tailored to a model of the visual context the user is attending to. The approach integrates a new computational model of visual salience in simulated 3-D environments with Dale and Reiter’s (1995) Incremental Algorithm. The advantages of our GRE framework are: (1) the context set used by the GRE algorithm is dynamically computed by the visual salience algorithm as a user navigates through a simulation; (2) the integration of visual salience into the generation process means that in some instances underspecified but sufficiently detailed descriptions of the target object are generated that are shorter than those generated by GRE algorithms which focus purely on adjectival and type attributes; (3) the integration of visual salience into the generation process means that our GRE algorithm will in some instances succeed in generating a description of the target object in situations where GRE algorithms which focus purely on adjectival and type attributes fail.
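    Dale and Reiter's (1995) Incremental Algorithm itself is standard: attributes are considered in a fixed preference order and kept only if they rule out at least one distractor. The sketch below shows that loop with a salience-filtered context set standing in for the paper's 3-D visual-salience model; the cutoff value and attribute names are illustrative assumptions:

```python
# Incremental Algorithm (Dale & Reiter 1995) with a salience-filtered
# context set; the cutoff and attribute set are illustrative assumptions.
def incremental(target, scene, preferred=("type", "colour", "size"),
                salience_cutoff=0.5):
    # Only sufficiently salient objects count as distractors (cf. point 1).
    distractors = [o for o in scene
                   if o is not target and o["salience"] >= salience_cutoff]
    description = {}
    for attr in preferred:
        if not distractors:
            break
        ruled_out = [o for o in distractors if o.get(attr) != target.get(attr)]
        if ruled_out:                        # attribute discriminates
            description[attr] = target[attr]
            distractors = [o for o in distractors if o not in ruled_out]
    if distractors:
        return None                          # no distinguishing description
    description.setdefault("type", target["type"])  # always realise the head noun
    return description

scene = [{"type": "ball", "colour": "red",  "size": "small", "salience": 0.9},
         {"type": "ball", "colour": "blue", "size": "small", "salience": 0.8},
         {"type": "box",  "colour": "red",  "size": "large", "salience": 0.2}]
# The low-salience box never enters the context set, so "the red ball"
# suffices: {'colour': 'red', 'type': 'ball'}
print(incremental(scene[0], scene))
```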

    Agent Assistance: From Problem Solving to Music Teaching

    We report on our research on agents that act and behave in a web learning environment. This research is part of a general approach to agents acting and behaving in virtual environments, where they are involved in providing information, performing transactions, demonstrating products and, more generally, assisting users or visitors of the web environment in doing what they want or have been asked to do. While initially we provided our agents with hardly any 'teaching knowledge', we are now in the process of making such knowledge explicit, especially in models that take into account that assisting and teaching take place in a visualized and information-rich environment. Our main (embodied) tutor-agent is called Jacob; it knows about the Towers of Hanoi, a well-known problem that is offered to CS students to learn about recursion. Other agents we are working on assist a visitor in navigating a virtual world or help the visitor obtain information. We are now designing a music teacher, drawing on knowledge of software engineering and multi-modal interaction design from previous projects.
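    The Towers of Hanoi puzzle that Jacob tutors is a standard vehicle for teaching recursion because its optimal solution is naturally recursive; a textbook version (not the Jacob agent's own code) looks like this:

```python
# Textbook recursive Towers of Hanoi -- the problem the Jacob agent tutors.
def hanoi(n, source, target, spare):
    """Move n discs from source to target, using spare as a buffer."""
    if n == 0:
        return
    hanoi(n - 1, source, spare, target)   # clear the way for the largest disc
    print(f"move disc {n}: {source} -> {target}")
    hanoi(n - 1, spare, target, source)   # rebuild the smaller tower on top

hanoi(3, "A", "C", "B")   # 2**3 - 1 = 7 moves
```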

    Desiderata for an Every Citizen Interface to the National Information Infrastructure: Challenges for NLP

    In this paper, I provide desiderata for an interface that would enable ordinary people to properly access the capabilities of the NII. I identify some of the technologies that will be needed to achieve these desiderata, and discuss current and future research directions that could lead to the development of such technologies. In particular, I focus on the ways in which theory and techniques from natural language processing could contribute to future interfaces to the NII. The evolving national information infrastructure (NII) has made available a vast array of on-line services and networked information resources in a variety of forms (text, speech, graphics, images, video). At the same time, advances in computing and telecommunications technology have made it possible for an increasing number of households to own (or lease or use) powerful personal computers that are connected to this resource. Accompanying this progress is the expectation that people will be able to more..

    Generating Explanatory Captions for Information Graphics

    Graphical presentations can be used to communicate information in relational data sets succinctly and effectively. However, novel graphical presentations about numerous attributes and their relationships are often difficult to understand completely until explained. Automatically generated graphical presentations must therefore either be limited to simple, conventional ones, or risk incomprehensibility. One way of alleviating this problem is to design graphical presentation systems that can work in conjunction with a natural language generator to produce "explanatory captions." This paper presents three strategies for generating explanatory captions to accompany information graphics, based on: (1) a representation of the structure of the graphical presentation, (2) a framework for identifying the perceptual complexity of graphical elements, and (3) the structure of the data expressed in the graphic. We describe an implemented system and illustrate how it is used to generate explanatory captions.
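    The three strategies are not spelled out in this abstract; purely to illustrate the flavour of strategy (1), a toy generator can verbalise the structure of the graphic, i.e. its axis-to-attribute mappings. All field names below are hypothetical:

```python
# Toy caption generator in the spirit of strategy (1): verbalise the
# structure of the graphic. The field names are hypothetical.
def structural_caption(graphic):
    parts = [f"the {axis}-axis shows {attribute}"
             for axis, attribute in graphic["axes"].items()]
    return (f"This {graphic['kind']} presents {graphic['dataset']}: "
            + "; ".join(parts) + ".")

chart = {"kind": "scatter plot",
         "dataset": "last year's house sales",
         "axes": {"x": "asking price", "y": "days on the market"}}
print(structural_caption(chart))
# This scatter plot presents last year's house sales: the x-axis shows
# asking price; the y-axis shows days on the market.
```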

    Reference resolution in multi-modal interaction: Preliminary observations

    In this paper we present our research on multimodal interaction in and with virtual environments. The aim of this presentation is to emphasize the need for more research on reference resolution in multimodal contexts. In multimodal interaction, the human conversational partner can use more than one modality to convey a message to an environment in which a computer detects and interprets signals from the different modalities. We show some naturally arising problems but do not give general solutions. Rather, we have decided to perform more detailed research on reference resolution in uni-modal contexts, in order to obtain methods that generalize to multi-modal contexts. Since we are building applications for a Dutch audience, and since hardly any research has been done on reference resolution for Dutch, we give results on the resolution of anaphoric and deictic references in Dutch texts. We hope to be able to extend these results to our multimodal contexts later.
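    As a minimal illustration of what uni-modal reference resolution computes, a naive baseline (not the paper's method for Dutch) simply picks the most recent antecedent that agrees in gender and number:

```python
# Naive recency-based anaphora resolver -- a toy baseline, not the paper's
# approach; the mentions and feature values are illustrative.
def resolve(pronoun, mentions):
    """Return the most recent mention agreeing in gender and number."""
    for candidate in reversed(mentions):          # latest mention first
        if (candidate["gender"] == pronoun["gender"]
                and candidate["number"] == pronoun["number"]):
            return candidate["text"]
    return None

mentions = [{"text": "de man",    "gender": "m", "number": "sg"},
            {"text": "de vrouw",  "gender": "f", "number": "sg"},
            {"text": "de boeken", "gender": "n", "number": "pl"}]
print(resolve({"gender": "f", "number": "sg"}, mentions))   # 'de vrouw'
```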

    Generating multimedia presentations: from plain text to screenplay

    In many Natural Language Generation (NLG) applications, the output is limited to plain text, i.e., a string of words with punctuation and paragraph breaks, but with no indications of layout, pictures, or dialogue. In several projects, we have begun to explore NLG applications in which these extra media are brought into play. This paper gives an informal account of what we have learned. For coherence, we focus on the domain of patient information leaflets, and follow an example in which the same content is expressed first in plain text, then in formatted text, then in text with pictures, and finally in a dialogue script that can be performed by two animated agents. We show how the same meaning can be mapped to realisation patterns in different media, and how the expanded options for expressing meaning are related to the perceived style and tone of the presentation. Throughout, we stress that the extra media are not simply added to plain text but integrated with it: thus the use of formatting, pictures, or dialogue may require radical rewording of the text itself.
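    The paper's realisation machinery is not reproduced here, but the core idea of mapping one meaning to media-specific realisation patterns, including the rewording that a change of medium forces, can be sketched as follows. The message content and all names are hypothetical:

```python
# Hypothetical sketch: one message realised in three media. Note that the
# dialogue version rewords the content rather than decorating the plain text.
message = {"drug": "Elixir", "dose": "one tablet", "frequency": "twice a day"}

def plain_text(m):
    return f"Take {m['dose']} of {m['drug']} {m['frequency']}."

def formatted_text(m):
    return (f"{m['drug'].upper()}\n"
            f"  dose:      {m['dose']}\n"
            f"  how often: {m['frequency']}")

def dialogue_script(m):
    # Two animated agents perform the same content as question and answer.
    return [("Patient", f"How should I take {m['drug']}?"),
            ("Doctor",  f"{m['dose'].capitalize()}, {m['frequency']}.")]

print(plain_text(message))
print(formatted_text(message))
for speaker, line in dialogue_script(message):
    print(f"{speaker}: {line}")
```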