    Vision-based interaction within a multimodal framework

    Our contribution is to the field of video-based interaction techniques and is integrated into the home environment of the EMBASSI project. This project addresses innovative methods of man-machine interaction achieved through the development of intelligent assistance and anthropomorphic user interfaces. Within this project, multimodal techniques are a basic requirement, especially those related to the integration of modalities. We use a stereoscopic approach to allow the natural selection of devices via pointing gestures. The pointing hand is segmented from the video images, and the 3D position and orientation of the forefinger are calculated. This modality is subsequently integrated with speech in the context of a multimodal interaction infrastructure. In a first phase, we use semantic fusion with amodal input, considering the modalities in a so-called late-fusion state.
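
    As a concrete illustration of this late-fusion step, the sketch below merges a spoken command with a forefinger ray into a single amodal event: the two inputs are fused only if they are close in time, and the pointed-at device is the one whose position lies nearest the ray. This is a minimal sketch; the device registry, coordinates, and time window are hypothetical assumptions, not the EMBASSI implementation.

        import numpy as np

        # Hypothetical device registry: name -> 3D position in room coordinates (metres).
        DEVICES = {"lamp": np.array([1.0, 2.0, 0.8]),
                   "tv":   np.array([3.5, 0.5, 1.2])}

        def point_to_ray_distance(p, origin, direction):
            """Shortest distance from point p to the forefinger ray."""
            d = direction / np.linalg.norm(direction)
            v = p - origin
            t = max(np.dot(v, d), 0.0)   # only consider targets in front of the finger
            return np.linalg.norm(v - t * d)

        def resolve_pointing(origin, direction):
            """Return the device whose position lies closest to the pointing ray."""
            return min(DEVICES, key=lambda n: point_to_ray_distance(DEVICES[n], origin, direction))

        def fuse(speech_event, pointing_event, max_skew=1.5):
            """Late semantic fusion: merge a spoken command with a pointing
            gesture if the two events lie within max_skew seconds."""
            if abs(speech_event["t"] - pointing_event["t"]) > max_skew:
                return None   # too far apart in time to belong together
            target = resolve_pointing(pointing_event["origin"], pointing_event["direction"])
            return {"action": speech_event["command"], "target": target}

        # "Switch that on" while pointing roughly at the lamp:
        print(fuse({"t": 10.2, "command": "switch_on"},
                   {"t": 10.5, "origin": np.array([2.0, 1.0, 1.4]),
                    "direction": np.array([-1.0, 1.0, -0.5])}))
        # -> {'action': 'switch_on', 'target': 'lamp'}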

    User interfaces for anyone anywhere

    In a global context of multimodal man-machine interaction, we approach a wide spectrum of fields, such as software engineering, intelligent communication, and speech dialogues. This paper presents technological aspects of the shift from traditional desktop interfaces to more expressive, natural, flexible, and portable ones, with which more people, in a greater number of situations, will be able to interact with computers. Speech appears to be one of the best forms of interaction, especially for supporting non-skilled users. Modalities such as speech, among others, tend to be very relevant to accessing information in our future society, in which mobile devices will play a preponderant role. We therefore place an emphasis on verbal communication in open environments (Java/XML) using software agent technology.
    Funding: Fundação para a Ciência e a Tecnologia – PRAXIS XXI/BD/20095/99; Germany, Ministry of Science and Education – EMBASSI – 01IL90.

    Multimodal interaction with mobile devices: fusing a broad spectrum of modality combinations

    This dissertation presents a multimodal architecture for use in mobile scenarios such as shopping and navigation. It also analyses a wide range of feasible modality input combinations for these contexts. For this purpose, two interlinked demonstrators were designed for stand-alone use on mobile devices. Of particular importance was the design and implementation of a modality fusion module capable of combining input from a range of communication modes like speech, handwriting, and gesture. The implementation is able to account for confidence value biases arising within and between modalities and also provides a method for resolving semantically overlapped input. Tangible interaction with real-world objects and symmetric multimodality are two further themes addressed in this work. The work concludes with the results from two usability field studies that provide insight on user preference and modality intuition for different modality combinations, as well as user acceptance for anthropomorphized objects.
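
    The two core mechanisms named above, confidence-value biases and resolution of semantically overlapped input, can be sketched in a few lines. The per-modality weights, slot names, and scores below are illustrative assumptions, not the dissertation's actual fusion module.

        # Per-modality bias weights: recognizers are not equally reliable, so raw
        # confidence scores are rescaled before being compared across modalities.
        MODALITY_WEIGHT = {"speech": 0.9, "handwriting": 0.7, "gesture": 0.6}

        def fuse(hypotheses):
            """Each hypothesis fills one semantic slot: (modality, slot, value, confidence).
            Semantically overlapped input (two modalities filling the same slot) is
            resolved by keeping the hypothesis with the highest adjusted confidence."""
            best = {}
            for modality, slot, value, conf in hypotheses:
                score = conf * MODALITY_WEIGHT[modality]
                if slot not in best or score > best[slot][1]:
                    best[slot] = (value, score)
            return {slot: value for slot, (value, _) in best.items()}

        # Speech and gesture both fill the "product" slot; speech wins after weighting.
        print(fuse([("speech",      "product", "coffee", 0.80),   # adjusted: 0.72
                    ("gesture",     "product", "tea",    0.90),   # adjusted: 0.54
                    ("handwriting", "amount",  "2",      0.95)]))
        # -> {'product': 'coffee', 'amount': '2'}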

    Actas do 10º Encontro Português de Computação Gráfica

    Proceedings of the 10th Encontro Português de Computação Gráfica, Lisbon, 1-3 October 2001. Research, development, and teaching in the field of computer graphics are, in Portugal, a positive reality with long traditions. The Encontro Português de Computação Gráfica (EPCG), held within the scope of the activities of the Grupo Português de Computação Gráfica (GPCG), has regularly brought together all who work in this broad field with countless applications, ever since the 1st EPCG, also held in Lisbon, back in July 1988. For the first time in the history of these meetings, the 10th EPCG was organized in close cooperation with the image processing and computer vision communities, through the Associação Portuguesa de Reconhecimento de Padrões (APRP), thereby underlining the growing collaboration, and convergence, between those two areas and computer graphics. This is the proceedings volume of the 10th EPCG.

    Ambient hues and audible cues: An approach to automotive user interface design using multi-modal feedback

    The use of touchscreen interfaces for in-vehicle information and entertainment, and for the control of comfort settings, is proliferating. Moreover, using these interfaces requires the same visual and manual resources needed for safe driving. Guided by much of the prevalent research on the human visual system, attention, and multimodal redundancy, the Hues and Cues design paradigm was developed to make touchscreen automotive user interfaces more suitable for use while driving. This paradigm was applied to a prototype of an automotive user interface and evaluated with respect to driver performance using the dual-task Lane Change Test (LCT). Each level of the design paradigm was evaluated in light of possible gender differences. The results of the repeated-measures experiment suggest that, compared to interfaces without the Hues and Cues paradigm applied, the Hues and Cues interface requires less mental effort to operate, is more usable, and is more preferred. The results differ, however, in the degradation of driver performance: interfaces with only visual feedback produced better task times, while interfaces with only auditory feedback showed significant gender differences in the driving task. Overall, the results show that the presentation of multimodal feedback can be useful in designing automotive interfaces, but must be flexible enough to account for individual differences.
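
    The core mechanism of the paradigm, redundant presentation of one interface state across a visual and an auditory channel, might be sketched as follows. The event fields, colors, and earcons are hypothetical stand-ins, not the study's prototype code.

        from dataclasses import dataclass

        @dataclass
        class TouchEvent:
            control: str   # which on-screen control was touched
            valid: bool    # whether the touch hit an active control

        def ambient_hue(event):
            """Visual channel: an ambient glow encodes accept/reject, readable
            with peripheral vision so the eyes can stay on the road."""
            print(f"hue: {'green' if event.valid else 'red'} glow near {event.control}")

        def audible_cue(event):
            """Auditory channel: a short earcon redundantly encodes the same state."""
            print(f"cue: {'confirm-beep' if event.valid else 'error-buzz'}")

        def on_touch(event, channels=(ambient_hue, audible_cue)):
            # Redundant multimodal feedback: every enabled channel signals the
            # same state; channels can be dropped to suit individual differences.
            for feedback in channels:
                feedback(event)

        on_touch(TouchEvent(control="fan-speed", valid=True))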

    A dynamic multi-application dialog engine for task-oriented voice user interfaces

    This thesis introduces the Dymalog framework for spoken language dialog systems, which separates the applications from the actual dialog system. It facilitates the control of a plurality of applications through a single dialog system, changeable at run time. This is achieved by application-independent knowledge processing inside the dialog system, based on a hierarchical representation of obtained information (o²I-Trees). The approach enables the realization of generic dialog functionalities. Dymalog is composed of a collection of components, each serving mainly a single purpose. It fosters the generation of competing hypotheses during the processing of the user input in order to derive an optimal interpretation at a given processing stage. The Marvin dialog system puts Dymalog into practice. We discuss selected interactions with various applications enabled for operation through the system. The parameterized hypothesis selection process is considered in detail, in particular the parameter estimation algorithm Grail, as is the generation of competing hypotheses for the user input.
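
    The parameterized selection among competing hypotheses can be illustrated as a weighted feature score. The features and weights below are illustrative assumptions; in Dymalog the parameters are estimated by Grail, whose details are not reproduced here.

        # Score competing interpretations of one utterance with a weighted feature
        # sum and keep the best; the weights stand in for estimated parameters.
        WEIGHTS = {"asr_confidence": 0.5,   # recognizer confidence
                   "slot_coverage":  0.3,   # fraction of the utterance bound to slots
                   "context_fit":    0.2}   # agreement with the current dialog state

        def score(hypothesis):
            return sum(w * hypothesis["features"][f] for f, w in WEIGHTS.items())

        def select(hypotheses):
            """Return the interpretation with the highest parameterized score."""
            return max(hypotheses, key=score)

        competing = [
            {"interpretation": "play(artist='Marvin Gaye')",
             "features": {"asr_confidence": 0.70, "slot_coverage": 0.9, "context_fit": 0.8}},
            {"interpretation": "call(contact='Marvin')",
             "features": {"asr_confidence": 0.75, "slot_coverage": 0.5, "context_fit": 0.2}},
        ]
        print(select(competing)["interpretation"])   # -> play(artist='Marvin Gaye')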

    An Integrated Formal Task Specification Method for Smart Environments

    This thesis is concerned with the development of interactive systems for smart environments. In such scenarios, different interaction paradigms need to be supported, and corresponding methods and development strategies need to be applied, covering not only explicit interaction (e.g., pressing a button to adjust the light) but also implicit interaction (e.g., walking to the speaker's desk to give a talk), in order to assist the user appropriately. A task-based modeling approach is introduced that allows the implementation of the different interaction paradigms to be based on the same artifact.
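
    The idea that one task artifact serves both paradigms can be sketched as a task node whose triggers include explicit as well as implicit events; all names here are hypothetical, not the thesis's notation.

        class Task:
            """A task that can be started explicitly (pressing a button) or
            implicitly (an event inferred from sensing the environment)."""
            def __init__(self, name, triggers, action):
                self.name = name
                self.triggers = triggers   # set of events that start this task
                self.action = action

            def offer(self, event):
                if event in self.triggers:
                    self.action()

        give_talk = Task(
            name="give-talk",
            triggers={"button:presentation-mode",       # explicit interaction
                      "sensor:user-at-speaker-desk"},   # implicit interaction
            action=lambda: print("dim lights, start projector"))

        # The same task artifact is driven by both interaction paradigms:
        give_talk.offer("sensor:user-at-speaker-desk")   # implicit
        give_talk.offer("button:presentation-mode")      # explicit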

    Personalized City Tours - An Extension of the OGC OpenLocation Specification

    A business trip to London last month, a day visit to Cologne next Saturday, and a romantic weekend in Paris in autumn: this example exhibits one of the central characteristics of today's tourism. People in the western hemisphere take much pleasure in frequent, repeated short-term visits to cities. Every city visitor faces the general problems of where to go and what to see in the diverse microcosm of a metropolis. This thesis presents a framework for the generation of personalized city tours as an extension of the Open Location Specification of the Open Geospatial Consortium. It is founded on context-awareness and personalization, while at the same time proposing a combined approach to allow for adaptation to the user. The framework draws on Time Geography and its algorithmic implementations to cope with the spatio-temporal constraints of a city tour. Traveling salesman problems, for which a heuristic approach is proposed, underlie the tour generation. To meet the requirements of today's distributed and heterogeneous computing environments, the tour framework comprises individual services that expose standard-compliant interfaces and allow for integration into service-oriented architectures.
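
    To make the constrained tour generation concrete, the sketch below plans a tour with a greedy nearest-neighbour heuristic under a time budget, one simple heuristic for a traveling salesman problem with a spatio-temporal constraint. The sights, speeds, and this particular heuristic are illustrative assumptions, not the thesis's algorithm.

        import math

        SIGHTS = {"museum": (0.0, 1.0), "cathedral": (1.0, 1.5), "park": (2.0, 0.5)}
        WALK_SPEED = 5.0 / 60.0    # km per minute
        VISIT_TIME = 30.0          # minutes spent at each sight

        def dist(a, b):
            return math.hypot(a[0] - b[0], a[1] - b[1])

        def plan_tour(start, budget_minutes):
            """Greedy nearest-neighbour tour respecting the visitor's time budget:
            repeatedly walk to the closest unvisited sight while time remains."""
            pos, left, tour = start, budget_minutes, []
            todo = dict(SIGHTS)
            while todo:
                name = min(todo, key=lambda n: dist(pos, todo[n]))
                cost = dist(pos, todo[name]) / WALK_SPEED + VISIT_TIME
                if cost > left:
                    break              # the next stop would not fit the budget
                left -= cost
                pos = todo.pop(name)
                tour.append(name)
            return tour

        print(plan_tour(start=(0.0, 0.0), budget_minutes=90))
        # -> ['museum', 'cathedral']  (the park no longer fits the 90-minute budget)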

    Context-based multimodal interpretation: an integrated approach to multimodal fusion and discourse processing

    This thesis is concerned with the context-based interpretation of verbal and nonverbal contributions to interactions in multimodal multiparty dialogue systems. On the basis of a detailed analysis of context-dependent multimodal discourse phenomena, a comprehensive context model is developed. This context model supports the resolution of a variety of referring and elliptical expressions as well as the processing and reactive generation of turn-taking signals and the identification of the intended addressee(s) of a contribution. A major goal of this thesis is the development of a generic component for multimodal fusion and discourse processing. Based on the integration of this component into three distinct multimodal dialogue systems, the generic applicability of the approach is shown.
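
    One of the resolution tasks named above, resolving a referring expression against the discourse context, can be sketched with a salience-ordered history; the entities and the recency-based salience rule are illustrative assumptions, not the thesis's context model.

        # Discourse context: most recently mentioned entities first.
        CONTEXT = [
            {"id": "movie-42", "type": "movie",  "title": "Solaris"},
            {"id": "actor-7",  "type": "person", "name": "Natalie Wood"},
            {"id": "movie-13", "type": "movie",  "title": "Vertigo"},
        ]

        def resolve(expression_type):
            """Resolve 'it', 'that movie', etc. to the most salient entity of a
            compatible type; salience here is simply recency of mention."""
            for entity in CONTEXT:       # salience-ordered scan
                if entity["type"] == expression_type:
                    return entity
            return None                  # unresolved: trigger a clarification question

        # "When was *it* made?"  ->  'it' prefers the most recently mentioned movie.
        print(resolve("movie"))          # -> the Solaris entry, not Vertigo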