15 research outputs found

    On the Development of Adaptive and User-Centred Interactive Multimodal Interfaces

    Multimodal systems have attracted increased attention in recent years, which has made possible important improvements in the technologies for the recognition, processing, and generation of multimodal information. However, many issues related to multimodality remain unclear, for example, the principles that make it possible to approximate human-human multimodal communication. This chapter focuses on some of the most important challenges that researchers have recently envisioned for future multimodal interfaces. It also describes current efforts to develop intelligent, adaptive, proactive, portable and affective multimodal interfaces.

    Context-based multimodal interpretation: an integrated approach to multimodal fusion and discourse processing

    This thesis is concerned with the context-based interpretation of verbal and nonverbal contributions to interactions in multimodal multiparty dialogue systems. On the basis of a detailed analysis of context-dependent multimodal discourse phenomena, a comprehensive context model is developed. This context model supports the resolution of a variety of referring and elliptical expressions as well as the processing and reactive generation of turn-taking signals and the identification of the intended addressee(s) of a contribution. A major goal of this thesis is the development of a generic component for multimodal fusion and discourse processing. Based on the integration of this component into three distinct multimodal dialogue systems, the generic applicability of the approach is shown.
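    The kind of context-dependent reference resolution described above can be sketched in a few lines. The following fragment is an illustrative simplification, not the thesis's actual component: the entity names, the salience list, and the `ContextModel` class are hypothetical. It shows how a context model might resolve a deictic expression either from an accompanying pointing gesture or from discourse salience:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ContextModel:
    """Toy discourse context: entities ranked by salience (most salient
    first), plus the entity currently indicated by a pointing gesture."""
    salient_entities: List[str] = field(default_factory=list)
    pointed_at: Optional[str] = None

    def resolve(self, expression: str) -> Optional[str]:
        # A concurrent pointing gesture overrides purely verbal salience.
        if expression in ("this", "that") and self.pointed_at:
            return self.pointed_at
        # Otherwise fall back to the most salient discourse entity.
        return self.salient_entities[0] if self.salient_entities else None

ctx = ContextModel(salient_entities=["lamp_2", "table_1"], pointed_at="chair_3")
assert ctx.resolve("this") == "chair_3"  # gesture-supported deixis
assert ctx.resolve("it") == "lamp_2"     # anaphora resolved via salience
```

    A real component would additionally track addressees and turn-taking signals, as the abstract notes; the gesture-beats-salience ordering here is only one plausible policy.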

    A flexible and reusable framework for dialogue and action management in multi-party discourse

    This thesis describes a model for goal-directed dialogue and activity control in real time for multiple conversation participants, who can be human users or virtual characters in multimodal dialogue systems, and a framework implementing the model. It is concerned with two genres: task-oriented systems and interactive narratives. The model is based on a representation of participant behavior on three hierarchical levels: dialogue acts, dialogue games, and activities. Dialogue games make it possible to exploit social conventions and obligations to model the basic structure of dialogues. The interactions can be specified and implemented using recurring elementary building blocks. Expectations about the future behavior of other participants are derived from the state of active dialogue games; this can be useful for, e.g., input disambiguation. The knowledge base of the system is defined in an ontological format and allows individual knowledge and personal traits for the characters. The Conversational Behavior Generation Framework implements the model. It coordinates a set of conversational dialogue engines (CDEs), where each participant is represented by one CDE. The virtual characters can act autonomously, or semi-autonomously follow goals assigned by an external story module (Narrative Mode). The framework allows combining alternative specification methods for the virtual characters' activities (implementation in a general-purpose programming language, by plan operators, or in the specification language Lisa that was developed for the model). The practical viability of the framework was tested and demonstrated via the realization of three systems with different purposes and scope.
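    The idea of deriving expectations from active dialogue games and using them for input disambiguation can be illustrated with a minimal sketch. The game table and act names below are hypothetical, not taken from the thesis or its Lisa language:

```python
# Hypothetical dialogue-game table: each initiating act licenses a set of
# conventionally expected response acts (social conventions and obligations).
DIALOGUE_GAMES = {
    "question": ["answer", "clarification_request"],
    "offer": ["accept", "reject"],
    "greeting": ["greeting"],
}

def expected_next_acts(active_game: str) -> list:
    """Expectations derived from the state of the active dialogue game."""
    return DIALOGUE_GAMES.get(active_game, [])

def disambiguate(hypotheses: list, active_game: str) -> str:
    """Prefer the input hypothesis that matches a conventional expectation;
    fall back to the recognizer's top hypothesis otherwise."""
    expected = expected_next_acts(active_game)
    for hyp in hypotheses:
        if hyp in expected:
            return hyp
    return hypotheses[0]

# Ambiguous input during an open question is read as an answer.
assert disambiguate(["greeting", "answer"], "question") == "answer"
# With no active game, the top recognition hypothesis wins.
assert disambiguate(["greeting", "answer"], "none") == "greeting"
```

    The point of the sketch is the data flow: game state produces expectations, and expectations bias interpretation of the next contribution.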

    Multimodal interaction with mobile devices: fusing a broad spectrum of modality combinations

    This dissertation presents a multimodal architecture for use in mobile scenarios such as shopping and navigation. It also analyses a wide range of feasible modality input combinations for these contexts. For this purpose, two interlinked demonstrators were designed for stand-alone use on mobile devices. Of particular importance was the design and implementation of a modality fusion module capable of combining input from a range of communication modes such as speech, handwriting, and gesture. The implementation is able to account for confidence value biases arising within and between modalities and also provides a method for resolving semantically overlapped input. Tangible interaction with real-world objects and symmetric multimodality are two further themes addressed in this work. The work concludes with the results of two usability field studies that provide insight into user preference and modality intuition for different modality combinations, as well as user acceptance of anthropomorphized objects.
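    Late fusion with per-modality confidence bias correction, as mentioned in the abstract, can be sketched as a weighted vote over semantically equivalent hypotheses. This is an illustrative simplification under assumed bias factors, not the dissertation's actual fusion module:

```python
def fuse(inputs, bias):
    """Combine hypotheses from several modalities into one interpretation.
    inputs: {modality: (hypothesis, confidence)}
    bias:   per-modality scale correcting systematic over-/under-confidence
            of the individual recognizers."""
    scores = {}
    for modality, (hyp, conf) in inputs.items():
        scores[hyp] = scores.get(hyp, 0.0) + conf * bias.get(modality, 1.0)
    return max(scores, key=scores.get)

inputs = {
    "speech": ("select_red_shirt", 0.6),
    "gesture": ("select_blue_shirt", 0.7),
    "handwriting": ("select_red_shirt", 0.4),
}
# Assumed correction: the speech recognizer tends to report inflated scores.
bias = {"speech": 0.8, "gesture": 1.0, "handwriting": 0.9}
assert fuse(inputs, bias) == "select_red_shirt"  # 0.84 beats 0.70
```

    Two weaker but agreeing modalities outvote one stronger dissenting one, which is the basic rationale for cross-modal bias handling.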

    PRESTK: situation-aware presentation of messages and infotainment content for drivers

    The number of in-car information systems has increased dramatically over the last few years. These potentially mutually independent information systems present information to the driver and increase the risk of driver distraction. In a first step, orchestrating these information systems using techniques from scheduling and presentation planning avoids conflicts when they compete for scarce resources such as screen space. In a second step, the cognitive capacity of the driver, another scarce resource, has to be considered. For the first step, an algorithm fulfilling the requirements of this situation is presented and evaluated. For the second step, I define the concept of System Situation Awareness (SSA) as an extension of Endsley's Situation Awareness (SA) model. I claim that not only the driver needs to know what is happening in the environment, but also the system, e.g., the car. In order to achieve SSA, two paths of research have to be followed: (1) assessment of the cognitive load of the driver in an unobtrusive way; I propose to estimate this value using a model based on environmental data. (2) Development of a model of the cognitive complexity induced by messages presented by the system. Three experiments support the claims I make in my conceptual contribution to this field. A prototypical implementation of the situation-aware presentation management toolkit PRESTK is presented and shown in two demonstrators.
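    Orchestrating competing presentations for a scarce resource like screen space can be sketched as priority-driven admission. The greedy policy, message names, and slot counts below are assumptions for illustration; PRESTK's actual scheduling algorithm is more elaborate:

```python
import heapq

def schedule(messages, slots):
    """Greedy orchestration sketch: admit the highest-priority messages
    that still fit into the limited number of screen slots; defer the rest.
    messages: list of (priority, name, slots_needed), higher priority wins."""
    queue = [(-priority, name, need) for priority, name, need in messages]
    heapq.heapify(queue)  # min-heap on negated priority = max-heap
    shown, deferred = [], []
    while queue:
        _, name, need = heapq.heappop(queue)
        if need <= slots:
            shown.append(name)
            slots -= need
        else:
            deferred.append(name)
    return shown, deferred

msgs = [(3, "nav_hint", 1), (9, "collision_warning", 2), (5, "sms_preview", 2)]
shown, deferred = schedule(msgs, slots=3)
assert shown == ["collision_warning", "nav_hint"]
assert deferred == ["sms_preview"]
```

    The second step of the thesis would then treat the driver's cognitive capacity as a further "slot budget" constraining what may be shown at all.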

    Multimodal Reference


    Multimodal information presentation for high-load human computer interaction

    This dissertation addresses the question: given an application and an interaction context, how can interfaces present information to users in a way that improves the quality of interaction (e.g. better user performance, lower cognitive demand, and greater user satisfaction)? Information presentation is critical to the quality of interaction because it guides, constrains and even determines cognitive behavior. A good presentation is particularly desirable in high-load human computer interactions, such as when users are under time pressure, under stress, or multi-tasking. Under a high mental workload, users may not have the spare cognitive capacity to cope with the unnecessary workload induced by a bad presentation. In this dissertation work, the major presentation factor of interest is modality. We have conducted theoretical studies in the cognitive psychology domain in order to understand the role of presentation modality in different stages of human information processing. Based on this theoretical guidance, we have conducted a series of user studies investigating the effect of information presentation (modality and other factors) in several high-load task settings. The two task domains are crisis management and driving. Using crisis scenarios, we investigated how to present information to facilitate time-limited visual search and time-limited decision making. In the driving domain, we investigated how to present highly urgent danger warnings and how to present informative cues that help drivers manage their attention between multiple tasks. The outcomes of this dissertation work have useful implications for the design of cognitively compatible user interfaces, and are not limited to high-load applications.
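    A core intuition behind workload-aware modality selection is to avoid the perceptual channel the primary task already saturates. The rule and the load numbers below are a deliberately crude illustration of that intuition, not the dissertation's model:

```python
def choose_modality(visual_load: float, auditory_load: float) -> str:
    """Pick the presentation channel that competes least with the ongoing
    task: if the user's eyes are busier than their ears, speak the message;
    otherwise display it. Loads are assumed to be normalized to [0, 1]."""
    return "auditory" if visual_load > auditory_load else "visual"

# Driving is visually demanding, so an urgent warning should be spoken.
assert choose_modality(visual_load=0.9, auditory_load=0.2) == "auditory"
# In a quiet desk setting, a visual cue is less disruptive.
assert choose_modality(visual_load=0.1, auditory_load=0.6) == "visual"
```

    Real designs weigh many more factors (urgency, message complexity, interference between stages of processing), but the channel-competition trade-off is the starting point.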

    Biometric technologies for ambient intelligence

    Ambient Intelligence (AmI) refers to an environment capable of recognizing and responding to the presence of different individuals in a seamless, unobtrusive and often invisible way. In this environment, people are surrounded by intelligent, intuitive interfaces that are embedded in all kinds of objects. The goals of AmI are to provide greater user-friendliness, more efficient service support, user empowerment, and support for human interactions. Examples of AmI scenarios are smart cities, smart homes, smart offices, and smart hospitals. In AmI, biometric technologies represent enabling technologies for designing personalized services for individuals or groups of people. Biometrics is the science of establishing the identity of an individual or a class of people based on the physical or behavioral attributes of the person. Common applications include security checks, border controls, access control to physical places, and authentication to electronic devices. In AmI, biometric technologies should work in uncontrolled and less constrained conditions with respect to traditional biometric technologies. Furthermore, in many application scenarios, it could be required to adopt covert and non-cooperative technologies. In these non-ideal conditions, the biometric samples frequently present poor quality, and state-of-the-art biometric technologies can obtain unsatisfactory performance.
    There are two possible ways to improve the applicability and diffusion of biometric technologies in AmI. The first consists in designing novel biometric technologies robust to samples acquired in noisy and non-ideal conditions. The second consists in designing novel multimodal biometric approaches that are able to take advantage of all the sensors placed in a generic environment in order to achieve high recognition accuracy and to permit continuous or periodic authentication in an unobtrusive manner. The first goal of this thesis is to design innovative, less constrained biometric systems that are able to improve the quality of human-machine interaction in different AmI environments with respect to state-of-the-art technologies. The second goal is to design novel approaches to improve the applicability and integration of heterogeneous biometric systems in AmI. In particular, the thesis considers technologies based on fingerprint, face, voice, and multimodal biometrics. This thesis presents the following innovative research studies: • a method for text-independent speaker identification in AmI applications; • a method for age estimation from non-ideal samples acquired in AmI scenarios; • a privacy-compliant cohort normalization technique to increase the accuracy of already deployed biometric systems; • a technology-independent multimodal fusion approach to combine heterogeneous traits in AmI scenarios; • a multimodal continuous authentication approach for AmI applications. The designed novel biometric technologies have been tested on different biometric datasets (both public and collected in our laboratory) simulating the acquisitions performed in AmI applications. Results proved the feasibility of the studied approaches and showed that the studied methods effectively increased the accuracy, applicability, and usability of biometric technologies in AmI with respect to the state of the art.
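    Technology-independent, score-level multimodal fusion of the kind listed above is commonly built on score normalization followed by a weighted combination. The sketch below assumes min-max normalization and equal weights; the trait names and score values are invented for illustration and do not come from the thesis:

```python
def min_max(scores):
    """Normalize one matcher's scores to [0, 1] (min-max normalization)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]

def fuse_scores(per_trait_scores, weights):
    """Score-level fusion across heterogeneous traits: normalize each
    trait's scores so their ranges are comparable, then take a weighted
    sum per candidate identity."""
    n = len(next(iter(per_trait_scores.values())))
    fused = [0.0] * n
    for trait, scores in per_trait_scores.items():
        for i, s in enumerate(min_max(scores)):
            fused[i] += weights[trait] * s
    return fused

# Two matchers on incompatible scales agree that candidate 1 matches best.
scores = {"face": [0.2, 0.9, 0.5], "voice": [10.0, 40.0, 35.0]}
fused = fuse_scores(scores, {"face": 0.5, "voice": 0.5})
assert fused.index(max(fused)) == 1
```

    Normalization is what makes the approach technology-independent: raw face and voice scores live on different scales and cannot be summed directly. A cohort-based variant would normalize each score against a group of similar samples instead of the global min and max.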