63 research outputs found

    Context-based multimodal interpretation : an integrated approach to multimodal fusion and discourse processing

    This thesis is concerned with the context-based interpretation of verbal and nonverbal contributions to interactions in multimodal multiparty dialogue systems. On the basis of a detailed analysis of context-dependent multimodal discourse phenomena, a comprehensive context model is developed. This context model supports the resolution of a variety of referring and elliptical expressions as well as the processing and reactive generation of turn-taking signals and the identification of the intended addressee(s) of a contribution. A major goal of this thesis is the development of a generic component for multimodal fusion and discourse processing. Based on the integration of this component into three distinct multimodal dialogue systems, the generic applicability of the approach is shown.

    Towards multi-domain speech understanding with flexible and dynamic vocabulary

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2001. Includes bibliographical references (p. 201-208). In developing telephone-based conversational systems, we foresee future systems capable of supporting multiple domains and flexible vocabulary. Users can pursue several topics of interest within a single telephone call, and the system is able to switch transparently among domains within a single dialog. This system is able to detect the presence of any out-of-vocabulary (OOV) words, and automatically hypothesizes the pronunciation, spelling and meaning of each. These can be confirmed with the user, and the new words are subsequently incorporated into the recognizer lexicon for future use. This thesis describes our work towards realizing such a vision, using a multi-stage architecture. Our work is focused on organizing the application of linguistic constraints in order to accommodate multiple domain topics and dynamic vocabulary at the spoken input. The philosophy is to apply only below word-level linguistic knowledge at the initial stage. Such knowledge is domain-independent and general to all of the English language. Hence, it is broad enough to support any unknown words that may appear at the input, as well as input from several topic domains. At the same time, the initial pass narrows the search space for the next stage, where domain-specific knowledge that resides at the word level or above is applied. In the second stage, we envision several parallel recognizers, each with higher-order language models tailored specifically to its domain. A final decision algorithm selects a final hypothesis from the set of parallel recognizers. Part of our contribution is the development of a novel first stage which attempts to maximize linguistic constraints, using only below word-level information.
The goals are to prevent sequences of unknown words from being pruned away prematurely while maintaining performance on in-vocabulary items, and to reduce the search space for later stages. Our solution coordinates the application of various subword-level knowledge sources. The recognizer lexicon is implemented with an inventory of linguistically motivated units called morphs, which are syllables augmented with spelling and word position. This first stage is designed to output a phonetic network so that we are not committed to the initial hypotheses. This adds robustness, as later stages can propose words directly from phones. To maximize performance on the first stage, much of our focus has centered on the integration of a set of hierarchical sublexical models into this first pass. To do this, we utilize the ANGIE framework, which supports a trainable context-free grammar and is designed to acquire subword-level and phonological information statistically. Its models can generalize knowledge about word structure, learned from in-vocabulary data, to previously unseen words. We explore methods for collapsing the ANGIE models into a finite-state transducer (FST) representation, which enables these complex models to be efficiently integrated into recognition. The ANGIE-FST needs to encapsulate the hierarchical knowledge of ANGIE and replicate ANGIE's ability to support previously unobserved phonetic sequences ... by Grace Chung. Ph.D.

    Adaptive Cognitive Interaction Systems

    Adaptive cognitive interaction systems observe and model the state of their user and adapt the system's behavior accordingly. Such a system consists of three components: the empirical cognitive model, the computational cognitive model, and the adaptive interaction manager. This thesis contains numerous contributions to the development of these components and to their combination. The results are validated in numerous user studies.

    A Generic and Visual Interfacing Framework for Bridging the Interface between Application Systems and Recognizers

    Application systems that utilize recognition technologies such as speech, gesture, and color recognition provide human-machine interfacing to users who are physically unable to interact with computers through traditional input devices such as a mouse or keyboard. Current solutions for interfacing application systems with recognizers, however, are ad hoc and lack a generic and systematic method. The common approach is to interface with recognizers through low-level programmed wrappers that are application dependent and require detailed knowledge of the system design and of programming, both to perform the interfacing and to modify it. Thus, a generic and systematic approach to bridging the interface between recognizers and application systems is needed. In this research work, we provide a generic and visual interfacing framework for bridging the interface between application systems and recognizers through the application system’s front end, applying a visual level interfacing without requiring the detailed system design and programming knowledge, allowing for modifications to an interfacing environment to be made on the fly and more importantly allowing the interfacing wit

    Constructing a low-cost, open-source, VoiceXML

    Voice-enabled applications, applications that interact with a user via an audio channel, are used extensively today. Their use is growing as speech-related technologies improve, since speech is one of the most natural methods of interaction. They can provide customer support as IVRs, can be used as an assistive technology, or can become an aural interface to the Internet. Given that the telephone is used extensively throughout the globe, the number of potential users of voice-enabled applications is very high. VoiceXML is a popular, open, high-level, standard means of creating voice-enabled applications which was designed to bring the benefits of web-based development to services. While VoiceXML is an ideal language for creating these applications, VoiceXML gateways, the hardware and software responsible for interpreting VoiceXML applications and interfacing with the PSTN, are still expensive, and so there is a need for a low-cost gateway. Asterisk, an open-source TDM/VoIP telephony platform, can be used as a low-cost PSTN interface. This thesis investigates adding a VoiceXML service to Asterisk, creating a low-cost VoiceXML prototype gateway which is able to render voice-enabled applications. Following the Component-Based Software Engineering (CBSE) paradigm, the VoiceXML gateway is divided into a set of components which are sourced from the open-source community and integrated to create the gateway. The browser requires a VoiceXML interpreter (OpenVXI), a Text-To-Speech engine (Festival) and a speech recognition engine (Sphinx 4). The integration of the components results in a low-cost, open-source VoiceXML gateway. System tests show that the integration of the components was successful, and that the system can handle concurrent calls. A fully compliant version of the gateway can be used in the real world to render voice-enabled applications at a low cost.

    The European Language Resources and Technologies Forum: Shaping the Future of the Multilingual Digital Europe

    Proceedings of the 1st FLaReNet Forum on the European Language Resources and Technologies, held in Vienna, at the Austrian Academy of Sciences, on 12-13 February 2009.

    Advances in Robotics, Automation and Control

    The book presents an overview of recent developments in the different areas of Robotics, Automation and Control. Through its 24 chapters, this book presents topics related to control and robot design; it also introduces new mathematical tools and techniques devoted to improving system modeling and control. An important point is the use of rational agents and heuristic techniques to cope with the computational complexity required for controlling complex systems. The book also covers navigation and vision algorithms, automatic handwriting comprehension and speech recognition systems that will be included in the next generation of productive systems.

    Evolutionary design assistants for architecture

    In its parallel pursuit of increased competitiveness for design offices and more pleasurable and easier workflows for designers, artificial design intelligence is a technical, intellectual, and political challenge. While human-machine cooperation has become commonplace through Computer Aided Design (CAD) tools, improved collaboration and better support appear possible only through an endeavor into a kind of artificial design intelligence that is more sensitive to the human perception of affairs. Considered as part of the broader Computational Design studies, the research program of this quest can be called Artificial / Autonomous / Automated Design (AD). The currently available level of Artificial Intelligence (AI) for design is limited, and a viable aim for current AD would be to develop design assistants that are capable of producing drafts for various design tasks. Thus, the overall aim of this thesis is the development of approaches, techniques, and tools towards artificial design assistants that offer a capability for generating drafts for sub-tasks within design processes. The main technology explored for this aim is Evolutionary Computation (EC), and the target design domain is architecture. The two connected research questions of the study concern, first, the investigation of the ways to develop an architectural design assistant, and secondly, the utilization of EC for the development of such assistants. While developing approaches, techniques, and computational tools for such an assistant, the study also carries out a broad theoretical investigation into the main problems, challenges, and requirements towards such assistants on a rather overall level. Therefore, the research is shaped as a parallel investigation of three main threads interwoven along several levels, moving from a more general level to specific applications.
The three research threads comprise, first, theoretical discussions and speculations with regard to both existing literature and the proposals and applications of the thesis; secondly, proposals for descriptive and prescriptive models, mappings, summary illustrations, task structures, decomposition schemes, and integratory frameworks; and finally, experimental applications of these proposals. This tripartite progression allows an evaluation of each proposal both conceptually and practically, thereby enabling a progressive improvement of the understanding regarding the research question, while producing concrete outputs on the way. Besides theoretical and interpretative examinations, the thesis investigates its subject through a set of practical and speculative proposals, which function as both research instruments and the outputs of the study. The first main output of the study is the “design_proxy” approach (d_p), which is an integrated approach for draft-making design assistants. It is an outcome of both theoretical examinations and experimental applications, and proposes an integration of (1) flexible and relaxed task definitions and representations (instead of strict formalisms), (2) intuitive interfaces that make use of usual design media, (3) evaluation of solution proposals through their similarity to given examples, and (4) a dynamic evolutionary approach for solution generation. The design_proxy approach may be useful for AD researchers who aim at developing practical design assistants, as has been examined and demonstrated with the two applications, i.e., design_proxy.graphics and design_proxy.layout. The second main output, the “Interleaved Evolutionary Algorithm” (IEA, or Interleaved EA), is a novel evolutionary algorithm proposed and used as the underlying generative mechanism of design_proxy-based design assistants.
The Interleaved EA is a dynamic, adaptive, and multi-objective EA, in which one of the objectives leads the evolution until its fitness progression stagnates, in the sense that the settings and fitness values of this objective are used for most evolutionary decisions. In this way, the Interleaved EA enables the use of different settings and operators for each of the objectives within an overall task, which would be the same for all objectives in a regular multi-objective EA. This property gives the algorithm a modular structure, which offers an improvable method for the utilization of domain-specific knowledge for each sub-task, i.e., objective. The Interleaved EA can be used by Evolutionary Computation (EC) researchers and by practitioners who employ EC for their tasks. As a third main output, the “Architectural Stem Cells Framework” is a conceptual framework for architectural design assistants. It proposes a dynamic and multi-layered method for combining a set of design assistants for larger tasks in architectural design. The first component of the framework is a layer-based, parallel task decomposition approach, which aims at obtaining a dynamic parallelization of sub-tasks within a more complicated problem. The second component of the framework is a conception for the development mechanisms for building drafts, i.e., Architectural Stem Cells (ASC). An ASC can be conceived as a semantically marked geometric structure, which contains the information that specifies the possibilities and constraints for how an abstract building may develop from an undetailed stage to a fully developed building draft. ASCs are required for re-integrating the separated task layers of an architectural problem through solution-based development. The ASC Framework brings together many of the ideas of this thesis for a practical research agenda, and it is presented to the AD researchers in architecture.
Finally, the “design_proxy.layout” (d_p.layout) is an architectural layout design assistant based on the design_proxy approach and the IEA. The system uses a relaxed problem definition (producing draft layouts) and a flexible layout representation that permits the overlapping of design units and boundaries. User interaction with the system is carried out through intuitive 2D graphics, and the functional evaluations are performed by measuring the similarity of a proposal to existing layouts. Functioning in an integrated manner, these properties make the system a practicable and enjoyable design assistant, which was demonstrated through two workshop cases. The d_p.layout is a versatile and robust layout design assistant that can be used by architects in their design processes.
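
The leading-objective idea behind the Interleaved EA, as summarized in the abstract above, can be illustrated with a toy Python sketch. This is not the thesis's algorithm: the abstract does not give its operators or switching rule, so the truncation selection, the Gaussian mutation, the round-robin handover on stagnation, and every name here (`interleaved_ea`, `mutation_scales`, `patience`, `sphere`, `shifted`) are assumptions made for illustration only.

```python
import random


def interleaved_ea(objectives, mutation_scales, genome_len=5, pop_size=20,
                   generations=200, patience=5, seed=0):
    """Toy minimizer: one objective leads until its best fitness stagnates.

    Hypothetical sketch; the handover rule and operators are assumptions.
    """
    rng = random.Random(seed)
    pop = [[rng.uniform(-5, 5) for _ in range(genome_len)]
           for _ in range(pop_size)]
    leader, stall, best_seen = 0, 0, float("inf")
    for _ in range(generations):
        fitness = objectives[leader]     # leading objective drives selection
        scale = mutation_scales[leader]  # per-objective mutation setting
        pop.sort(key=fitness)            # ascending: lower fitness is better
        parents = pop[: pop_size // 2]   # truncation selection keeps best half
        children = [[g + rng.gauss(0, scale) for g in rng.choice(parents)]
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
        best = min(fitness(ind) for ind in pop)
        if best < best_seen - 1e-9:
            best_seen, stall = best, 0   # progress: leader keeps its role
        else:
            stall += 1
        if stall >= patience:            # stagnation: hand over leadership
            leader = (leader + 1) % len(objectives)
            best_seen, stall = float("inf"), 0
    return pop


def sphere(ind):
    """First toy objective: squared distance from the origin."""
    return sum(g * g for g in ind)


def shifted(ind):
    """Second toy objective: squared distance from the all-ones point."""
    return sum((g - 1.0) ** 2 for g in ind)


final_population = interleaved_ea([sphere, shifted], [0.5, 0.1])
```

Each objective carries its own mutation scale, which mirrors the abstract's point that settings need not be shared across objectives as they would be in a regular multi-objective EA.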