115 research outputs found

    Utilización de los sistemas de diálogo hablado para el acceso a la información en diferentes dominios

    Papers from the Second International Conference on the Digital Divide and Social Inclusion, held 28-30 October 2009 at Universidad Carlos III de Madrid. Conversation is the most natural way for people to carry out many everyday activities. For this reason, a long-standing interest within the field of Speech Technologies has been to use these technologies in real applications, especially applications that let a person use their voice to obtain information through direct interaction with a machine, or to control a given system. The goal is to have systems that make human-machine communication as natural as possible, that is, through conversation. This paper summarizes the results of applying these technologies to develop several dialogue systems in which the user and the system interact through spontaneous speech in Spanish. Their implementation has prioritized free software tools for automatic speech recognition, natural language understanding, dialogue management, and text-to-speech synthesis. The main aim of the paper is thus to present the principal advantages that dialogue systems offer for facilitating access to different services within restricted semantic domains, the possibilities that free software tools provide for implementing them, and their evaluation in several specific application cases.

    Spoken language communication with machines: The long and winding road from research to business

    Abstract. This paper traces the history of spoken language communication with computers, from the first attempts in the 1950s, through the establishment of the theoretical foundations in the 1980s, to the incremental improvement phase of the 1990s and 2000s. Then a perspective is given on the current conversational technology market and industry, with an analysis of its business value and commercial models

    CSE: U: Mixed-initiative Personal Assistant Agents

    Specification and implementation of flexible human-computer dialogs is challenging because of the complexity involved in rendering the dialog responsive to a vast number of varied paths through which users might desire to complete the dialog. To address this problem, we developed a toolkit for modeling and implementing task-based, mixed-initiative dialogs based on metaphors from lambda calculus. Our toolkit can automatically operationalize a dialog that involves multiple prompts and/or sub-dialogs, given a high-level dialog specification of it. The use of natural language with the resulting dialogs makes the flexibility in communicating user utterances commensurate with that in dialog completion paths—an aspect missing from commercial assistants like Siri. Our results demonstrate that the dialogs authored with our toolkit support the end user’s completion of a human-computer dialog in a manner that is most natural to them—in a mixed-initiative fashion—that resembles human-human interaction
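As a rough illustration of what "mixed-initiative" completion means in the abstract above, the sketch below lets a user answer prompts in any order and in any grouping. It borrows one lambda-calculus-style idea, partial application, purely as an assumption for illustration; the abstract does not say which constructs the toolkit uses, and the `Dialog` class, its methods, and the slot names are all hypothetical rather than the toolkit's API.

```python
from typing import Dict, Iterable, List, Optional


class Dialog:
    """Hypothetical slot-filling dialog that accepts answers in any order."""

    def __init__(self, slots: Iterable[str], filled: Optional[Dict[str, str]] = None):
        self.slots = tuple(slots)
        self.filled = dict(filled or {})

    def supply(self, **answers: str) -> "Dialog":
        """Return a new, more specialized dialog with some slots answered.

        This mirrors partial application: supplying some arguments yields
        a residual dialog that still waits for the remaining ones.
        """
        unknown = set(answers) - set(self.slots)
        if unknown:
            raise KeyError(f"unknown slots: {sorted(unknown)}")
        return Dialog(self.slots, {**self.filled, **answers})

    def remaining(self) -> List[str]:
        """Slots the system still has to prompt for, in specification order."""
        return [s for s in self.slots if s not in self.filled]

    def complete(self) -> bool:
        return not self.remaining()


# Hypothetical pizza-order dialog with three prompts.
order = Dialog(["size", "crust", "topping"])

# The user takes the initiative and answers the second prompt first...
after_crust = order.supply(crust="thin")

# ...then answers the two remaining prompts in a single utterance.
done = after_crust.supply(size="large", topping="olive")
```

Because `supply` returns a new object instead of mutating the old one, every intermediate dialog state stays available, which makes it easy to enumerate or test the different completion paths.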

    Interaktive Dialogsysteme

    The topic of this thesis can be summarized in three main questions: (1) what is dialogue, (2) what are the characteristic features of dialogue, and (3) what is required to simulate dialogue successfully with machines (computer systems). The thesis was written in the hope that it can serve as a basis and starting point for further research in these fields, and that it will draw the attention of linguists and scholars of German studies to research perspectives and challenges in this area. It also shows how far the theoretical foundations concerning human-to-human forms of dialogue can be implemented in interactive dialogue systems, and what additional means are needed to make such systems more human-like in dialogue communication. The question of interactive dialogue systems is raised above all for pragmatic reasons: as computer technology develops, the range of uses for computers keeps growing. As a result, many more people, especially non-experts, work with computers. Yet apart from the invention and introduction of the mouse in the 1960s/1970s, communication with the computer still takes place mainly via the console (keyboard/monitor). Human dialogue competence is characterized by the ability to recognize and apply typical organizational principles of dialogue. These principles concern, on the one hand, human cognitive abilities and, on the other, various formal aspects. A dialogue system should therefore be able to imitate this human dialogue competence in different communicative situations.
For precisely this reason, this diploma thesis (Diplomarbeit) centers on dialogue as a means of communication on the one hand, and on the question of how, and to what extent, a spoken dialogue with a computer system would be possible on the other.

    Voice Operated Information System in Slovak

    Speech communication interfaces (SCI) are now widely used in several domains. Automated spoken-language human-computer interaction can replace human-human interaction where needed. Automatic speech recognition (ASR), a key technology of SCI, has been studied extensively over the past few decades. Most present systems are based on statistical modeling at both the acoustic and linguistic levels. Speech recognition in adverse conditions has recently received increased attention, since noise resistance has become one of the major bottlenecks for the practical use of speech recognizers. Although many techniques have been developed, many challenges remain before the ultimate goal, creating machines capable of communicating with humans naturally, can be achieved. In this paper we describe the research and development of the first Slovak spoken language dialogue system (SLDS). The dialogue system is based on the DARPA Communicator architecture. The proposed system consists of the Galaxy hub and telephony, automatic speech recognition, text-to-speech, backend, transport and VoiceXML dialogue management modules. The SCI enables multi-user interaction in the Slovak language. The functionality of the SLDS is demonstrated and tested via two pilot applications, "Weather forecast for Slovakia" and "Timetable of Slovak Railways". The required information is retrieved from Internet resources in multi-user mode through PSTN, ISDN, GSM and/or VoIP networks.

    Generative Goal-driven User Simulation for Dialog Management

    User simulation is frequently used to train statistical dialog managers for task-oriented domains. At present, goal-driven simulators (those that have a persistent notion of what they wish to achieve in the dialog) require some task-specific engineering, making them impossible to evaluate intrinsically. Instead, they have been evaluated extrinsically by means of the dialog managers they are intended to train, leading to circularity of argument. In this paper, we propose the first fully generative goal-driven simulator that is fully induced from data, without hand-crafting or goal annotation. Our goals are latent, and take the form of topics in a topic model, clustering together semantically equivalent and phonetically confusable strings, implicitly modelling synonymy and speech recognition noise. We evaluate on two standard dialog resources, the Communicator and Let’s Go datasets, and demonstrate that our model has substantially better fit to held out data than competing approaches. We also show that features derived from our model allow significantly greater improvement over a baseline at distinguishing real from randomly permuted dialogs.

    Generative probabilistic models of goal-directed users in task-oriented dialogs

    A longstanding objective of human-computer interaction research is to develop better dialog systems for end users. User modelling research, specifically, aims to provide dialog researchers with models of user behaviour to aid the design and improvement of dialog systems. Where dialog systems are commercially deployed, they are often used by a vast number of users, so sub-optimal performance could lead to an immediate financial loss for the service provider, and even to user alienation. Thus, there is a strong incentive to make dialog systems as functional as possible immediately, and crucially prior to their release to the public. Models of user behaviour fill this gap by simulating the role of human users in the lab, without the losses associated with sub-optimal system performance. User models can also greatly aid design decisions by serving as tools for exploratory analysis of real user behaviour prior to designing dialog software. User modelling is the central problem of this thesis. We focus on a particular kind of dialog termed task-oriented dialog (dialog centred around solving an explicit task) because it represents the frontier of current dialog research and commercial deployment. Users taking part in these dialogs behave according to a set of user goals, which specify what they wish to accomplish from the interaction, and tend to exhibit variability of behaviour given the same set of goals. Our objective is to capture and reproduce (at the semantic utterance level) the range of behaviour that users exhibit while being consistent with their goals. We approach the problem as an instance of generative probabilistic modelling, with explicit user goals, and induced entirely from data. We argue that doing so has numerous practical and theoretical benefits over previous approaches to user modelling, which have either lacked a model of user goals or have not been driven by real dialog data.
A principal problem with user modelling development thus far has been the difficulty of evaluation. We demonstrate how treating user models as probabilistic models alleviates some of these problems through the ability to leverage a whole raft of techniques and insights from machine learning for evaluation. We demonstrate the efficacy of our approach by applying it to two different kinds of task-oriented dialog domains, which exhibit two different sub-problems encountered in real dialog corpora. The first are informational (or slot-filling) domains, specifically those concerning flight and bus route information. In slot-filling domains, user goals take categorical values which allow multiple surface realisations, and are corrupted by speech recognition errors. We address this issue by adopting a topic model representation of user goals which allows us to capture both synonymy and phonetic confusability in a unified model. We first evaluate our model intrinsically using held-out probability and perplexity, and demonstrate substantial gains over an alternative string-goal representation, and over a non-goal-directed model. We then show in an extrinsic evaluation that features derived from our model lead to substantial improvements over a strong baseline in the task of discriminating between real dialogs (consistent dialogs) and dialogs comprised of real turns sampled from different dialogs (inconsistent dialogs). We then move on to a spatial navigational domain in which user goals are spatial trajectories across a landscape. The disparity between the representation of spatial routes as raw pixel coordinates and their grounding as semantic utterances creates an interesting challenge compared to conventional slot-filling domains. We derive a feature-based representation of spatial goals which facilitates reasoning and admits generalisation to new routes not encountered at training time.
The probabilistic formulation of our model allows us to capture variability of behaviour given the same underlying goal, a property frequently exhibited by human users in the domain. We first evaluate intrinsically using held-out probability and perplexity, and find a substantial reduction in uncertainty brought by our spatial representation. We further evaluate extrinsically in a human judgement task and find that our model’s behaviour does not differ significantly from the behaviour of real users. We conclude by sketching two novel ideas for future work: the first is to deploy the user models as transition functions for MDP-based dialog managers; the second is to use the models as a means of restricting the search space for optimal policies, by treating optimal behaviour as a subset of the (distributions over) plausible behaviour which we have induced
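
The intrinsic evaluation named in the abstract above, held-out probability and perplexity, can be illustrated with a minimal sketch. This is the standard textbook definition of perplexity (the exponential of the average negative log-probability per held-out event), not code from the thesis; the function name and the toy probabilities are assumptions made for illustration.

```python
import math
from typing import Sequence


def perplexity(log_probs: Sequence[float]) -> float:
    """Perplexity from natural-log probabilities of held-out events.

    Each entry is log p(event) under the model being evaluated.
    Lower perplexity means the model is less "surprised" by unseen data.
    """
    if not log_probs:
        raise ValueError("need at least one held-out event")
    avg_neg_log_lik = -sum(log_probs) / len(log_probs)
    return math.exp(avg_neg_log_lik)


# Hypothetical data: two models score the same four held-out user acts.
uniform_model = [math.log(0.25)] * 4   # guesses uniformly among 4 acts
sharper_model = [math.log(0.5)] * 4    # puts more mass on what users did

ppl_uniform = perplexity(uniform_model)   # about 4: as uncertain as a 4-way guess
ppl_sharper = perplexity(sharper_model)   # about 2: a better fit to held-out data
```

A model that always guessed uniformly among k options has perplexity k, which is why a drop in perplexity is read as a reduction in uncertainty.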

    Information Presentation in Spoken Dialogue Systems

    To tackle the problem of presenting a large number of options in spoken dialogue systems, we identify compelling options based on a model of user preferences, and present tradeoffs between alternative options explicitly. Multiple attractive options are structured such that the user can gradually refine her request to find the optimal tradeoff. We show that our approach presents complex tradeoffs understandably, increases overall user satisfaction, and significantly improves the user's overview of the available options. Moreover, our results suggest that presenting users with a brief summary of the irrelevant options increases users' confidence in having heard about all relevant options