
    Use of spoken dialogue systems for access to information in different domains (Utilización de los sistemas de diálogo hablado para el acceso a la información en diferentes dominios)

    Proceedings of the Second International Conference on the Digital Divide and Social Inclusion, held 28-30 October 2009 at Universidad Carlos III de Madrid. Conversation is the most natural way for people to carry out a great many everyday tasks. For this reason, a long-standing interest within the field of Speech Technologies has been to use these technologies in real applications, especially applications that let a person use their voice to obtain information by interacting directly with a machine, or to control a given system. The goal is to have systems that make human-machine communication as natural as possible, that is, through conversation. This paper summarizes the results of applying these technologies to develop several dialogue systems in which the user and the system interact through spontaneous speech in Spanish. Their implementation has prioritized free software tools for automatic speech recognition, natural language understanding, dialogue management, and text-to-speech synthesis. The main aim of the paper is therefore to present the principal advantages of dialogue systems for easing access to different services within restricted semantic domains, the possibilities that free software tools offer for implementing them, and their evaluation in several concrete application cases.

    Speech Standards: Lessons Learnt

    During the past decades, the landscape of speech and DTMF applications has changed from being based on proprietary platforms to being completely based on speech standards. The W3C Voice Browser Working Group (VBWG) played a primary role in this change. This chapter describes that change, highlights the standards created by the W3C VBWG, and discusses the benefits that these standards have provided in many other application fields, including multi-modal interfaces.

    Towards a Neural Era in Dialogue Management for Collaboration: A Literature Survey

    Dialogue-based human-AI collaboration can revolutionize collaborative problem-solving, creative exploration, and social support. To realize this goal, the development of automated agents proficient in skills such as negotiating, following instructions, establishing common ground, and progressing shared tasks is essential. This survey begins by reviewing the evolution of dialogue management paradigms in collaborative dialogue systems, from traditional handcrafted and information-state based methods to AI planning-inspired approaches. It then shifts focus to contemporary data-driven dialogue management techniques, which seek to transfer deep learning successes from form-filling and open-domain settings to collaborative contexts. The paper proceeds to analyze a selected set of recent works that apply neural approaches to collaborative dialogue management, spotlighting prevailing trends in the field. This survey hopes to provide foundational background for future advancements in collaborative dialogue management, particularly as the dialogue systems community continues to embrace the potential of large language models

    A computational framework for mixed-initiative dialog modeling.

    Chan, Shuk Fong. Thesis (M.Phil.)--Chinese University of Hong Kong, 2002. Includes bibliographical references (leaves 114-122). Abstracts in English and Chinese.
    Contents:
    Chapter 1. Introduction
        1.1 Overview
        1.2 Thesis Contributions
        1.3 Thesis Outline
    Chapter 2. Background
        2.1 Mixed-Initiative Interactions
        2.2 Mixed-Initiative Spoken Dialog Systems
            2.2.1 Finite-state Networks
            2.2.2 Form-based Approaches
            2.2.3 Sequential Decision Approaches
            2.2.4 Machine Learning Approaches
        2.3 Understanding Mixed-Initiative Dialogs
        2.4 Cooperative Response Generation
            2.4.1 Plan-based Approach
            2.4.2 Constraint-based Approach
        2.5 Chapter Summary
    Chapter 3. Mixed-Initiative Dialog Management in the ISIS system
        3.1 The ISIS Domain
            3.1.1 System Overview
            3.1.2 Domain-Specific Constraints
        3.2 Discourse and Dialog
            3.2.1 Discourse Inheritance
            3.2.2 Mixed-Initiative Dialogs
        3.3 Challenges and New Directions
            3.3.1 A Learning System
            3.3.2 Combining Interaction and Delegation Subdialogs
        3.4 Chapter Summary
    Chapter 4. Understanding Mixed-Initiative Human-Human Dialogs
        4.1 The CU Restaurants Domain
        4.2 Task Goals, Dialog Acts, Categories and Annotation
            4.2.1 Task Goals and Dialog Acts
            4.2.2 Semantic and Syntactic Categories
            4.2.3 Annotating the Training Sentences
        4.3 Selective Inheritance Strategy
            4.3.1 Category Inheritance Rules
            4.3.2 Category Refresh Rules
        4.4 Task Goal and Dialog Act Identification
            4.4.1 Belief Networks Development
            4.4.2 Varying the Input Dimensionality
            4.4.3 Evaluation
        4.5 Procedure for Discourse Inheritance
        4.6 Chapter Summary
    Chapter 5. Cooperative Response Generation in Mixed-Initiative Dialog Modeling
        5.1 System Overview
            5.1.1 State Space Generation
            5.1.2 Task Goal and Dialog Act Generation for System Response
            5.1.3 Response Frame Generation
            5.1.4 Text Generation
        5.2 Experiments and Results
            5.2.1 Subjective Results
            5.2.2 Objective Results
        5.3 Chapter Summary
    Chapter 6. Conclusions
        6.1 Summary
        6.2 Contributions
        6.3 Future Work
    Bibliography
    Appendix A. Domain-Specific Task Goals in CU Restaurants Domain
    Appendix B. Full list of VERBMOBIL-2 Dialog Acts
    Appendix C. Dialog Acts for Customer Requests and Waiter Responses in CU Restaurants Domain
    Appendix D. The Two Grammars for Task Goal and Dialog Act Identification
    Appendix E. Category Inheritance Rules
    Appendix F. Category Refresh Rules
    Appendix G. Full list of Response Trigger Words
    Appendix H. Evaluation Test Questionnaire for Dialog System in CU Restaurants Domain
    Appendix I. Details of the statistical testing Regarding Grice's Maxims and User Satisfaction

    Development and evaluation of different methodologies for automatic dialogue management (Desarrollo y evaluación de diferentes metodologías para la gestión automática del diálogo)

    The main objective of this thesis is the study and development of different methodologies for dialogue management in spoken dialogue systems. The main challenge lies in developing purely statistical methodologies for dialogue management, based on learning a model from a corpus of labeled dialogues. In this area, different approaches are presented for performing dialogue management, improving the statistical model, and evaluating the dialogue system. Putting these methodologies into practice for a specific task required acquiring and labeling a corpus of dialogues. Having a large dialogue corpus available made it easier to train and evaluate the dialogue management model that was developed. A complete dialogue system has also been implemented, which makes it possible to evaluate how the management methodologies work in practice under real conditions of use. Several approaches are proposed for evaluating the dialogue management techniques: evaluation with real users; evaluation with the acquired corpus, for which training and test partitions were defined; and the use of user simulation techniques. The user simulator that was developed allows the complete dialogue process to be modeled statistically. In the approach presented, both obtaining the system response and generating the user turn are modeled as a classification problem: the input encodes a set of variables representing the current state of the dialogue, and the result of the classification is the probability of selecting each of the responses (sequences of dialogue acts) defined for the user and for the system, respectively.
    Griol Barres, D. (2007). Desarrollo y evaluación de diferentes metodologías para la gestión automática del diálogo [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/1956
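The classification view of dialogue management described in this abstract can be illustrated with a small sketch. It is not the thesis's model: the state encoding (three binary flags), the response labels, the toy corpus, and the relative-frequency estimator are all assumptions made for illustration. The sketch only shows the general shape of the idea, in which an encoded dialogue state goes in and a probability for each candidate system response comes out.

```python
from collections import Counter, defaultdict

# Hypothetical labeled corpus: (dialogue state, system response) pairs.
# The state is an assumed encoding: (origin_known, destination_known, date_known).
CORPUS = [
    ((0, 0, 0), "ask_origin"),
    ((1, 0, 0), "ask_destination"),
    ((1, 0, 0), "ask_destination"),
    ((1, 1, 0), "ask_date"),
    ((1, 1, 0), "confirm_destination"),
    ((1, 1, 1), "give_timetable"),
]


def train(corpus):
    """Count how often each response follows each encoded state."""
    counts = defaultdict(Counter)
    for state, response in corpus:
        counts[state][response] += 1
    return counts


def response_probabilities(model, state):
    """Return relative-frequency probabilities of each response for a state."""
    seen = model.get(state)
    if not seen:
        return {}  # state never observed in the corpus
    total = sum(seen.values())
    return {response: n / total for response, n in seen.items()}


def select_response(model, state):
    """Pick the most probable response, or None for an unseen state."""
    probs = response_probabilities(model, state)
    return max(probs, key=probs.get) if probs else None


MODEL = train(CORPUS)
```

A real statistical dialogue manager would replace the lookup table with a classifier that generalizes to unseen states; the table is used here only to keep the example short.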

    Semi-automatic acquisition of domain-specific semantic structures.

    Siu, Kai-Chung. Thesis (M.Phil.)--Chinese University of Hong Kong, 2000. Includes bibliographical references (leaves 99-106). Abstracts in English and Chinese.
    Contents:
    Chapter 1. Introduction
        1.1 Thesis Outline
    Chapter 2. Background
        2.1 Natural Language Understanding
            2.1.1 Rule-based Approaches
            2.1.2 Stochastic Approaches
            2.1.3 Phrase-Spotting Approaches
        2.2 Grammar Induction
            2.2.1 Semantic Classification Trees
            2.2.2 Simulated Annealing
            2.2.3 Bayesian Grammar Induction
            2.2.4 Statistical Grammar Induction
        2.3 Machine Translation
            2.3.1 Rule-based Approach
            2.3.2 Statistical Approach
            2.3.3 Example-based Approach
            2.3.4 Knowledge-based Approach
            2.3.5 Evaluation Method
    Chapter 3. Semi-Automatic Grammar Induction
        3.1 Agglomerative Clustering
            3.1.1 Spatial Clustering
            3.1.2 Temporal Clustering
            3.1.3 Free Parameters
        3.2 Post-processing
        3.3 Chapter Summary
    Chapter 4. Application to the ATIS Domain
        4.1 The ATIS Domain
        4.2 Parameters Selection
        4.3 Unsupervised Grammar Induction
        4.4 Prior Knowledge Injection
        4.5 Evaluation
            4.5.1 Parse Coverage in Understanding
            4.5.2 Parse Errors
            4.5.3 Analysis
        4.6 Chapter Summary
    Chapter 5. Portability to Chinese
        5.1 Corpus Preparation
            5.1.1 Tokenization
        5.2 Experiments
            5.2.1 Unsupervised Grammar Induction
            5.2.2 Prior Knowledge Injection
        5.3 Evaluation
            5.3.1 Parse Coverage in Understanding
            5.3.2 Parse Errors
        5.4 Grammar Comparison Across Languages
        5.5 Chapter Summary
    Chapter 6. Bi-directional Machine Translation
        6.1 Bilingual Dictionary
        6.2 Concept Alignments
        6.3 Translation Procedures
            6.3.1 The Matching Process
            6.3.2 The Searching Process
            6.3.3 Heuristics to Aid Translation
        6.4 Evaluation
            6.4.1 Coverage
            6.4.2 Performance
        6.5 Chapter Summary
    Chapter 7. Conclusions
        7.1 Summary
        7.2 Future Work
            7.2.1 Suggested Improvements on Grammar Induction Process
            7.2.2 Suggested Improvements on Bi-directional Machine Translation
            7.2.3 Domain Portability
        7.3 Contributions
    Bibliography
    Appendix A. Original SQL Queries
    Appendix B. Induced Grammar
    Appendix C. Seeded Categories

    The use of belief networks in natural language understanding and dialog modeling.

    Wai, Chi Man Carmen. Thesis (M.Phil.)--Chinese University of Hong Kong, 2001. Includes bibliographical references (leaves 129-136). Abstracts in English and Chinese.
    Contents:
    Chapter 1. Introduction
        1.1 Overview
        1.2 Natural Language Understanding
        1.3 BNs for Handling Speech Recognition Errors
        1.4 BNs for Dialog Modeling
        1.5 Thesis Goals
        1.6 Thesis Outline
    Chapter 2. Background
        2.1 Natural Language Understanding
            2.1.1 Rule-based Approaches
            2.1.2 Stochastic Approaches
            2.1.3 Phrase-Spotting Approaches
        2.2 Handling Recognition Errors in Spoken Queries
        2.3 Spoken Dialog Systems
            2.3.1 Finite-State Networks
            2.3.2 The Form-based Approaches
            2.3.3 Sequential Decision Approaches
            2.3.4 Machine Learning Approaches
        2.4 Belief Networks
            2.4.1 Introduction
            2.4.2 Bayesian Inference
            2.4.3 Applications of the Belief Networks
        2.5 Chapter Summary
    Chapter 3. Belief Networks for Natural Language Understanding
        3.1 The ATIS Domain
        3.2 Problem Formulation
        3.3 Semantic Tagging
        3.4 Belief Networks Development
            3.4.1 Concept Selection
            3.4.2 Bayesian Inferencing
            3.4.3 Thresholding
            3.4.4 Goal Identification
        3.5 Experiments on Natural Language Understanding
            3.5.1 Comparison between Mutual Information and Information Gain
            3.5.2 Varying the Input Dimensionality
            3.5.3 Multiple Goals and Rejection
            3.5.4 Comparing Grammars
        3.6 Benchmark with Decision Trees
        3.7 Performance on Natural Language Understanding
        3.8 Handling Speech Recognition Errors in Spoken Queries
            3.8.1 Corpus Preparation
            3.8.2 Enhanced Belief Network Topology
            3.8.3 BNs for Handling Speech Recognition Errors
            3.8.4 Experiments on Handling Speech Recognition Errors
            3.8.5 Significance Testing
            3.8.6 Error Analysis
        3.9 Chapter Summary
    Chapter 4. Belief Networks for Mixed-Initiative Dialog Modeling
        4.1 The CU FOREX Domain
            4.1.1 Domain-Specific Constraints
            4.1.2 Two Interaction Modalities
        4.2 The Belief Networks
            4.2.1 Informational Goal Inference
            4.2.2 Detection of Missing / Spurious Concepts
        4.3 Integrating Two Interaction Modalities
        4.4 Incorporating Out-of-Vocabulary Words
            4.4.1 Natural Language Queries
            4.4.2 Directed Queries
        4.5 Evaluation of the BN-based Dialog Model
        4.6 Chapter Summary
    Chapter 5. Scalability and Portability of Belief Network-based Dialog Model
        5.1 Migration to the ATIS Domain
        5.2 Scalability of the BN-based Dialog Model
            5.2.1 Informational Goal Inference
            5.2.2 Detection of Missing / Spurious Concepts
            5.2.3 Context Inheritance
        5.3 Portability of the BN-based Dialog Model
            5.3.1 General Principles for Probability Assignment
            5.3.2 Performance of the BN-based Dialog Model with Hand-Assigned Probabilities
            5.3.3 Error Analysis
        5.4 Enhancements for Discourse Query Understanding
            5.4.1 Combining Trained and Handcrafted Probabilities
            5.4.2 Handcrafted Topology for BNs
            5.4.3 Performance of the Enhanced BN-based Dialog Model
        5.5 Chapter Summary
    Chapter 6. Conclusions
        6.1 Summary
        6.2 Contributions
        6.3 Future Work
    Bibliography
    Appendix A. The Two Original SQL Query
    Appendix B. The Two Grammars, GH and GsA
    Appendix C. Probability Propagation in Belief Networks
        C.1 Computing the a posteriori probability of P*(G) based on input concepts
        C.2 Computing the a posteriori probability of P*(Cj) by backward inference
    Appendix D. Total 23 Concepts for the Handcrafted BN

    Cognitive architecture of multimodal multidimensional dialogue management

    Numerous studies show that participants of real-life dialogues happen to get involved in rather dynamic non-sequential interactions. This challenges the dialogue system designs based on a reactive interlocutor paradigm and calls for dialogue systems that can be characterised as a proactive learner, accomplished multitasking planner and adaptive decision maker. Addressing this call, the thesis brings innovative integration of cognitive models into human-computer dialogue systems. This work utilises recent advances in Instance-Based Learning of Theory of Mind skills and the established Cognitive Task Analysis and ACT-R models. Cognitive Task Agents, producing detailed simulation of human learning, prediction, adaptation and decision making, are integrated in the multi-agent Dialogue Manager. The manager operates on the multidimensional information state enriched with representations based on domain- and modality-specific semantics and performs context-driven dialogue act interpretation and generation. A flexible technical framework for modular distributed dialogue system integration is designed and tested. The implemented multitasking Interactive Cognitive Tutor is evaluated as showing human-like proactive and adaptive behaviour in setting goals, choosing appropriate strategies and monitoring processes across contexts, and encouraging the user to exhibit similar metacognitive competences.

    Analysis and Design of Speech-Recognition Grammars

    Currently, most commercial speech-enabled products are constructed using grammar-based technology. Grammar design is a critical issue for good recognition accuracy. Two methods are commonly used for creating grammars: 1) to generate them automatically from a large corpus of input data which is very costly to acquire, or 2) to construct them using an iterative process involving manual design, followed by testing with end-user speech input. This is a time-consuming and very expensive process requiring expert knowledge of language design, as well as the application area. Another hurdle to the creation and use of speech-enabled applications is that expertise is also required to integrate the speech capability with the application code and to deploy the application for wide-scale use. An alternative approach, which we propose, is 1) to construct them using the iterative process described above, but to replace end-user testing by analysis of the recognition grammars using a set of grammar metrics which have been shown to be good indicators of recognition accuracy, 2) to improve recognition accuracy in the design process by encoding semantic constraints in the syntax rules of the grammar, 3) to augment the above process by generating recognition grammars automatically from specifications of the application, and 4) to use tools for creating speech-enabled applications together with an architecture for their deployment which enables expert users, as well as users who do not have expertise in language processing, to easily build speech applications and add them to the web
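The idea of analysing a recognition grammar instead of testing it with end users, and of tightening it by encoding semantic constraints in the syntax rules, can be sketched roughly as follows. The abstract does not name its grammar metrics, so the metric used here (the number of word sequences a non-recursive grammar can generate) is only an assumed stand-in, and both toy grammars are hypothetical.

```python
from math import prod

# Hypothetical grammar without semantic constraints: any action with any device.
LOOSE = {
    "COMMAND": [["ACTION", "the", "DEVICE"]],
    "ACTION": [["open"], ["close"], ["switch", "on"]],
    "DEVICE": [["door"], ["window"], ["light"]],
}

# Hypothetical grammar with the constraint encoded in the syntax rules:
# only openable things can be opened or closed, only lights can be switched on.
CONSTRAINED = {
    "COMMAND": [["MOVE", "the", "OPENABLE"], ["switch", "on", "the", "SWITCHABLE"]],
    "MOVE": [["open"], ["close"]],
    "OPENABLE": [["door"], ["window"]],
    "SWITCHABLE": [["light"]],
}


def count_sentences(grammar, symbol):
    """Count derivations from `symbol`; assumes a non-recursive grammar.

    For an unambiguous grammar this equals the number of distinct sentences.
    Symbols that are not keys of `grammar` are treated as terminal words.
    """
    if symbol not in grammar:
        return 1
    return sum(
        prod(count_sentences(grammar, part) for part in alternative)
        for alternative in grammar[symbol]
    )
```

Under these assumptions the loose grammar admits 9 commands and the constrained one 5, since sequences such as "switch on the door" are no longer accepted. A smaller set of competing word sequences is one plausible reason a constrained grammar could be easier to recognize against, although the abstract itself does not state which metrics it relies on.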