47 research outputs found

    Zero-shot Multi-Domain Dialog State Tracking Using Descriptive Rules

    In this work, we present a framework for incorporating descriptive logical rules into state-of-the-art neural networks, enabling them to learn how to handle unseen labels without the introduction of any new training data. The rules are integrated into existing networks without modifying their architecture, through an additional term in the network's loss function that penalizes states of the network that do not obey the designed rules. As a case study, the framework is applied to an existing neural-based Dialog State Tracker. Our experiments demonstrate that the inclusion of logical rules allows the prediction of unseen labels without deteriorating the predictive capacity of the original system.
    Authors: Edgar Jaim Altszyler Lemcovich and Pablo Brusco (Universidad de Buenos Aires, Facultad de Ciencias Exactas y Naturales, Departamento de Computación; CONICET, Instituto de Investigación en Ciencias de la Computación; Argentina); Nikoletta Basiou, John Byrnes, and Dimitra Vergyri (SRI International, United States)
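The abstract describes adding a rule-violation penalty to the network's loss. A minimal sketch of that idea, assuming rules take the hypothetical form "if the premise label is predicted, the entailed label must be too" (the function names, the soft-implication penalty, and the weight `lam` are illustrative assumptions, not the paper's actual formulation):

```python
def rule_penalty(p_premise, p_conclusion):
    # Soft implication "premise -> conclusion": the penalty is positive
    # only when the network assigns more probability to the premise
    # than to the conclusion it entails.
    return max(0.0, p_premise - p_conclusion)

def total_loss(task_loss, rule_pairs, lam=1.0):
    # Original task loss plus a weighted sum of rule violations;
    # lam is a hypothetical trade-off weight, not a value from the paper.
    return task_loss + lam * sum(rule_penalty(p, q) for p, q in rule_pairs)

# A confident premise (0.9) entailing an improbable conclusion (0.2)
# is penalized; the satisfied rule (0.3 -> 0.8) adds nothing.
print(total_loss(0.5, [(0.9, 0.2), (0.3, 0.8)]))
```

Because the penalty depends only on the network's outputs, it can steer predictions toward unseen labels without requiring any labeled examples for them.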

    Automatic Diacritization of Arabic for Acoustic Modeling in Speech Recognition

    Automatic recognition of Arabic dialectal speech is a challenging task because Arabic dialects are essentially spoken varieties. Only few dialectal resources are available to date; moreover, most available acoustic data collections are transcribed without diacritics. Such a transcription omits essential pronunciation information about a word, such as short vowels. In this paper we investigate various procedures that enable us to use such training data by automatically inserting the missing diacritics into the transcription. These procedures use acoustic information in combination with different levels of morphological and contextual constraints. We evaluate their performance against manually diacritized transcriptions. In addition, we demonstrate the effect of their accuracy on the recognition performance of acoustic models trained on automatically diacritized training data.
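The core procedure above — choosing among candidate diacritizations using acoustic evidence — can be sketched as a simple argmax. Everything here is a toy stand-in: the candidate list plays the role of a morphological analyzer's output, and the score dictionary stands in for a forced-alignment acoustic likelihood:

```python
def restore_diacritics(undiacritized, candidates, score):
    # Choose the best diacritization among morphologically valid
    # candidates using a pronunciation score -- a stand-in for the
    # acoustic (forced-alignment) likelihood described above.
    return max(candidates, key=score)

# Toy candidates for the consonantal skeleton "ktb"; a real system
# would obtain them from a morphological analyzer and the scores
# from aligning each pronunciation against the audio.
candidates = ["kataba", "kutiba", "kutub"]
acoustic_log_likelihood = {"kataba": -12.3, "kutiba": -15.1, "kutub": -14.0}
best = restore_diacritics("ktb", candidates, acoustic_log_likelihood.get)
print(best)  # "kataba" under these toy scores
```

The morphological and contextual constraints mentioned in the abstract would act by pruning the candidate list before the acoustic argmax is taken.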

    Morphology-based language modeling for Arabic speech recognition

    Language modeling is a difficult problem for languages with rich morphology. In this paper we investigate the use of morphology-based language models at different stages in a speech recognition system for conversational Arabic. Class-based and single-stream factored language models using morphological word representations are applied within an N-best list rescoring framework. In addition, we explore the use of factored language models in first-pass recognition, which is facilitated by two novel procedures: the data-driven optimization of a multi-stream language model structure, and the conversion of a factored language model to a standard word-based model. We evaluate these techniques on a large-vocabulary recognition task and demonstrate that they lead to perplexity and word error rate reductions.
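The conversion of a factored language model to a standard word-based model can be sketched as follows: each surface word is mapped to a bundle of factors (here a toy stem and morphological class), every (history, word) pair is scored under the factored model, and the scores are renormalized over the vocabulary to yield an ordinary word bigram table. The factor streams, backoff floor, and toy probabilities below are illustrative assumptions, not the paper's actual model:

```python
def factorize(word, stems, classes):
    # Toy factor bundle: (surface form, stem, morphological class).
    return (word, stems[word], classes[word])

def factored_prob(bundle, prev, stem_model, class_model):
    # Toy two-stream factored LM: a word's probability is the product
    # of a stem-stream bigram and a class-stream bigram, with a small
    # floor standing in for backoff.
    _, stem, cls = bundle
    _, pstem, pcls = prev
    return stem_model.get((pstem, stem), 1e-6) * class_model.get((pcls, cls), 1e-6)

def to_word_bigram(vocab, stems, classes, stem_model, class_model):
    # Convert to a standard word-based model: score every (history, word)
    # pair's factor bundles and renormalize over the vocabulary.
    table = {}
    for prev in vocab:
        pb = factorize(prev, stems, classes)
        raw = {w: factored_prob(factorize(w, stems, classes), pb,
                                stem_model, class_model) for w in vocab}
        z = sum(raw.values())
        table[prev] = {w: p / z for w, p in raw.items()}
    return table

# Toy Arabic-like vocabulary sharing one stem "ktb".
vocab = ["kitab", "kutub"]
stems = {"kitab": "ktb", "kutub": "ktb"}
classes = {"kitab": "sg", "kutub": "pl"}
stem_model = {("ktb", "ktb"): 0.5}
class_model = {("sg", "sg"): 0.6, ("sg", "pl"): 0.3,
               ("pl", "sg"): 0.4, ("pl", "pl"): 0.2}
bigram = to_word_bigram(vocab, stems, classes, stem_model, class_model)
```

Once the table is built, it can be used by any decoder that expects a standard word n-gram, which is what makes first-pass recognition with factored models practical.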

    Morphology-Based Language Modeling for Speech Recognition

    EXPLOITING USER FEEDBACK FOR LANGUAGE MODEL ADAPTATION IN MEETING RECOGNITION

    We investigate language model (LM) adaptation in a meeting recognition application, where the LM is adapted based on recognition output from relevant prior meetings and partial manual corrections. Unlike previous work, which has considered either completely unsupervised or supervised adaptation, we investigate a scenario where a human (e.g., a meeting participant) can correct some of the recognition mistakes. We find that recognition accuracy using the adapted LM can be enhanced substantially by partial correction. In particular, if all content words (about half of all recognition errors) are corrected, recognition improves to the same accuracy as if completely error-free (manually created) transcriptions had been used for adaptation. We also compare and combine a variety of adaptation methods, including linear interpolation, unigram marginal adaptation, and a discriminative method based on "positive" and "negative" N-grams. Index Terms — speech processing, language modeling, meeting recognition, unsupervised adaptation, user feedback
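Of the adaptation methods named above, linear interpolation is the simplest to sketch: the adapted model is a convex combination of a background LM and one estimated from the (partially corrected) prior-meeting transcripts. The toy unigram distributions and the weight `lam` below are illustrative, not values from the paper:

```python
def interpolate(background, adapted, lam):
    # Linear interpolation of two unigram distributions:
    # P(w) = lam * P_adapted(w) + (1 - lam) * P_background(w).
    vocab = set(background) | set(adapted)
    return {w: lam * adapted.get(w, 0.0) + (1 - lam) * background.get(w, 0.0)
            for w in vocab}

# Toy models: a general background LM and one estimated from the
# partially corrected transcripts of relevant prior meetings.
background = {"the": 0.6, "meeting": 0.1, "agenda": 0.3}
adapted = {"the": 0.4, "meeting": 0.4, "agenda": 0.2}
mixed = interpolate(background, adapted, lam=0.5)
print(mixed["meeting"])  # 0.25
```

Because both inputs sum to one, the interpolated model is again a proper distribution for any `lam` in [0, 1]; in practice `lam` is typically tuned on held-out data.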