103 research outputs found

    Transfer Learning for Speech and Language Processing

    Full text link
    Transfer learning is a vital technique that generalizes models trained for one setting or task to other settings or tasks. For example in speech recognition, an acoustic model trained for one language can be used to recognize speech in another language, with little or no re-training data. Transfer learning is closely related to multi-task learning (cross-lingual vs. multilingual), and is traditionally studied in the name of `model adaptation'. Recent advance in deep learning shows that transfer learning becomes much easier and more effective with high-level abstract features learned by deep models, and the `transfer' can be conducted not only between data distributions and data types, but also between model structures (e.g., shallow nets and deep nets) or even model types (e.g., Bayesian models and neural models). This review paper summarizes some recent prominent research towards this direction, particularly for speech and language processing. We also report some results from our group and highlight the potential of this very interesting research field.Comment: 13 pages, APSIPA 201

    XL-NBT: A Cross-lingual Neural Belief Tracking Framework

    Full text link
    Task-oriented dialog systems are becoming pervasive, and many companies heavily rely on them to complement human agents for customer service in call centers. With globalization, the need for providing cross-lingual customer support becomes more urgent than ever. However, cross-lingual support poses great challenges---it requires a large amount of additional annotated data from native speakers. In order to bypass the expensive human annotation and achieve the first step towards the ultimate goal of building a universal dialog system, we set out to build a cross-lingual state tracking framework. Specifically, we assume that there exists a source language with dialog belief tracking annotations while the target languages have no annotated dialog data of any form. Then, we pre-train a state tracker for the source language as a teacher, which is able to exploit easy-to-access parallel data. We then distill and transfer its own knowledge to the student state tracker in target languages. We specifically discuss two types of common parallel resources: bilingual corpus and bilingual dictionary, and design different transfer learning strategies accordingly. Experimentally, we successfully use English state tracker as the teacher to transfer its knowledge to both Italian and German trackers and achieve promising results.Comment: 13 pages, 5 figures, 3 tables, accepted to EMNLP 2018 conferenc

    PersoNER: Persian named-entity recognition

    Full text link
    © 1963-2018 ACL. Named-Entity Recognition (NER) is still a challenging task for languages with low digital resources. The main difficulties arise from the scarcity of annotated corpora and the consequent problematic training of an effective NER pipeline. To abridge this gap, in this paper we target the Persian language that is spoken by a population of over a hundred million people world-wide. We first present and provide ArmanPerosNERCorpus, the first manually-annotated Persian NER corpus. Then, we introduce PersoNER, an NER pipeline for Persian that leverages a word embedding and a sequential max-margin classifier. The experimental results show that the proposed approach is capable of achieving interesting MUC7 and CoNNL scores while outperforming two alternatives based on a CRF and a recurrent neural network

    Reordering in statistical machine translation

    Get PDF
    PhDMachine translation is a challenging task that its difficulties arise from several characteristics of natural language. The main focus of this work is on reordering as one of the major problems in MT and statistical MT, which is the method investigated in this research. The reordering problem in SMT originates from the fact that not all the words in a sentence can be consecutively translated. This means words must be skipped and be translated out of their order in the source sentence to produce a fluent and grammatically correct sentence in the target language. The main reason that reordering is needed is the fundamental word order differences between languages. Therefore, reordering becomes a more dominant issue, the more source and target languages are structurally different. The aim of this thesis is to study the reordering phenomenon by proposing new methods of dealing with reordering in SMT decoders and evaluating the effectiveness of the methods and the importance of reordering in the context of natural language processing tasks. In other words, we propose novel ways of performing the decoding to improve the reordering capabilities of the SMT decoder and in addition we explore the effect of improving the reordering on the quality of specific NLP tasks, namely named entity recognition and cross-lingual text association. Meanwhile, we go beyond reordering in text association and present a method to perform cross-lingual text fragment alignment, based on models of divergence from randomness. The main contribution of this thesis is a novel method named dynamic distortion, which is designed to improve the ability of the phrase-based decoder in performing reordering by adjusting the distortion parameter based on the translation context. The model employs a discriminative reordering model, which is combining several fea- 2 tures including lexical and syntactic, to predict the necessary distortion limit for each sentence and each hypothesis expansion. The discriminative reordering model is also integrated into the decoder as an extra feature. The method achieves substantial improvements over the baseline without increase in the decoding time by avoiding reordering in unnecessary positions. Another novel method is also presented to extend the phrase-based decoder to dynamically chunk, reorder, and apply phrase translations in tandem. Words inside the chunks are moved together to enable the decoder to make long-distance reorderings to capture the word order differences between languages with different sentence structures. Another aspect of this work is the task-based evaluation of the reordering methods and other translation algorithms used in the phrase-based SMT systems. With more successful SMT systems, performing multi-lingual and cross-lingual tasks through translating becomes more feasible. We have devised a method to evaluate the performance of state-of-the art named entity recognisers on the text translated by a SMT decoder. Specifically, we investigated the effect of word reordering and incorporating reordering models in improving the quality of named entity extraction. In addition to empirically investigating the effect of translation in the context of crosslingual document association, we have described a text fragment alignment algorithm to find sections of the two documents in different languages, that are content-wise related. The algorithm uses similarity measures based on divergence from randomness and word-based translation models to perform text fragment alignment on a collection of documents in two different languages. All the methods proposed in this thesis are extensively empirically examined. We have tested all the algorithms on common translation collections used in different evaluation campaigns. Well known automatic evaluation metrics are used to compare the suggested methods to a state-of-the art baseline and results are analysed and discussed

    Multilingual Spoken Language Understanding using graphs and multiple translations

    Full text link
    This is the author’s version of a work that was accepted for publication in Computer Speech and Language. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Computer Speech and Language, vol. 38 (2016). DOI 10.1016/j.csl.2016.01.002.In this paper, we present an approach to multilingual Spoken Language Understanding based on a process of generalization of multiple translations, followed by a specific methodology to perform a semantic parsing of these combined translations. A statistical semantic model, which is learned from a segmented and labeled corpus, is used to represent the semantics of the task in a language. Our goal is to allow the users to interact with the system using other languages different from the one used to train the semantic models, avoiding the cost of segmenting and labeling a training corpus for each language. In order to reduce the effect of translation errors and to increase the coverage, we propose an algorithm to generate graphs of words from different translations. We also propose an algorithm to parse graphs of words with the statistical semantic model. The experimental results confirm the good behavior of this approach using French and English as input languages in a spoken language understanding task that was developed for Spanish. (C) 2016 Elsevier Ltd. All rights reserved.This work is partially supported by the Spanish MEC under contract TIN2014-54288-C4-3-R and by the Spanish MICINN under FPU Grant AP2010-4193.Calvo Lance, M.; Hurtado Oliver, LF.; García-Granada, F.; Sanchís Arnal, E.; Segarra Soriano, E. (2016). Multilingual Spoken Language Understanding using graphs and multiple translations. Computer Speech and Language. 38:86-103. https://doi.org/10.1016/j.csl.2016.01.002S861033

    Cross-Language Speech Emotion Recognition Using Multimodal Dual Attention Transformers

    Full text link
    Despite the recent progress in speech emotion recognition (SER), state-of-the-art systems are unable to achieve improved performance in cross-language settings. In this paper, we propose a Multimodal Dual Attention Transformer (MDAT) model to improve cross-language SER. Our model utilises pre-trained models for multimodal feature extraction and is equipped with a dual attention mechanism including graph attention and co-attention to capture complex dependencies across different modalities and achieve improved cross-language SER results using minimal target language data. In addition, our model also exploits a transformer encoder layer for high-level feature representation to improve emotion classification accuracy. In this way, MDAT performs refinement of feature representation at various stages and provides emotional salient features to the classification layer. This novel approach also ensures the preservation of modality-specific emotional information while enhancing cross-modality and cross-language interactions. We assess our model's performance on four publicly available SER datasets and establish its superior effectiveness compared to recent approaches and baseline models.Comment: Under Review IEEE TM

    Translation Alignment Applied to Historical Languages: methods, evaluation, applications, and visualization

    Get PDF
    Translation alignment is an essential task in Digital Humanities and Natural Language Processing, and it aims to link words/phrases in the source text with their translation equivalents in the translation. In addition to its importance in teaching and learning historical languages, translation alignment builds bridges between ancient and modern languages through which various linguistics annotations can be transferred. This thesis focuses on word-level translation alignment applied to historical languages in general and Ancient Greek and Latin in particular. As the title indicates, the thesis addresses four interdisciplinary aspects of translation alignment. The starting point was developing Ugarit, an interactive annotation tool to perform manual alignment aiming to gather training data to train an automatic alignment model. This effort resulted in more than 190k accurate translation pairs that I used for supervised training later. Ugarit has been used by many researchers and scholars also in the classroom at several institutions for teaching and learning ancient languages, which resulted in a large, diverse crowd-sourced aligned parallel corpus allowing us to conduct experiments and qualitative analysis to detect recurring patterns in annotators’ alignment practice and the generated translation pairs. Further, I employed the recent advances in NLP and language modeling to develop an automatic alignment model for historical low-resourced languages, experimenting with various training objectives and proposing a training strategy for historical languages that combines supervised and unsupervised training with mono- and multilingual texts. Then, I integrated this alignment model into other development workflows to project cross-lingual annotations and induce bilingual dictionaries from parallel corpora. Evaluation is essential to assess the quality of any model. To ensure employing the best practice, I reviewed the current evaluation procedure, defined its limitations, and proposed two new evaluation metrics. Moreover, I introduced a visual analytics framework to explore and inspect alignment gold standard datasets and support quantitative and qualitative evaluation of translation alignment models. Besides, I designed and implemented visual analytics tools and reading environments for parallel texts and proposed various visualization approaches to support different alignment-related tasks employing the latest advances in information visualization and best practice. Overall, this thesis presents a comprehensive study that includes manual and automatic alignment techniques, evaluation methods and visual analytics tools that aim to advance the field of translation alignment for historical languages

    Getting Past the Language Gap: Innovations in Machine Translation

    Get PDF
    In this chapter, we will be reviewing state of the art machine translation systems, and will discuss innovative methods for machine translation, highlighting the most promising techniques and applications. Machine translation (MT) has benefited from a revitalization in the last 10 years or so, after a period of relatively slow activity. In 2005 the field received a jumpstart when a powerful complete experimental package for building MT systems from scratch became freely available as a result of the unified efforts of the MOSES international consortium. Around the same time, hierarchical methods had been introduced by Chinese researchers, which allowed the introduction and use of syntactic information in translation modeling. Furthermore, the advances in the related field of computational linguistics, making off-the-shelf taggers and parsers readily available, helped give MT an additional boost. Yet there is still more progress to be made. For example, MT will be enhanced greatly when both syntax and semantics are on board: this still presents a major challenge though many advanced research groups are currently pursuing ways to meet this challenge head-on. The next generation of MT will consist of a collection of hybrid systems. It also augurs well for the mobile environment, as we look forward to more advanced and improved technologies that enable the working of Speech-To-Speech machine translation on hand-held devices, i.e. speech recognition and speech synthesis. We review all of these developments and point out in the final section some of the most promising research avenues for the future of MT

    Practical Natural Language Processing for Low-Resource Languages.

    Full text link
    As the Internet and World Wide Web have continued to gain widespread adoption, the linguistic diversity represented has also been growing. Simultaneously the field of Linguistics is facing a crisis of the opposite sort. Languages are becoming extinct faster than ever before and linguists now estimate that the world could lose more than half of its linguistic diversity by the year 2100. This is a special time for Computational Linguistics; this field has unprecedented access to a great number of low-resource languages, readily available to be studied, but needs to act quickly before political, social, and economic pressures cause these languages to disappear from the Web. Most work in Computational Linguistics and Natural Language Processing (NLP) focuses on English or other languages that have text corpora of hundreds of millions of words. In this work, we present methods for automatically building NLP tools for low-resource languages with minimal need for human annotation in these languages. We start first with language identification, specifically focusing on word-level language identification, an understudied variant that is necessary for processing Web text and develop highly accurate machine learning methods for this problem. From there we move onto the problems of part-of-speech tagging and dependency parsing. With both of these problems we extend the current state of the art in projected learning to make use of multiple high-resource source languages instead of just a single language. In both tasks, we are able to improve on the best current methods. All of these tools are practically realized in the "Minority Language Server," an online tool that brings these techniques together with low-resource language text on the Web. The Minority Language Server, starting with only a few words in a language can automatically collect text in a language, identify its language and tag its parts of speech. We hope that this system is able to provide a convincing proof of concept for the automatic collection and processing of low-resource language text from the Web, and one that can hopefully be realized before it is too late.PhDComputer Science and EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/113373/1/benking_1.pd
    corecore