    Machine Translation on a parallel Code-Switched Corpus

    Code-switching (CS) is the phenomenon that occurs when a speaker alternates between two or more languages within an utterance or discourse. In this work, we investigate the existence of code-switching in formal text, namely the proceedings of multilingual institutions. Our study is carried out on Arabic-English code-mixing in a parallel corpus extracted from official documents of the United Nations. We build a parallel code-switched corpus with two reference translations, one in pure Arabic and the other in pure English. We also carry out a human evaluation of this resource, with the aim of using it to evaluate the translation of code-switched documents. To the best of our knowledge, no corpus of this kind exists yet. This paper examines several methods for translating the code-switched corpus: conventional statistical machine translation, end-to-end neural machine translation, and multitask learning.

    The Effect of Alignment Objectives on Code-Switching Translation

    Machine translation models need to get better at translating code-switched content, especially given the rise of social media and user-generated content. In this paper, we propose a way of training a single machine translation model that can translate monolingual sentences from one language to another, as well as code-switched sentences into either language. This model can be considered bilingual in the human sense. To make better use of parallel data, we generate synthetic code-switched (CSW) data and add an alignment loss on the encoder to align representations across languages. On the WMT14 English-French (En-Fr) dataset, the trained model strongly outperforms bidirectional baselines on code-switched translation while maintaining quality on non-code-switched (monolingual) data. Comment: This paper was originally submitted on 30/06/202

    Data Augmentation Techniques for Machine Translation of Code-Switched Texts: A Comparative Study

    Code-switching (CSW) text generation has been receiving increasing attention as a way to address data scarcity. In light of this growing interest, more comprehensive studies comparing different augmentation approaches are needed. In this work, we compare three popular approaches: lexical replacements, linguistic theories, and back-translation (BT), in the context of Egyptian Arabic-English CSW. We assess the effectiveness of the approaches on machine translation and the quality of the augmentations through human evaluation. We show that BT and CSW predictive-based lexical replacement, both trained on CSW parallel data, perform best on both tasks. Linguistic theories and random lexical replacement prove effective in the absence of CSW parallel data, where the two approaches achieve similar results. Comment: Findings of EMNLP 202

    Modelling causality in law = Modélisation de la causalité en droit

    The machine learning community's interest in causality has increased significantly in recent years. This trend has not yet caught on in AI & Law. It should, because the current associative ML approach has limitations that causal analysis may overcome. This research paper aims to discover whether formal causal frameworks can be used in AI & Law. We proceed with a brief account of scholarship on reasoning and causality in science and in law. Traditionally, the normative frameworks for reasoning have been logic and rationality, but the dual theory has shown that human decision-making depends on many factors that defy rationality. As such, statistics and probability were called for to improve the prediction of decisional outcomes. In law, causal frameworks have been defined by landmark decisions, but most AI & Law models today do not involve causal analysis. We provide a brief summary of these models and then apply Judea Pearl's structural language and the Halpern-Pearl definitions of actual causality to model a few Canadian legal decisions that involve causality. The results suggest that using formal causal models to describe legal decisions is not only possible but also useful, because a uniform schema eliminates ambiguity. Causal frameworks are also helpful in promoting accountability and minimizing biases.