6 research outputs found

On the Importance of Word Order Information in Cross-lingual Sequence Labeling

Author: Cahyawijaya Samuel
Fung Pascale
Lin Zhaojiang
Liu Zihan
Madotto Andrea
Winata Genta Indra
Publication date: 08/12/2020

Word order generally varies across languages. In this paper, we hypothesize that cross-lingual models that fit to the word order of the source language might fail to handle target languages. To verify this hypothesis, we investigate whether making models insensitive to the word order of the source language can improve adaptation performance in target languages. To do so, we reduce the source-language word order information fitted to sequence encoders and observe the resulting performance changes. In addition, based on this hypothesis, we propose a new method for fine-tuning multilingual BERT on downstream cross-lingual sequence labeling tasks. Experimental results on dialogue natural language understanding, part-of-speech tagging, and named entity recognition tasks show that reducing the word order information fitted to the model can achieve better zero-shot cross-lingual performance. Furthermore, our proposed methods can also be applied to strong cross-lingual baselines and improve their performance.
Comment: Accepted in AAAI-202

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

How Decoding Strategies Affect the Verifiability of Generated Text

Author: Massarelli Luca
Ott Myle
Petroni Fabio
Piktus Aleksandra
Plachouras Vassilis
Riedel Sebastian
Rocktäschel Tim
Silvestri Fabrizio
Publication date: 20/11/2019

Recent progress in pre-trained language models has led to systems that are able to generate text of increasingly high quality. While several works have investigated the fluency and grammatical correctness of such models, it is still unclear to what extent the generated text is consistent with factual world knowledge. Here, we go beyond fluency and also investigate the verifiability of text generated by state-of-the-art pre-trained language models. A generated sentence is verifiable if it can be corroborated or disproved by Wikipedia, and we find that the verifiability of generated text strongly depends on the decoding strategy. In particular, we discover a tradeoff between factuality (i.e., the ability to generate Wikipedia-corroborated text) and repetitiveness. While decoding strategies such as top-k and nucleus sampling lead to less repetitive generations, they also produce less verifiable text. Based on these findings, we introduce a simple and effective decoding strategy which, in comparison to previously used decoding strategies, produces less repetitive and more verifiable text.
Comment: accepted at Findings of EMNLP 202

arXiv.org e-Print Archive

Crossref

UCL Discovery

Archivio della ricerca- Università di Roma La Sapienza

Itzulpen automatiko gainbegiratu gabea [Unsupervised machine translation]

Author: Artexe Zurutuza Mikel
Publication date: 29/07/2020

192 p. Modern machine translation relies on strong supervision in the form of parallel corpora. Such a requirement greatly departs from the way in which humans acquire language, and poses a major practical problem for low-resource language pairs. In this thesis, we develop a new paradigm that removes the dependency on parallel data altogether, relying on nothing but monolingual corpora to train unsupervised machine translation systems. For that purpose, our approach first aligns separately trained word representations in different languages based on their structural similarity, and uses them to initialize either a neural or a statistical machine translation system, which is further trained through iterative back-translation. While previous attempts at learning machine translation systems from monolingual corpora had strong limitations, our work, along with other contemporaneous developments, is the first to report positive results in standard, large-scale settings, establishing the foundations of unsupervised machine translation and opening exciting opportunities for future research.

Archivo Digital para la Docencia y la Investigación