A Survey of Word Reordering in Statistical Machine Translation: Computational Models and Language Phenomena
Word reordering is one of the most difficult aspects of statistical machine
translation (SMT), and an important factor in its quality and efficiency.
Despite the vast amount of research published to date, the interest of the
community in this problem has not decreased, and no single method appears to be
strongly dominant across language pairs. Instead, the choice of the optimal
approach for a new translation task still seems to be mostly driven by
empirical trials. To orientate the reader in this vast and complex research
area, we present a comprehensive survey of word reordering viewed as a
statistical modeling challenge and as a natural language phenomenon. The survey
describes in detail how word reordering is modeled within different
string-based and tree-based SMT frameworks and as a stand-alone task, including
systematic overviews of the literature in advanced reordering modeling. We then
question why some approaches are more successful than others in different
language pairs. We argue that, besides measuring the amount of reordering, it
is important to understand which kinds of reordering occur in a given language
pair. To this end, we conduct a qualitative analysis of word reordering
phenomena in a diverse sample of language pairs, based on a large collection of
linguistic knowledge. Empirical results in the SMT literature are shown to
support the hypothesis that a few linguistic facts can be very useful to
anticipate the reordering characteristics of a language pair and to select the
SMT framework that best suits them.Comment: 44 pages, to appear in Computational Linguistic
Neural network-based factoid question answering and question generation in Finnish
Automatic question answering and question generation are two closely related natural language processing tasks. Both have been studied for decades, and both have a wide range of uses. While systems that can answer questions formed in natural language can help with all kinds of information needs, automatic question generation can be used, for example, to automatically create reading comprehension tasks and to improve the interactivity of virtual assistants. Currently, the best results in both question answering and question generation are obtained by utilizing pre-trained neural language models based on the transformer architecture. Such models are typically first pre-trained with raw language data and then fine-tuned for various tasks using task-specific annotated datasets.
So far, no models that can answer or generate questions purely in Finnish have been reported. In order to create them using modern transformer-based methods, both a pre-trained language model and a sufficiently large dataset suitable for fine-tuning on question answering or question generation are required. Although some suitable models pre-trained with Finnish or multilingual data are already available, the main bottleneck is the lack of annotated data needed for fine-tuning.
In this thesis, I create the first transformer-based neural network models for Finnish question answering and question generation. I present a method for creating a dataset for fine-tuning pre-trained models for the two tasks. The dataset creation is based on automatic translation of an existing dataset (SQuAD) and automatic normalization of the translated data. Using the created dataset, I fine-tune several pre-trained models to answer and generate questions in Finnish and evaluate their performance. I use monolingual BERT and GPT-2 models as well as a multilingual BERT model.
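One practical issue in translating a SQuAD-style dataset is that the character offsets of answer spans no longer hold after machine translation, so some normalization must re-anchor each answer in its translated context. The thesis does not spell out its exact procedure; the sketch below shows one plausible realignment step, with all names and data invented for illustration.

```python
# Illustrative sketch only: re-anchor a SQuAD-style answer span after the
# context and answer have been machine-translated separately. Examples whose
# answer no longer appears verbatim in the translated context are dropped.
# (This is an assumed normalization step, not the thesis's exact method.)

def realign_answer(translated_context: str, translated_answer: str):
    """Return a SQuAD-style answer dict, or None if the answer is lost."""
    start = translated_context.find(translated_answer)
    if start == -1:
        return None  # answer not recoverable verbatim; discard the example
    return {"text": translated_answer, "answer_start": start}

example = realign_answer(
    "Helsinki on Suomen pÀÀkaupunki.",  # translated context
    "Helsinki",                          # translated answer
)
```

Filtering out unrecoverable spans in this way trades dataset size for label quality, which matters when the fine-tuning data is synthetic to begin with.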
The results show that the transformer architecture is well suited also for Finnish question answering and question generation. They also indicate that the synthetically generated dataset can be a useful fine-tuning resource for these tasks. The best results in both tasks are obtained by fine-tuned BERT models which have been pre-trained with only Finnish data. The fine-tuned multilingual BERT models come in close, whereas fine-tuned GPT-2 models are generally found to underperform.
The data developed for this thesis will be released to the research community to support future research on question answering and generation, and the models will be released as benchmarks.
Domain adaptation for neural machine translation
The development of deep learning techniques has allowed Neural Machine Translation (NMT) models to become extremely powerful, given sufficient training data and training time. However, such translation models struggle when translating text of a specific domain. A domain may consist of text on a well-defined topic, text of unknown provenance with an identifiable vocabulary distribution, or language with some other stylometric feature. While NMT models can achieve good translation performance on domain-specific data via simple tuning on a representative training corpus, such data-centric approaches have negative side-effects. These include over-fitting, brittleness, and 'catastrophic forgetting' of previous training examples.
In this thesis we instead explore more robust approaches to domain adaptation for NMT. We consider the case where a system is adapted to a specified domain of interest, but may also need to accommodate new language, or domain-mismatched sentences. We explore techniques relating to data selection and curriculum, model parameter adaptation procedure, and inference procedure. We show that iterative fine-tuning can achieve strong performance over multiple related domains, and that Elastic Weight Consolidation can be used to mitigate catastrophic forgetting in NMT domain adaptation across multiple sequential domains. We develop a robust variant of Minimum Risk Training which allows more beneficial use of small, highly domain-specific tuning sets than simple cross-entropy fine-tuning, and can mitigate exposure bias resulting from domain over-fitting. We extend Bayesian Interpolation inference schemes to Neural Machine Translation, allowing adaptive weighting of NMT ensembles to translate text from an unknown domain.
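The Elastic Weight Consolidation mentioned above penalizes movement of parameters that were important for the previous domain, weighted by an estimate of the diagonal Fisher information. The sketch below gives the standard EWC objective in minimal form; it is the generic formulation, not the thesis's NMT-specific implementation.

```python
# Minimal sketch of the Elastic Weight Consolidation (EWC) penalty:
#   L(theta) = L_task(theta) + (lambda/2) * sum_i F_i * (theta_i - theta*_i)^2
# where theta* are the parameters learned on the old domain and F_i is the
# diagonal Fisher information estimated there. Parameters with high F_i are
# anchored more strongly, mitigating catastrophic forgetting.

def ewc_penalty(theta, theta_star, fisher, lam):
    """Quadratic penalty pulling theta toward theta_star, scaled by Fisher."""
    return 0.5 * lam * sum(
        f * (t - ts) ** 2 for t, ts, f in zip(theta, theta_star, fisher)
    )

def total_loss(task_loss, theta, theta_star, fisher, lam):
    """New-domain task loss plus the EWC regularizer."""
    return task_loss + ewc_penalty(theta, theta_star, fisher, lam)
```

With lambda set to zero this reduces to plain fine-tuning; larger values trade new-domain fit for retention of the old domain.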
Finally we demonstrate the benefit of multi-domain adaptation approaches for other lines of NMT research. We show that NMT systems using multiple forms of data representation can benefit from multi-domain inference approaches. We also demonstrate a series of domain adaptation approaches to mitigating the effects of gender bias in machine translation.
Semantic Parsing in Limited Resource Conditions
This thesis explores challenges in semantic parsing, specifically focusing on
scenarios with limited data and computational resources. It offers solutions
using techniques like automatic data curation, knowledge transfer, active
learning, and continual learning.
For tasks with no parallel training data, the thesis proposes generating
synthetic training examples from structured database schemas. When there is
abundant data in a source domain but limited parallel data in a target domain,
knowledge from the source is leveraged to improve parsing in the target domain.
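Generating synthetic examples from structured database schemas can be illustrated with a toy template-based generator. The templates and schema below are invented for illustration; the thesis's actual generation procedure is not reproduced here.

```python
# Toy sketch of template-based synthetic data generation from a database
# schema, in the spirit of the approach described above: each (table, column)
# pair yields one synthetic (question, SQL) training example.
# (Templates and schema are hypothetical, for illustration only.)

def generate_examples(schema):
    """Yield (question, SQL) pairs derived from table and column names."""
    for table, columns in schema.items():
        for col in columns:
            yield (
                f"What is the {col} of each {table}?",
                f"SELECT {col} FROM {table}",
            )

schema = {"employee": ["name", "salary"]}
pairs = list(generate_examples(schema))
```

Even such crude pairs can bootstrap a parser when no parallel data exists, since the schema itself supplies the target-side vocabulary.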
For multilingual situations with limited data in the target languages, the
thesis introduces a method to adapt parsers using a limited human translation
budget. Active learning is applied to select source-language samples for manual
translation, maximizing parser performance in the target language. In addition,
an alternative method is also proposed to utilize machine translation services,
supplemented by human-translated data, to train a more effective parser.
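The active-learning step above can be sketched with a simple least-confidence acquisition function: spend the human-translation budget on the source-language sentences the current parser is least sure about. Least-confidence is one common criterion; the thesis's exact acquisition function is not specified here, and the data below is invented.

```python
# Sketch of uncertainty-based active learning: rank unlabeled source-language
# sentences by the current parser's confidence and send the least-confident
# ones for human translation, up to a fixed budget.
# (Least-confidence is an assumed criterion, for illustration.)

def select_for_translation(samples, confidence, budget):
    """Return the `budget` samples with the lowest model confidence."""
    ranked = sorted(samples, key=confidence)
    return ranked[:budget]

sents = ["show flights", "cheapest fare to Boston", "book a table"]
conf = {"show flights": 0.95, "cheapest fare to Boston": 0.40,
        "book a table": 0.70}
picked = select_for_translation(sents, conf.get, budget=2)
```

Spending the budget where the model is weakest tends to yield more parser improvement per translated sentence than random selection.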
When computational resources are limited, a continual learning approach is
introduced to minimize training time and computational memory. This maintains
the parser's efficiency in previously learned tasks while adapting it to new
tasks, mitigating the problem of catastrophic forgetting.
Overall, the thesis provides a comprehensive set of methods to improve
semantic parsing in resource-constrained conditions.
Comment: PhD thesis, year of award 2023, 172 pages
DeepCause: Hypothesis Extraction from Information Systems Papers with Deep Learning for Theory Ontology Learning
This paper applies different deep learning architectures for sequence labelling to extract causes, effects, moderators, and mediators from hypotheses of information systems papers for theory ontology learning. We compared a variety of recurrent neural network (RNN) architectures, such as long short-term memory (LSTM), bidirectional LSTM (BiLSTM), simple RNNs, and gated recurrent units (GRU). We analyzed GloVe word embeddings, character-level vector representations of words, and part-of-speech (POS) tags. Furthermore, we evaluated various hyperparameters and architectures to achieve the highest performance scores. The prototype was evaluated on hypotheses from the AIS basket of eight. The F1 result for the sequence labelling task of causal variables on a chunk level was 80%, with a precision of 80% and a recall of 80%.
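The reported scores are consistent: F1 is the harmonic mean of precision and recall, so equal precision and recall of 0.80 yield an F1 of 0.80. A minimal check:

```python
# F1 is the harmonic mean of precision and recall; with precision and recall
# both at 0.80, as reported above, F1 is also 0.80.

def f1_score(precision: float, recall: float) -> float:
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both are zero
    return 2 * precision * recall / (precision + recall)
```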