Adaptation of machine translation models with back-translated data using transductive data selection methods
Data selection has proven its merit for improving Neural Machine Translation (NMT) when applied to authentic data. But the benefit of using synthetic data in NMT training, produced by the popular back-translation technique, raises the question of whether data selection could also be useful for synthetic data. In this work we use Infrequent n-gram Recovery (INR) and Feature Decay Algorithms (FDA), two transductive data selection methods, to obtain subsets of sentences from synthetic data. These methods ensure that the selected sentences share n-grams with the test set, so the NMT model can be adapted to translate it. Performing data selection on back-translated data creates new challenges, as the source side may contain noise originating from the model used in the back-translation. Hence, finding n-grams present in the test set becomes more difficult. Despite that, we show that adapting a model with a selection of synthetic data is a useful approach.
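The transductive selection idea described here can be illustrated with a minimal sketch of Infrequent n-gram Recovery: keep adding pool sentences that still contribute test-set n-grams seen fewer than a threshold number of times in the selection so far. The function names, the threshold value, and the whitespace tokenization are illustrative assumptions, not the paper's implementation.

```python
from collections import Counter


def inr_select(pool, test_set, threshold=2, max_n=2):
    """INR-style sketch: select sentences that cover test-set
    n-grams still seen fewer than `threshold` times in the
    selection. Hypothetical simplification of the method."""

    def ngrams(sent, max_n):
        toks = sent.split()
        return [tuple(toks[i:i + n])
                for n in range(1, max_n + 1)
                for i in range(len(toks) - n + 1)]

    # n-grams we want the adapted model to have seen in training
    needed = set()
    for s in test_set:
        needed.update(ngrams(s, max_n))

    counts = Counter()   # how often each needed n-gram is covered
    selected = []
    for s in pool:
        gain = [g for g in ngrams(s, max_n)
                if g in needed and counts[g] < threshold]
        if gain:             # sentence still recovers infrequent n-grams
            selected.append(s)
            counts.update(gain)
    return selected
```

On noisy back-translated source text, as the abstract notes, fewer pool sentences share n-grams with the test set, so the `gain` list is empty more often and the selected subset shrinks.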
Feature decay algorithms for neural machine translation
Neural Machine Translation (NMT) systems require a lot of data to be competitive. For this reason, data selection techniques are typically used only for fine-tuning systems that have been trained with larger amounts of data. In this work we aim to use Feature Decay Algorithms (FDA) data selection techniques not only to fine-tune a system but also to build a complete system with less data. Our findings reveal that it is possible to find a subset of sentence pairs that, when used for training a German-English NMT system, outperforms the full training corpus by 1.11 BLEU points.
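The core FDA mechanism can be sketched as a greedy loop: each candidate sentence is scored by the value of the test-set n-grams it contains, and every time an n-gram enters the selection its value decays, pushing later picks toward still-uncovered n-grams. The decay factor, scoring, and tokenization below are illustrative assumptions rather than the exact algorithm used in the paper.

```python
def extract_ngrams(sentence, max_n=3):
    """All n-grams up to max_n from a whitespace-tokenized sentence."""
    tokens = sentence.split()
    return {tuple(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)}


def fda_select(pool, test_set, k, decay=0.5, max_n=3):
    """FDA-style greedy selection sketch: pick the sentence with the
    highest total n-gram value, then decay the value of the n-grams
    it covered so later picks favour unseen test-set n-grams."""
    test_ngrams = set()
    for s in test_set:
        test_ngrams |= extract_ngrams(s, max_n)
    value = {g: 1.0 for g in test_ngrams}   # initial feature values

    remaining = list(pool)
    selected = []
    for _ in range(min(k, len(remaining))):
        best, best_score = None, 0.0
        for s in remaining:
            score = sum(value.get(g, 0.0) for g in extract_ngrams(s, max_n))
            if score > best_score:
                best, best_score = s, score
        if best is None:        # no sentence shares n-grams with the test set
            break
        selected.append(best)
        remaining.remove(best)
        for g in extract_ngrams(best, max_n) & test_ngrams:
            value[g] *= decay   # feature decay step
    return selected
```

Selecting a subset this way, and stopping once scores vanish, is how a smaller-than-full training corpus can still be tailored to the test set.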
SphereFed: Hyperspherical Federated Learning
Federated Learning aims at training a global model from multiple decentralized devices (i.e. clients) without exchanging their private local data. A key challenge is the handling of non-i.i.d. (independent and identically distributed) data across multiple clients, which may induce disparities in their local features. We introduce the Hyperspherical Federated Learning (SphereFed) framework to address the non-i.i.d. issue by constraining learned representations of data points to be on a unit hypersphere shared by clients. Specifically, all clients learn their local representations by minimizing the loss with respect to a fixed classifier whose weights span the unit hypersphere. After federated training of the global model, this classifier is further calibrated with a closed-form solution by minimizing a mean squared loss. We show that the calibration solution can be computed efficiently and distributedly without direct access to local data. Extensive experiments indicate that our SphereFed approach is able to improve the accuracy of multiple existing federated learning algorithms by a considerable margin (up to 6% on challenging datasets) with enhanced computation and communication efficiency across datasets and model architectures.
Comment: European Conference on Computer Vision 202
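The two ingredients the abstract names, a fixed classifier with weights on the unit hypersphere and a closed-form mean-squared-error calibration, can be sketched as follows. The dimensions, the ridge term for numerical stability, and all function names are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed classifier: one unit-norm weight vector per class on the
# hypersphere, shared by all clients and frozen during local training.
NUM_CLASSES, DIM = 10, 32
W_fixed = rng.standard_normal((DIM, NUM_CLASSES))
W_fixed /= np.linalg.norm(W_fixed, axis=0, keepdims=True)


def hypersphere_features(raw):
    """Constrain local representations to lie on the unit hypersphere."""
    return raw / np.linalg.norm(raw, axis=1, keepdims=True)


def calibrate_classifier(feats, labels, num_classes, ridge=1e-3):
    """Closed-form calibration sketch: solve min_W ||F W - Y||^2 via
    the normal equations. The sufficient statistics F^T F and F^T Y
    are sums over examples, so each client can contribute its own
    partial sums without sharing raw local data (ridge term is an
    assumed stabilizer, not from the paper)."""
    Y = np.eye(num_classes)[labels]                       # one-hot targets
    A = feats.T @ feats + ridge * np.eye(feats.shape[1])  # F^T F + ridge*I
    B = feats.T @ Y                                       # F^T Y
    return np.linalg.solve(A, B)
```

Because `A` and `B` decompose into per-client sums, the server can aggregate them and solve one small linear system, matching the abstract's claim that calibration is computed distributedly without direct access to local data.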