Adaptation of machine translation models with back-translated data using transductive data selection methods
Data selection has proven its merit for improving Neural Machine Translation (NMT) when applied to authentic data. But the benefit of using synthetic data in NMT training, produced by the popular back-translation technique, raises the question of whether data selection could also be useful for synthetic data. In this work we use Infrequent n-gram Recovery (INR) and Feature Decay Algorithms (FDA), two transductive data selection methods, to obtain subsets of sentences from synthetic data. These methods ensure that the selected sentences share n-grams with the test set, so the NMT model can be adapted to translate it. Performing data selection on back-translated data creates new challenges, as the source side may contain noise originating from the model used in the back-translation. Hence, finding n-grams present in the test set becomes more difficult. Despite that, we show that adapting a model with a selection of synthetic data is a useful approach.
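The transductive selection idea described here can be illustrated with a minimal sketch of Infrequent n-gram Recovery: keep adding pool sentences that still contribute test-set n-grams seen fewer than a threshold number of times in the selection so far. The function names, the threshold value, and the whitespace tokenization are illustrative assumptions, not the paper's implementation.

```python
from collections import Counter


def inr_select(pool, test_set, threshold=2, max_n=2):
    """INR-style sketch: select sentences that cover test-set
    n-grams still seen fewer than `threshold` times in the
    selection. Hypothetical simplification of the method."""

    def ngrams(sent, max_n):
        toks = sent.split()
        return [tuple(toks[i:i + n])
                for n in range(1, max_n + 1)
                for i in range(len(toks) - n + 1)]

    # n-grams we want the adapted model to have seen in training
    needed = set()
    for s in test_set:
        needed.update(ngrams(s, max_n))

    counts = Counter()   # how often each needed n-gram is covered
    selected = []
    for s in pool:
        gain = [g for g in ngrams(s, max_n)
                if g in needed and counts[g] < threshold]
        if gain:             # sentence still recovers infrequent n-grams
            selected.append(s)
            counts.update(gain)
    return selected
```

On noisy back-translated source text, as the abstract notes, fewer pool sentences share n-grams with the test set, so the `gain` list is empty more often and the selected subset shrinks.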
Feature decay algorithms for neural machine translation
Neural Machine Translation (NMT) systems require a lot of data to be competitive. For this reason, data selection techniques are typically used only for fine-tuning systems that have been trained with larger amounts of data. In this work we aim to use Feature Decay Algorithms (FDA) data selection techniques not only to fine-tune a system but also to build a complete system with less data. Our findings reveal that it is possible to find a subset of sentence pairs that, when used for training a German-English NMT system, outperforms the full training corpus by 1.11 BLEU points.
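The core FDA mechanism can be sketched as a greedy loop: each candidate sentence is scored by the value of the test-set n-grams it contains, and every time an n-gram enters the selection its value decays, pushing later picks toward still-uncovered n-grams. The decay factor, scoring, and tokenization below are illustrative assumptions rather than the exact algorithm used in the paper.

```python
def extract_ngrams(sentence, max_n=3):
    """All n-grams up to max_n from a whitespace-tokenized sentence."""
    tokens = sentence.split()
    return {tuple(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)}


def fda_select(pool, test_set, k, decay=0.5, max_n=3):
    """FDA-style greedy selection sketch: pick the sentence with the
    highest total n-gram value, then decay the value of the n-grams
    it covered so later picks favour unseen test-set n-grams."""
    test_ngrams = set()
    for s in test_set:
        test_ngrams |= extract_ngrams(s, max_n)
    value = {g: 1.0 for g in test_ngrams}   # initial feature values

    remaining = list(pool)
    selected = []
    for _ in range(min(k, len(remaining))):
        best, best_score = None, 0.0
        for s in remaining:
            score = sum(value.get(g, 0.0) for g in extract_ngrams(s, max_n))
            if score > best_score:
                best, best_score = s, score
        if best is None:        # no sentence shares n-grams with the test set
            break
        selected.append(best)
        remaining.remove(best)
        for g in extract_ngrams(best, max_n) & test_ngrams:
            value[g] *= decay   # feature decay step
    return selected
```

Selecting a subset this way, and stopping once scores vanish, is how a smaller-than-full training corpus can still be tailored to the test set.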
SphereFed: Hyperspherical Federated Learning
Federated Learning aims at training a global model from multiple decentralized devices (i.e. clients) without exchanging their private local data. A key challenge is the handling of non-i.i.d. (independent and identically distributed) data across multiple clients, which may induce disparities in their local features. We introduce the Hyperspherical Federated Learning (SphereFed) framework to address the non-i.i.d. issue by constraining learned representations of data points to be on a unit hypersphere shared by clients. Specifically, all clients learn their local representations by minimizing the loss with respect to a fixed classifier whose weights span the unit hypersphere. After federated training of the global model, this classifier is further calibrated with a closed-form solution by minimizing a mean squared loss. We show that the calibration solution can be computed efficiently and distributedly without direct access to local data. Extensive experiments indicate that our SphereFed approach is able to improve the accuracy of multiple existing federated learning algorithms by a considerable margin (up to 6% on challenging datasets) with enhanced computation and communication efficiency across datasets and model architectures.
Comment: European Conference on Computer Vision 202
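The two ingredients the abstract names, a fixed classifier with weights on the unit hypersphere and a closed-form mean-squared-error calibration, can be sketched as follows. The dimensions, the ridge term for numerical stability, and all function names are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed classifier: one unit-norm weight vector per class on the
# hypersphere, shared by all clients and frozen during local training.
NUM_CLASSES, DIM = 10, 32
W_fixed = rng.standard_normal((DIM, NUM_CLASSES))
W_fixed /= np.linalg.norm(W_fixed, axis=0, keepdims=True)


def hypersphere_features(raw):
    """Constrain local representations to lie on the unit hypersphere."""
    return raw / np.linalg.norm(raw, axis=1, keepdims=True)


def calibrate_classifier(feats, labels, num_classes, ridge=1e-3):
    """Closed-form calibration sketch: solve min_W ||F W - Y||^2 via
    the normal equations. The sufficient statistics F^T F and F^T Y
    are sums over examples, so each client can contribute its own
    partial sums without sharing raw local data (ridge term is an
    assumed stabilizer, not from the paper)."""
    Y = np.eye(num_classes)[labels]                       # one-hot targets
    A = feats.T @ feats + ridge * np.eye(feats.shape[1])  # F^T F + ridge*I
    B = feats.T @ Y                                       # F^T Y
    return np.linalg.solve(A, B)
```

Because `A` and `B` decompose into per-client sums, the server can aggregate them and solve one small linear system, matching the abstract's claim that calibration is computed distributedly without direct access to local data.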