194 research outputs found
Probing the need for visual context in multimodal machine translation
Current work on multimodal machine translation (MMT) has suggested that the visual modality is either unnecessary or only marginally beneficial. We posit that this is a consequence of the very simple, short and repetitive sentences used in the only available dataset for the task (Multi30K), rendering the source text sufficient as context. In the general case, however, we believe that it is possible to combine visual and textual information in order to ground translations. In this paper we probe the contribution of the visual modality to state-of-the-art MMT models by conducting a systematic analysis in which we partially deprive the models of source-side textual context. Our results show that under limited textual context, models are capable of leveraging the visual input to generate better translations. This contradicts the current belief that MMT models disregard the visual modality because of either the quality of the image features or the way they are integrated into the model.
Incremental Adaptation Strategies for Neural Network Language Models
It is today acknowledged that neural network language models outperform backoff language models in applications like speech recognition or statistical machine translation. However, training these models on large amounts of data can take several days. We present efficient techniques to adapt a neural network language model to new data. Instead of training a completely new model or relying on mixture approaches, we propose two new methods: continued training on resampled data or insertion of adaptation layers. We present experimental results in a CAT environment where the post-edits of professional translators are used to improve an SMT system. Both methods are very fast and achieve significant improvements without overfitting the small adaptation data.
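The "insertion of adaptation layers" idea can be illustrated with a toy linear model: the base projection stays frozen and only a newly inserted layer is fit to the adaptation data. This is a hypothetical sketch, not the authors' code — the paper's adapters sit inside a neural LM and are trained by backpropagation, whereas `fit_adapter` here uses a closed-form least-squares solve for simplicity:

```python
import numpy as np

def fit_adapter(X, y, W_base):
    """Fit a linear adaptation layer A on top of a frozen base projection.

    Hypothetical sketch: the base weights W_base are kept fixed, and only
    the inserted layer A is fit to the adaptation data (X, y) by
    minimising ||X W_base A - y||^2 in closed form.
    """
    h = X @ W_base                         # frozen base representation
    A, *_ = np.linalg.lstsq(h, y, rcond=None)
    return A
```

Starting the adapter from (or regularising it toward) the identity keeps the unadapted model as the fallback, which is how such layers avoid overfitting small adaptation sets.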
Bayesian active learning with pretrained language models
Active Learning (AL) is a method to iteratively select data for annotation from a pool of unlabeled data, aiming to achieve better model performance than random selection. Previous AL approaches in Natural Language Processing (NLP) have been limited to either task-specific models that are trained from scratch at each iteration using only the labeled data at hand or off-the-shelf pretrained language models (LMs) that are not adapted effectively to the downstream task. In this paper, we address these limitations by introducing BALM: Bayesian Active Learning with pretrained language Models. We first propose to adapt the pretrained LM to the downstream task by continuing training with all the available unlabeled data and then use it for AL. We also suggest a simple yet effective fine-tuning method to ensure that the adapted LM is properly trained in both low- and high-resource scenarios during AL. We finally apply Monte Carlo dropout to the downstream model to obtain well-calibrated confidence scores for data selection with uncertainty sampling. Our experiments on five standard natural language understanding tasks demonstrate that BALM provides substantial data efficiency improvements compared to various combinations of acquisition functions, models and fine-tuning methods proposed in recent AL literature.
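The Monte Carlo dropout step can be sketched generically: keep dropout active at test time, run several stochastic forward passes over the unlabeled pool, and rank examples by the predictive entropy of the averaged softmax outputs. A minimal sketch, assuming the stochastic predictions have already been collected into an array (function names are hypothetical, not from the BALM code):

```python
import numpy as np

def predictive_entropy(mc_probs):
    """Predictive entropy of averaged MC-dropout predictions.

    mc_probs: (T, N, C) array of softmax outputs from T stochastic
    forward passes (dropout left on) over N pool examples, C classes.
    """
    mean_probs = mc_probs.mean(axis=0)                       # (N, C)
    return -(mean_probs * np.log(mean_probs + 1e-12)).sum(axis=1)

def select_uncertain(mc_probs, k):
    """Indices of the k pool examples with highest predictive entropy."""
    return np.argsort(-predictive_entropy(mc_probs))[:k]
```

Examples on which the stochastic passes disagree get a flat averaged distribution and hence high entropy, so they are selected for annotation first.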
LIUM-CVC submissions for WMT17 multimodal translation task
This paper describes the monomodal and multimodal Neural Machine Translation systems developed by LIUM and CVC for the WMT17 Shared Task on Multimodal Translation. We mainly explored two multimodal architectures where either global visual features or convolutional feature maps are integrated in order to benefit from visual context. Our final systems ranked first for both En-De and En-Fr language pairs according to the automatic evaluation metrics METEOR and BLEU.
Addressing data sparsity for neural machine translation between morphologically rich languages
Translating between morphologically rich languages is still challenging for current machine translation systems. In this paper, we experiment with various neural machine translation (NMT) architectures to address the data sparsity problem caused by data availability (quantity), domain shift and the languages involved (Arabic and French). We show that the Factored NMT (FNMT) model, which uses linguistically motivated factors, is able to outperform standard NMT systems using subword units by more than 1 BLEU point even when a large quantity of data is available. Our work shows the benefits of applying linguistic factors in NMT when faced with both low- and high-resource conditions.
Active learning by acquiring contrastive examples
Common acquisition functions for active learning use either uncertainty or diversity sampling, aiming to select difficult and diverse data points from the pool of unlabeled data, respectively. In this work, leveraging the best of both worlds, we propose an acquisition function that opts for selecting contrastive examples, i.e. data points that are similar in the model feature space yet for which the model outputs maximally different predictive likelihoods. We compare our approach, CAL (Contrastive Active Learning), with a diverse set of acquisition functions on four natural language understanding tasks and seven datasets. Our experiments show that CAL performs consistently better than or on par with the best-performing baseline across all tasks, on both in-domain and out-of-domain data. We also conduct an extensive ablation study of our method, and we further analyze all actively acquired datasets, showing that CAL achieves a better trade-off between uncertainty and diversity compared to other strategies.
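The acquisition idea can be sketched as follows: for each pool example, find its nearest neighbours in the model's feature space and score it by the divergence between its predictive distribution and theirs, so points that look alike but are predicted differently rank highest. A simplified sketch with a brute-force neighbour search (`cal_scores` is a hypothetical name; the paper's exact scoring and neighbour search may differ):

```python
import numpy as np

def kl(p, q, eps=1e-12):
    """Elementwise KL divergence KL(p || q) over the last axis."""
    return (p * (np.log(p + eps) - np.log(q + eps))).sum(-1)

def cal_scores(features, probs, k=10):
    """Contrastive score per pool example.

    features: (N, F) model feature-space representations.
    probs:    (N, C) predictive distributions.
    Score = mean KL between the neighbours' predictions and the
    example's own prediction; high score = similar features but
    disagreeing predictions (a contrastive example).
    """
    n = len(features)
    dist = np.linalg.norm(features[:, None] - features[None], axis=-1)
    np.fill_diagonal(dist, np.inf)       # exclude self from neighbours
    scores = np.empty(n)
    for i in range(n):
        nbrs = np.argsort(dist[i])[:k]   # k nearest in feature space
        scores[i] = kl(probs[nbrs], probs[i]).mean()
    return scores
```

Selecting the top-scoring examples then combines both criteria in one pass: feature-space similarity supplies diversity-style grounding while the predictive disagreement supplies uncertainty.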
Reconstructing Haemodynamics Quantities of Interest from Doppler Ultrasound Imaging
The present contribution deals with the estimation of haemodynamics quantities of interest by exploiting ultrasound Doppler measurements. A fast method is proposed, based on the PBDW method. Several methodological contributions are described: a sub-manifold partitioning is introduced to improve the reduced-order approximation, two different ways to estimate the pressure drop are compared, and an error estimation is derived. A test case on a realistic common carotid geometry is presented, showing that the proposed approach is promising in view of realistic applications.
Model Order Reduction for Rotating Electrical Machines
The simulation of electric rotating machines is both computationally expensive and memory intensive. To overcome these costs, model order reduction techniques can be applied. The focus of this contribution is especially on machines that contain non-symmetric components. These are usually introduced during the mass production process and are modeled by small perturbations in the geometry (e.g., eccentricity) or the material parameters. While model order reduction for symmetric machines is clear and does not need special treatment, the non-symmetric setting adds additional challenges. An adaptive strategy based on proper orthogonal decomposition is developed to overcome these difficulties. Equipped with an a posteriori error estimator, the obtained solution is certified. Numerical examples are presented to demonstrate the effectiveness of the proposed method.
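As background for the proper-orthogonal-decomposition step, a POD basis is typically extracted from a matrix of solution snapshots via an SVD, truncated at a prescribed energy tolerance. A generic sketch, not the authors' adaptive algorithm (`pod_basis` is a hypothetical name):

```python
import numpy as np

def pod_basis(snapshots, tol=1e-6):
    """POD basis from a snapshot matrix (n_dof x n_snapshots).

    Generic sketch: take the left singular vectors and keep the first
    r modes whose cumulative energy (squared singular values) reaches
    1 - tol of the total.
    """
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    energy = np.cumsum(s**2) / np.sum(s**2)
    r = int(np.searchsorted(energy, 1.0 - tol)) + 1
    return U[:, :r], s
```

A full-order solution `u` is then approximated in the reduced space as `V @ (V.T @ u)`, and the reduced (Galerkin) operator is `V.T @ A @ V`; an adaptive strategy like the one described above would enrich `V` whenever the error estimator exceeds its tolerance.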
Issues in Incremental Adaptation of Statistical MT from Human Post-edits
This work investigates a crucial aspect of integrating MT technology into a CAT environment, namely the ability of MT systems to adapt from user feedback. In particular, we consider the scenario of an MT system tuned for a specific translation project that, after each day of work, adapts from the post-edited translations created by the user. We apply and compare different state-of-the-art adaptation methods on post-edited translations generated by two professionals during two days of work with a CAT tool embedding MT suggestions. Both translators worked on the same legal document, from English into Italian and German, respectively. Although exactly the same amount of translations was available each day for each language, applying the same adaptation methods resulted in quite different outcomes. This suggests that adaptation strategies should not be applied blindly, but rather should take into account language-specific issues, such as data sparsity.