7,125 research outputs found

    Efficient Embedded Decoding of Neural Network Language Models in a Machine Translation System

    Full text link
    [EN] Neural Network Language Models (NNLMs) are a successful approach to Natural Language Processing tasks, such as Machine Translation. We introduce in this work a Statistical Machine Translation (SMT) system which fully integrates NNLMs in the decoding stage, breaking the traditional approach based on n-best list rescoring. The neural net models (both language models (LMs) and translation models) are fully coupled in the decoding stage, allowing to more strongly influence the translation quality. Computational issues were solved by using a novel idea based on memorization and smoothing of the softmax constants to avoid their computation, which introduces a trade-off between LM quality and computational cost. These ideas were studied in a machine translation task with different combinations of neural networks used both as translation models and as target LMs, comparing phrase-based and N-gram-based systems, showing that the integrated approach seems more promising for N-gram-based systems, even with nonfull-quality NNLMs.This work was partially supported by the Spanish MINECO and FEDER found under project TIN2017-85854-C4-2-R.Zamora Martínez, FJ.; Castro-Bleda, MJ. (2018). Efficient Embedded Decoding of Neural Network Language Models in a Machine Translation System. International Journal of Neural Systems. 28(9). https://doi.org/10.1142/S0129065718500077S28

    Unfolding and Shrinking Neural Machine Translation Ensembles

    Full text link
    Ensembling is a well-known technique in neural machine translation (NMT) to improve system performance. Instead of a single neural net, multiple neural nets with the same topology are trained separately, and the decoder generates predictions by averaging over the individual models. Ensembling often improves the quality of the generated translations drastically. However, it is not suitable for production systems because it is cumbersome and slow. This work aims to reduce the runtime to be on par with a single system without compromising the translation quality. First, we show that the ensemble can be unfolded into a single large neural network which imitates the output of the ensemble system. We show that unfolding can already improve the runtime in practice since more work can be done on the GPU. We proceed by describing a set of techniques to shrink the unfolded network by reducing the dimensionality of layers. On Japanese-English we report that the resulting network has the size and decoding speed of a single NMT network but performs on the level of a 3-ensemble system.Comment: Accepted at EMNLP 201

    Deep Semantic Role Labeling with Self-Attention

    Full text link
    Semantic Role Labeling (SRL) is believed to be a crucial step towards natural language understanding and has been widely studied. Recent years, end-to-end SRL with recurrent neural networks (RNN) has gained increasing attention. However, it remains a major challenge for RNNs to handle structural information and long range dependencies. In this paper, we present a simple and effective architecture for SRL which aims to address these problems. Our model is based on self-attention which can directly capture the relationships between two tokens regardless of their distance. Our single model achieves F1=83.4_1=83.4 on the CoNLL-2005 shared task dataset and F1=82.7_1=82.7 on the CoNLL-2012 shared task dataset, which outperforms the previous state-of-the-art results by 1.81.8 and 1.01.0 F1_1 score respectively. Besides, our model is computationally efficient, and the parsing speed is 50K tokens per second on a single Titan X GPU.Comment: Accepted by AAAI-201
    • …
    corecore