
    Non-linear Learning for Statistical Machine Translation

    Modern statistical machine translation (SMT) systems usually use a linear combination of features to model the quality of each translation hypothesis. The linear combination assumes that all the features are in a linear relationship and constrains each feature to interact with the other features in a linear manner, which may limit the expressive power of the model and lead to an under-fit model on the current data. In this paper, we propose non-linear modeling of the quality of translation hypotheses based on neural networks, which allows more complex interactions between features. A learning framework is presented for training the non-linear models. We also discuss possible heuristics for designing the network structure which may improve the non-linear learning performance. Experimental results show that, with the basic features of a hierarchical phrase-based machine translation system, our method produces translations that are better than those of a linear model. Comment: submitted to a conference
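    The core contrast in this abstract, between a linear feature combination and a neural scorer that lets features interact, can be sketched as follows. This is a minimal illustration, not the paper's actual model; all shapes and the single tanh hidden layer are assumptions.

```python
import numpy as np

def linear_score(features, weights):
    # Conventional SMT scoring: a linear combination of feature values.
    return float(np.dot(weights, features))

def nonlinear_score(features, W1, b1, w2, b2):
    # One hidden layer lets features interact non-linearly before scoring.
    hidden = np.tanh(W1 @ features + b1)
    return float(w2 @ hidden + b2)

rng = np.random.default_rng(0)
feats = rng.normal(size=5)          # e.g. LM score, phrase probabilities, word penalty
W1, b1 = rng.normal(size=(8, 5)), np.zeros(8)
w2, b2 = rng.normal(size=8), 0.0
score = nonlinear_score(feats, W1, b1, w2, b2)
```

    Either score can then rank the hypotheses in a decoder's n-best list; the non-linear variant simply has more parameters to fit during tuning.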

    Channel characterization for 1D molecular communication with two absorbing receivers

    This letter develops a one-dimensional (1D) diffusion-based molecular communication system to analyze channel responses between a single transmitter (TX) and two fully-absorbing receivers (RXs). Incorporating molecular degradation in the environment, rigorous analytical formulas for i) the fraction of molecules absorbed, ii) the corresponding hitting rate, and iii) the asymptotic fraction of absorbed molecules as time approaches infinity at each RX are derived when an impulse of molecules is released at the TX. The derived analytical expressions are validated by particle-based simulations. Simulations also identify the distance ranges over which the two RXs do not impact each other's molecular absorption, and demonstrate that the mutual influence of two active RXs decreases as the degradation rate increases.
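    The particle-based validation described here can be sketched as a Monte Carlo random walk: molecules released impulsively at the TX diffuse, may degrade, and are removed when they reach either absorbing RX. The geometry (TX at the origin, one RX on each side) and every numeric parameter below are illustrative assumptions, not values from the letter.

```python
import numpy as np

def simulate_1d_two_rx(n_mol=10000, d1=-5.0, d2=5.0, D=1.0,
                       k_deg=0.01, dt=0.01, steps=5000, seed=1):
    """Particle-based sketch: n_mol molecules start at x = 0, take Brownian
    steps of std sqrt(2*D*dt), degrade with first-order rate k_deg, and are
    absorbed on crossing the RX at d1 (left) or d2 (right)."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n_mol)
    alive = np.ones(n_mol, dtype=bool)
    hits1 = hits2 = 0
    sigma = np.sqrt(2 * D * dt)
    for _ in range(steps):
        idx = np.flatnonzero(alive)
        if idx.size == 0:
            break
        # First-order degradation: survive this step with prob exp(-k_deg*dt).
        degraded = rng.random(idx.size) > np.exp(-k_deg * dt)
        alive[idx[degraded]] = False
        idx = idx[~degraded]
        x[idx] += rng.normal(0.0, sigma, idx.size)
        left = x[idx] <= d1
        right = x[idx] >= d2
        hits1 += left.sum()
        hits2 += right.sum()
        alive[idx[left | right]] = False
    return hits1 / n_mol, hits2 / n_mol

f1, f2 = simulate_1d_two_rx()   # fractions absorbed at each RX
```

    With the symmetric distances above, the two absorbed fractions come out nearly equal, and raising `k_deg` lowers both, consistent with the degradation effect the letter reports.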

    Parameter Estimation in a Noisy 1D Environment via Two Absorbing Receivers

    This paper investigates the estimation of different parameters, e.g., propagation distance and flow velocity, by utilizing two fully-absorbing receivers (RXs) in a one-dimensional (1D) environment. The time-varying number of absorbed molecules at each RX and the number of absorbed molecules in a time interval as time approaches infinity are derived. Noisy molecules in this environment, released by sources other than the transmitter, are also considered. A novel estimation method, namely difference estimation (DE), is proposed to eliminate the effect of noise by using the difference of the received signals at the two RXs. For DE, the Cramer-Rao lower bound (CRLB) on the variance of estimation is derived. Independent maximum likelihood estimation at each RX is also considered as a benchmark to show the performance advantage of DE. The derived analytical results are verified by particle-based simulation. Furthermore, numerical results show that DE attains the CRLB and is less sensitive to changes in the noise than independent estimation at each RX. Comment: 7 pages, 5 figures, accepted by Globecom 202
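    The key idea of difference estimation, that noise hitting both RXs equally cancels in the difference of their signals, can be sketched with a toy grid search. The exponential signal models, the assumption of perfectly common noise, and all numbers below are illustrative; the paper's actual channel model and CRLB analysis are more involved.

```python
import numpy as np

def difference_estimate(obs1, obs2, candidates, model1, model2):
    """DE sketch: match the observed difference of RX counts against the
    modeled difference for each candidate parameter value (grid search)."""
    diff = obs1 - obs2
    errs = [(diff - (model1(c) - model2(c))) ** 2 for c in candidates]
    return candidates[int(np.argmin(errs))]

# Toy signal models: absorbed counts decay with each TX->RX distance,
# for an RX pair placed 10 units apart with the TX at distance d from RX1.
model1 = lambda d: 1000.0 * np.exp(-0.3 * d)
model2 = lambda d: 1000.0 * np.exp(-0.3 * (10.0 - d))
true_d = 4.0
noise = 250.0                    # noise count common to both RXs
obs1 = model1(true_d) + noise
obs2 = model2(true_d) + noise
cands = np.linspace(0.5, 9.5, 91)
est = difference_estimate(obs1, obs2, cands, model1, model2)
```

    Because the common noise term drops out of `obs1 - obs2`, the estimate is unchanged if the noise level shifts, which is the robustness property the paper demonstrates against independent per-RX estimation.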

    Neural Machine Translation with Word Predictions

    In the encoder-decoder architecture for neural machine translation (NMT), the hidden states of the recurrent structures in the encoder and decoder carry the crucial information about the sentence. These vectors are generated by parameters which are updated by back-propagation of translation errors through time. We argue that propagating errors through the end-to-end recurrent structures is not a direct way to control the hidden vectors. In this paper, we propose to use word predictions as a mechanism for direct supervision. More specifically, we require these vectors to be able to predict the vocabulary of the target sentence. Our simple mechanism ensures better representations in the encoder and decoder without using any extra data or annotation. It is also helpful in reducing the target-side vocabulary and improving decoding efficiency. Experiments on Chinese-English and German-English machine translation tasks show BLEU improvements of 4.53 and 1.3, respectively. Comment: Accepted at EMNLP 201
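    The word-prediction supervision described here amounts to an auxiliary loss: a hidden state is pushed, through an extra softmax layer, to assign high probability to every word that appears in the target sentence. The sketch below is a minimal numpy version; the extra layer `W`, `b`, the toy vocabulary size, and the bag-of-words averaging are assumptions, not the paper's exact formulation.

```python
import numpy as np

def word_prediction_loss(hidden, target_word_ids, W, b):
    """Mean negative log-likelihood of the target sentence's words under
    a softmax over the vocabulary computed from a single hidden state."""
    logits = W @ hidden + b
    logits -= logits.max()                       # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return float(-np.log(probs[target_word_ids]).mean())

rng = np.random.default_rng(0)
hidden = rng.normal(size=16)                     # e.g. final encoder state
W, b = rng.normal(size=(100, 16)), np.zeros(100) # 100-word toy vocabulary
aux = word_prediction_loss(hidden, [3, 17, 42], W, b)
```

    During training this term would be added to the usual translation cross-entropy, so gradients reach the hidden vectors directly rather than only through the recurrent chain.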

    Top-Rank Enhanced Listwise Optimization for Statistical Machine Translation

    Pairwise ranking methods are the basis of many widely used discriminative training approaches for structure prediction problems in natural language processing (NLP). Decomposing the problem of ranking hypotheses into pairwise comparisons enables simple and efficient solutions. However, neglecting the global ordering of the hypothesis list may hinder learning. We propose a listwise learning framework for structure prediction problems such as machine translation. Our framework directly models the entire translation list's ordering to learn parameters which may better fit the given listwise samples. Furthermore, we propose top-rank enhanced loss functions, which are more sensitive to ranking errors at higher positions. Experiments on a large-scale Chinese-English translation task show that both our listwise learning framework and top-rank enhanced listwise losses lead to significant improvements in translation quality. Comment: Accepted to CONLL 201
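    The contrast with pairwise training can be illustrated with a ListNet-style cross-entropy over the whole hypothesis list, where the target distribution is truncated to the top-k quality hypotheses so that errors at high ranks cost more. This is an illustrative variant under those assumptions, not the paper's exact loss; `qualities` stands in for a per-hypothesis metric such as sentence-level BLEU.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def top_rank_listwise_loss(scores, qualities, top_k=2):
    """Cross-entropy between the model's score distribution over the list
    and a quality distribution whose mass is kept only on the top-k
    hypotheses, emphasizing ranking errors at the top positions."""
    p_model = softmax(np.asarray(scores, dtype=float))
    q = np.asarray(qualities, dtype=float)
    cutoff = np.sort(q)[-top_k]
    target = np.where(q >= cutoff, q, 0.0)
    target = target / target.sum()
    return float(-(target * np.log(p_model)).sum())

loss = top_rank_listwise_loss([2.0, 1.0, 0.1], [0.9, 0.5, 0.2])
```

    A score assignment that ranks the list in the same order as the qualities yields a lower loss than one that inverts it, which is exactly the signal a listwise tuner needs.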