
    Modeling Target-Side Inflection in Neural Machine Translation

    NMT systems have problems with large vocabulary sizes. Byte-pair encoding (BPE) is a popular approach to solving this problem, but while BPE allows the system to generate any target-side word, it does not enable effective generalization over the rich vocabulary of morphologically rich languages with strong inflectional phenomena. We introduce a simple approach to overcome this problem by training a system to produce the lemma of a word and its morphologically rich POS tag, which is then followed by a deterministic generation step. We apply this strategy to English-Czech and English-German translation scenarios, obtaining improvements in both settings. We furthermore show that the improvement is not due only to adding explicit morphological information. Comment: Accepted as a research paper at WMT17. (Updated version with corrected references.)
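A minimal sketch of the deterministic generation step described in this abstract, assuming the translation system has already emitted (lemma, tag) pairs. The lookup table, the tag strings, and the function name `generate_surface_forms` are hypothetical and chosen only for illustration; the paper's actual tagset and generation tool are not given in the abstract.

```python
# Hypothetical toy inflection table: (lemma, morphological tag) -> surface form.
# The tag strings are illustrative only, not the tagset used in the paper.
INFLECTION_TABLE = {
    ("Haus", "NN.Nom.Sg"): "Haus",
    ("Haus", "NN.Dat.Pl"): "Häusern",
    ("gehen", "VVFIN.3.Sg.Pres"): "geht",
}


def generate_surface_forms(pairs):
    """Deterministically map each (lemma, tag) pair to a surface form.

    Pairs missing from the table fall back to the bare lemma, which is
    an assumption of this sketch rather than documented behaviour.
    """
    return [INFLECTION_TABLE.get((lemma, tag), lemma) for lemma, tag in pairs]


if __name__ == "__main__":
    predicted = [("gehen", "VVFIN.3.Sg.Pres"), ("Haus", "NN.Dat.Pl")]
    print(generate_surface_forms(predicted))
```

Because the mapping is a plain lookup, the same (lemma, tag) pair always yields the same word form, which is what makes the final step deterministic.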

    Confidence bands for densities, logarithmic point of view

    Let $f$ be a probability density and $C$ be an interval on which $f$ is bounded away from zero. By establishing the limiting distribution of the uniform error of the kernel estimates $f_n$ of $f$, Bickel and Rosenblatt (1973) provide confidence bands $B_n$ for $f$ on $C$ with asymptotic level $1-\alpha\in]0,1[$. Each of the confidence intervals whose union gives $B_n$ has an asymptotic level equal to one; pointwise moderate deviations principles make it possible to prove that all these intervals share the same logarithmic asymptotic level. Now, as soon as both pointwise and uniform moderate deviations principles for $f_n$ exist, they share the same asymptotics. Taking this observation as a starting point, we present a new approach for the construction of confidence bands for $f$, based on the use of moderate deviations principles. The advantages of this approach are the following: (i) it makes it possible to construct confidence bands that have the same width (or even a smaller width) as the confidence bands provided by Bickel and Rosenblatt (1973), but a better asymptotic level; (ii) any confidence band constructed in this way shares the same logarithmic asymptotic level as all the confidence intervals that make up the band; (iii) it handles all dimensions in the same way; (iv) it resolves the problem of providing confidence bands for $f$ on compact sets on which $f$ vanishes (or on all of $\mathbb{R}^d$) by introducing a truncating operation.
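For readers unfamiliar with the kernel estimates $f_n$ mentioned in this abstract, the following is a minimal one-dimensional sketch of the standard kernel density estimator, $f_n(x) = \frac{1}{nh}\sum_{i=1}^{n} K\left(\frac{x - X_i}{h}\right)$. The Gaussian kernel, the bandwidth argument `h`, and the function names are assumptions of this sketch; the abstract does not specify the kernel or bandwidth, and the confidence-band construction itself is not reproduced here.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_estimate(x, sample, h):
    """Evaluate f_n(x) = (1 / (n h)) * sum_i K((x - X_i) / h)."""
    if h <= 0:
        raise ValueError("bandwidth h must be positive")
    n = len(sample)
    return sum(gaussian_kernel((x - xi) / h) for xi in sample) / (n * h)


if __name__ == "__main__":
    observations = [-1.2, -0.4, 0.1, 0.3, 1.5]  # toy data, not from the paper
    print(kernel_estimate(0.0, observations, h=0.5))
```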

    Kernel dimension reduction in regression

    We present a new methodology for sufficient dimension reduction (SDR). Our methodology derives directly from the formulation of SDR in terms of the conditional independence of the covariate $X$ from the response $Y$, given the projection of $X$ on the central subspace [cf. J. Amer. Statist. Assoc. 86 (1991) 316--342 and Regression Graphics (1998) Wiley]. We show that this conditional independence assertion can be characterized in terms of conditional covariance operators on reproducing kernel Hilbert spaces and we show how this characterization leads to an $M$-estimator for the central subspace. The resulting estimator is shown to be consistent under weak conditions; in particular, we do not have to impose linearity or ellipticity conditions of the kinds that are generally invoked for SDR methods. We also present empirical results showing that the new methodology is competitive in practice. Comment: Published at http://dx.doi.org/10.1214/08-AOS637 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Identifying Metaphor Hierarchies in a Corpus Analysis of Finance Articles

    Using a corpus of over 17,000 financial news reports (involving over 10M words), we perform an analysis of the argument distributions of the UP- and DOWN-verbs used to describe movements of indices, stocks, and shares. Using measures of the overlap in the argument distributions of these verbs and k-means clustering of their distributions, we advance evidence for the proposal that the metaphors referred to by these verbs are organised into hierarchical structures of superordinate and subordinate groups.
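One way to picture an overlap measure between the argument distributions of two verbs is sketched below. The abstract does not say which measure the authors used; the histogram-intersection score here (the sum of the minimum relative frequencies), the function names, and the toy counts are all assumptions made for illustration, and the k-means step is not shown.

```python
from collections import Counter


def relative_frequencies(argument_counts):
    """Turn raw argument counts for one verb into a probability distribution."""
    total = sum(argument_counts.values())
    if total == 0:
        return {}
    return {arg: count / total for arg, count in argument_counts.items()}


def distribution_overlap(counts_a, counts_b):
    """Histogram intersection: 1.0 for identical distributions, 0.0 for disjoint ones."""
    p = relative_frequencies(counts_a)
    q = relative_frequencies(counts_b)
    return sum(min(p[arg], q.get(arg, 0.0)) for arg in p)


if __name__ == "__main__":
    # Toy counts invented for this sketch, not taken from the corpus in the paper.
    rise = Counter({"index": 6, "shares": 3, "stock": 1})
    soar = Counter({"shares": 5, "stock": 4, "profits": 1})
    print(distribution_overlap(rise, soar))
```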