2,117 research outputs found

    knn-seq: Efficient, Extensible kNN-MT Framework

    Full text link
    k-nearest-neighbor machine translation (kNN-MT) boosts the translation quality of a pre-trained neural machine translation (NMT) model by utilizing translation examples during decoding. Translation examples are stored in a vector database, called a datastore, which contains one entry for each target token from the parallel data it is made from. Due to its size, it is computationally expensive both to construct and to retrieve examples from the datastore. In this paper, we present an efficient and extensible kNN-MT framework, knn-seq, for researchers and developers that is carefully designed to run efficiently, even with a billion-scale large datastore. knn-seq is developed as a plug-in on fairseq and easy to switch models and kNN indexes. Experimental results show that our implemented kNN-MT achieves a comparable gain to the original kNN-MT, and the billion-scale datastore construction took 2.21 hours in the WMT'19 German-to-English translation task. We publish our knn-seq as an MIT-licensed open-source project and the code is available on https://github.com/naist-nlp/knn-seq . The demo video is available on https://youtu.be/zTDzEOq80m0

    Small Width, Low Distortions: Quantized Random Embeddings of Low-complexity Sets

    Full text link
    Under which conditions and with which distortions can we preserve the pairwise-distances of low-complexity vectors, e.g., for structured sets such as the set of sparse vectors or the one of low-rank matrices, when these are mapped in a finite set of vectors? This work addresses this general question through the specific use of a quantized and dithered random linear mapping which combines, in the following order, a sub-Gaussian random projection in RM\mathbb R^M of vectors in RN\mathbb R^N, a random translation, or "dither", of the projected vectors and a uniform scalar quantizer of resolution δ>0\delta>0 applied componentwise. Thanks to this quantized mapping we are first able to show that, with high probability, an embedding of a bounded set K⊂RN\mathcal K \subset \mathbb R^N in δZM\delta \mathbb Z^M can be achieved when distances in the quantized and in the original domains are measured with the ℓ1\ell_1- and ℓ2\ell_2-norm, respectively, and provided the number of quantized observations MM is large before the square of the "Gaussian mean width" of K\mathcal K. In this case, we show that the embedding is actually "quasi-isometric" and only suffers of both multiplicative and additive distortions whose magnitudes decrease as M−1/5M^{-1/5} for general sets, and as M−1/2M^{-1/2} for structured set, when MM increases. Second, when one is only interested in characterizing the maximal distance separating two elements of K\mathcal K mapped to the same quantized vector, i.e., the "consistency width" of the mapping, we show that for a similar number of measurements and with high probability this width decays as M−1/4M^{-1/4} for general sets and as 1/M1/M for structured ones when MM increases. Finally, as an important aspect of our work, we also establish how the non-Gaussianity of the mapping impacts the class of vectors that can be embedded or whose consistency width provably decays when MM increases.Comment: Keywords: quantization, restricted isometry property, compressed sensing, dimensionality reduction. 31 pages, 1 figur

    Off the Beaten Path: Let's Replace Term-Based Retrieval with k-NN Search

    Full text link
    Retrieval pipelines commonly rely on a term-based search to obtain candidate records, which are subsequently re-ranked. Some candidates are missed by this approach, e.g., due to a vocabulary mismatch. We address this issue by replacing the term-based search with a generic k-NN retrieval algorithm, where a similarity function can take into account subtle term associations. While an exact brute-force k-NN search using this similarity function is slow, we demonstrate that an approximate algorithm can be nearly two orders of magnitude faster at the expense of only a small loss in accuracy. A retrieval pipeline using an approximate k-NN search can be more effective and efficient than the term-based pipeline. This opens up new possibilities for designing effective retrieval pipelines. Our software (including data-generating code) and derivative data based on the Stack Overflow collection is available online

    Natural Language Processing with Small Feed-Forward Networks

    Full text link
    We show that small and shallow feed-forward neural networks can achieve near state-of-the-art results on a range of unstructured and structured language processing tasks while being considerably cheaper in memory and computational requirements than deep recurrent models. Motivated by resource-constrained environments like mobile phones, we showcase simple techniques for obtaining such small neural network models, and investigate different tradeoffs when deciding how to allocate a small memory budget.Comment: EMNLP 2017 short pape
    • …
    corecore