knn-seq: Efficient, Extensible kNN-MT Framework
k-nearest-neighbor machine translation (kNN-MT) boosts the translation
quality of a pre-trained neural machine translation (NMT) model by utilizing
translation examples during decoding. Translation examples are stored in a
vector database, called a datastore, which contains one entry for each target
token in the parallel data from which it is built. Because of its size, the
datastore is computationally expensive both to construct and to retrieve
examples from. In this paper, we present an efficient and extensible kNN-MT
framework, knn-seq, for researchers and developers that is carefully designed
to run efficiently, even with a billion-scale datastore. knn-seq is
developed as a plug-in for fairseq and makes it easy to switch between models and kNN indexes.
Experimental results show that our implementation of kNN-MT achieves a gain
comparable to that of the original kNN-MT, and that constructing the
billion-scale datastore took 2.21 hours on the WMT'19 German-to-English
translation task. We publish knn-seq as an MIT-licensed open-source project;
the code is available at https://github.com/naist-nlp/knn-seq . A demo video
is available at https://youtu.be/zTDzEOq80m0
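To make the datastore mechanism concrete, here is a minimal Python sketch of the kNN-MT idea described above, using faiss and numpy; the dimensions, temperature, and interpolation weight are illustrative choices, and this is not the knn-seq API.
```python
# Minimal sketch of the kNN-MT idea, NOT the knn-seq API:
# keys are decoder hidden states, values are the target tokens they predicted.
import numpy as np
import faiss

dim, k, temperature = 4, 2, 10.0

# --- Datastore construction: one (key, value) entry per target token ---
keys = np.random.rand(100, dim).astype("float32")   # decoder states from parallel data
values = np.random.randint(0, 32000, size=100)       # corresponding target token ids

index = faiss.IndexFlatL2(dim)   # a billion-scale datastore would need a scalable index
index.add(keys)

# --- Retrieval at decoding time ---
def knn_probs(query_state, vocab_size=32000):
    """Turn the k nearest datastore entries into a distribution over tokens."""
    dists, ids = index.search(query_state.reshape(1, -1).astype("float32"), k)
    weights = np.exp(-dists[0] / temperature)
    weights /= weights.sum()
    probs = np.zeros(vocab_size)
    for w, i in zip(weights, ids[0]):
        probs[values[i]] += w
    return probs

# Final distribution interpolates the NMT model and the kNN distribution.
lam = 0.5
p_model = np.full(32000, 1.0 / 32000)   # placeholder NMT probabilities
p_final = lam * knn_probs(keys[0]) + (1 - lam) * p_model
```
In knn-seq the keys come from fairseq decoder states, and the flat index above would be swapped for one of the framework's switchable kNN indexes when scaling to billion-entry datastores.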
Small Width, Low Distortions: Quantized Random Embeddings of Low-complexity Sets
Under which conditions and with which distortions can we preserve the
pairwise-distances of low-complexity vectors, e.g., for structured sets such as
the set of sparse vectors or the one of low-rank matrices, when these are
mapped in a finite set of vectors? This work addresses this general question
through the specific use of a quantized and dithered random linear mapping
which combines, in the following order, a sub-Gaussian random projection of
vectors in $\mathbb{R}^N$ into $\mathbb{R}^M$, a random translation, or
"dither", of the projected vectors, and a uniform scalar quantizer of
resolution $\delta > 0$ applied componentwise. Thanks to this quantized
mapping we are first able to show that, with high probability, an embedding of
a bounded set $\mathcal{K} \subset \mathbb{R}^N$ into $\delta\mathbb{Z}^M$ can
be achieved when distances in the quantized and in the original domains are
measured with the $\ell_1$- and $\ell_2$-norm, respectively, provided the
number of quantized observations $M$ is large compared to the square of the
"Gaussian mean width" of $\mathcal{K}$. In this case, we show that the
embedding is actually "quasi-isometric" and only suffers from multiplicative
and additive distortions whose magnitudes decrease as a negative power of $M$
for general sets, and at a faster rate for structured sets, as $M$ increases.
Second, when one is only interested in characterizing the maximal distance
separating two elements of $\mathcal{K}$ mapped to the same quantized vector,
i.e., the "consistency width" of the mapping, we show that for a similar
number of measurements and with high probability this width decays as $M$
increases, again at a faster rate for structured sets than for general ones.
Finally, as an important aspect of our work, we also establish how the
non-Gaussianity of the mapping impacts the class of vectors that can be
embedded or whose consistency width provably decays as $M$ increases.
Comment: Keywords: quantization, restricted isometry property, compressed
sensing, dimensionality reduction. 31 pages, 1 figure
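As a rough illustration of the mapping studied here, the following NumPy sketch applies a Gaussian (hence sub-Gaussian) projection, a uniform dither, and a uniform scalar quantizer, then compares distances in the two domains; all dimensions and the resolution are arbitrary demonstration values, not the paper's settings.
```python
# Illustrative sketch of the quantized, dithered random mapping:
# x -> Q_delta(Phi x + xi), with Phi a Gaussian projection, xi a uniform
# dither, and Q_delta a uniform scalar quantizer of resolution delta.
import numpy as np

rng = np.random.default_rng(0)
N, M, delta = 128, 512, 0.5

Phi = rng.standard_normal((M, N))      # sub-Gaussian random projection
xi = rng.uniform(0, delta, size=M)     # random dither

def quantized_map(x):
    """Dithered uniform quantization of the random projection of x."""
    return delta * np.floor((Phi @ x + xi) / delta)

# Compare the l1 distance between quantized images with the l2 distance
# between the original vectors, as in the quasi-isometric embedding statement.
x, y = rng.standard_normal(N), rng.standard_normal(N)
d_quantized = np.abs(quantized_map(x) - quantized_map(y)).sum() / M
d_original = np.linalg.norm(x - y)
print(d_quantized, d_original)
```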
Off the Beaten Path: Let's Replace Term-Based Retrieval with k-NN Search
Retrieval pipelines commonly rely on a term-based search to obtain candidate
records, which are subsequently re-ranked. Some candidates are missed by this
approach, e.g., due to a vocabulary mismatch. We address this issue by
replacing the term-based search with a generic k-NN retrieval algorithm, where
a similarity function can take into account subtle term associations. While an
exact brute-force k-NN search using this similarity function is slow, we
demonstrate that an approximate algorithm can be nearly two orders of magnitude
faster at the expense of only a small loss in accuracy. A retrieval pipeline
using an approximate k-NN search can be more effective and efficient than the
term-based pipeline. This opens up new possibilities for designing effective
retrieval pipelines. Our software (including data-generating code) and
derivative data based on the Stack Overflow collection are available online
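The candidate-generation step this abstract argues for might look like the following hedged sketch, using NMSLIB's HNSW index over placeholder dense document vectors; cosine similarity stands in for the paper's similarity function, and all parameters are illustrative.
```python
# Sketch of swapping term-based candidate generation for approximate k-NN search.
import numpy as np
import nmslib

doc_vectors = np.random.rand(10000, 256).astype(np.float32)  # placeholder document embeddings

index = nmslib.init(method="hnsw", space="cosinesimil")
index.addDataPointBatch(doc_vectors)
index.createIndex({"M": 16, "efConstruction": 200})
index.setQueryTimeParams({"efSearch": 100})

def retrieve_candidates(query_vector, k=100):
    """Return (doc ids, distances) for the k approximate nearest documents."""
    ids, dists = index.knnQuery(query_vector, k=k)
    return ids, dists

# The k-NN candidates would then be passed to the re-ranker, exactly where
# the term-based candidate list used to go.
ids, dists = retrieve_candidates(doc_vectors[0], k=10)
```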
Natural Language Processing with Small Feed-Forward Networks
We show that small and shallow feed-forward neural networks can achieve near
state-of-the-art results on a range of unstructured and structured language
processing tasks while being considerably cheaper in memory and computational
requirements than deep recurrent models. Motivated by resource-constrained
environments like mobile phones, we showcase simple techniques for obtaining
such small neural network models, and investigate different tradeoffs when
deciding how to allocate a small memory budget.
Comment: EMNLP 2017 short paper
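For illustration, here is a minimal PyTorch sketch of the kind of small feed-forward model the abstract describes, with a tiny embedding table feeding a single hidden layer; the vocabulary size, dimensions, and tagging head are assumptions, not the paper's configuration.
```python
# Hedged sketch of a small feed-forward tagger in the spirit of the abstract:
# a tiny embedding table averaged over sparse features, then one hidden layer.
import torch
import torch.nn as nn

class SmallFeedForwardTagger(nn.Module):
    def __init__(self, vocab_size=5000, embed_dim=16, hidden_dim=64, num_tags=12):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, embed_dim, mode="mean")
        self.hidden = nn.Linear(embed_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, num_tags)

    def forward(self, feature_ids, offsets):
        # Average sparse feature embeddings, then apply one small hidden layer.
        h = torch.relu(self.hidden(self.embed(feature_ids, offsets)))
        return self.out(h)

model = SmallFeedForwardTagger()
n_params = sum(p.numel() for p in model.parameters())
print(f"parameters: {n_params}")  # small enough for memory-constrained devices
```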