Empirical Gaussian priors for cross-lingual transfer learning
Sequence model learning algorithms typically maximize log-likelihood minus
the norm of the model (or minimize Hamming loss + norm). In cross-lingual
part-of-speech (POS) tagging, our target language training data consists of
sequences of sentences with word-by-word labels projected from translations in
languages for which we have labeled data, via word alignments. Our training
data is therefore very noisy, and if Rademacher complexity is high, learning
algorithms are prone to overfit. Norm-based regularization assumes a zero-mean
prior of constant width. We instead propose to use the source language
models to estimate the parameters of a Gaussian prior for learning new POS
taggers. This leads to significantly better performance in multi-source
transfer set-ups. We also present a drop-out version that injects (empirical)
Gaussian noise during online learning. Finally, we note that using empirical
Gaussian priors leads to much lower Rademacher complexity, and is superior to
optimally weighted model interpolation.
Comment: Presented at NIPS 2015 Workshop on Transfer and Multi-Task Learnin
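The contrast between a norm-based regularizer and an empirical Gaussian prior can be illustrated with a minimal sketch. It assumes each source-language model is a weight vector over a shared feature space, and that the prior is a diagonal Gaussian whose mean and variance are estimated per feature across the source models. The function names and the `min_var` floor are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def empirical_gaussian_prior(source_weights, min_var=1e-6):
    """Estimate a per-feature Gaussian prior from source-model weights.

    source_weights: array-like of shape (n_sources, n_features).
    min_var is a hypothetical floor that keeps the variance strictly positive.
    """
    W = np.asarray(source_weights, dtype=float)
    mu = W.mean(axis=0)            # per-feature prior mean
    var = W.var(axis=0) + min_var  # per-feature prior variance
    return mu, var


def prior_penalty(w, mu, var):
    """Negative log-density of w under the prior, up to an additive constant."""
    w = np.asarray(w, dtype=float)
    return 0.5 * float(np.sum((w - mu) ** 2 / var))


def prior_penalty_grad(w, mu, var):
    """Gradient of prior_penalty with respect to w."""
    w = np.asarray(w, dtype=float)
    return (w - mu) / var


if __name__ == "__main__":
    # Three toy source models over four shared features.
    sources = [[0.9, -0.2, 0.0, 1.5],
               [1.1, -0.1, 0.0, 0.5],
               [1.0, -0.3, 0.0, 1.0]]
    mu, var = empirical_gaussian_prior(sources)
    w = np.zeros(4)
    print("penalty at w = 0:", prior_penalty(w, mu, var))
    print("gradient at w = 0:", prior_penalty_grad(w, mu, var))
```

Setting `mu` to zero and `var` to a single constant recovers an ordinary squared-norm penalty, which is the zero-mean, constant-width case the abstract argues against. In training, the penalty would be added to the loss on the projected target-language data.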
Baselines and test data for cross-lingual inference
Recent years have seen a revival of interest in textual entailment,
sparked by i) the emergence of powerful deep neural network learners for
natural language processing and ii) the timely development of large-scale
evaluation datasets such as SNLI. Recast as natural language inference, the
problem now amounts to detecting the relation between pairs of statements: they
either contradict or entail one another, or they are mutually neutral. Current
research in natural language inference is effectively exclusive to English. In
this paper, we propose to advance the research in SNLI-style natural language
inference toward multilingual evaluation. To that end, we provide test data for
four major languages: Arabic, French, Spanish, and Russian. We experiment with
a set of baselines. Our systems are based on cross-lingual word embeddings and
machine translation. While our best system scores an average accuracy of just
over 75%, we focus largely on enabling further research in multilingual
inference.
Comment: To appear at LREC 201
Bridging the Gap Between Indexing and Retrieval for Differentiable Search Index with Query Generation
The Differentiable Search Index (DSI) is an emerging paradigm for information
retrieval. Unlike traditional retrieval architectures where index and retrieval
are two different and separate components, DSI uses a single transformer model
to perform both indexing and retrieval.
In this paper, we identify and tackle an important issue of current DSI
models: the data distribution mismatch that occurs between the DSI indexing and
retrieval processes. Specifically, we argue that, at indexing, current DSI
methods learn to build connections between the text of long documents and the
identifier of the documents, but then retrieval of document identifiers is
based on queries that are commonly much shorter than the indexed documents.
This problem is further exacerbated when using DSI for cross-lingual retrieval,
where document text and query text are in different languages.
To address this fundamental problem of current DSI models, we propose a
simple yet effective indexing framework for DSI, called DSI-QG. When indexing,
DSI-QG represents documents with a number of potentially relevant queries
generated by a query generation model and re-ranked and filtered by a
cross-encoder ranker. The presence of these queries at indexing allows the DSI
models to connect a document identifier to a set of queries, hence mitigating
data distribution mismatches present between the indexing and the retrieval
phases. Empirical results on popular mono-lingual and cross-lingual passage
retrieval datasets show that DSI-QG significantly outperforms the original DSI
model.
Comment: 11 page
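The indexing step described in this abstract can be sketched as building (query, document identifier) training pairs in place of (document text, document identifier) pairs. The sketch below is an illustration under assumptions: `generate_queries` and `score` stand in for the query generation model and the cross-encoder ranker, the toy implementations are placeholders, and keeping the top `num_kept` queries is one possible filtering rule, not necessarily the one the authors use.

```python
from typing import Callable, Dict, List, Tuple


def build_query_docid_pairs(
    documents: Dict[str, str],
    generate_queries: Callable[[str, int], List[str]],
    score: Callable[[str, str], float],
    num_generated: int = 10,
    num_kept: int = 3,
) -> List[Tuple[str, str]]:
    """Represent each document by its highest-scoring generated queries."""
    pairs: List[Tuple[str, str]] = []
    for doc_id, text in documents.items():
        # Hypothetical query generator: returns up to num_generated queries.
        candidates = generate_queries(text, num_generated)
        # Hypothetical ranker: a higher score means a more relevant query.
        ranked = sorted(candidates, key=lambda q: score(q, text), reverse=True)
        # Assumed filter: keep only the top num_kept queries.
        for query in ranked[:num_kept]:
            pairs.append((query, doc_id))
    return pairs


def toy_generate_queries(text: str, n: int) -> List[str]:
    """Placeholder generator: consecutive word pairs from the document."""
    words = text.split()
    count = min(n, max(len(words) - 1, 0))
    return [" ".join(words[i:i + 2]) for i in range(count)]


def toy_score(query: str, text: str) -> float:
    """Placeholder ranker: prefers longer queries."""
    return float(len(query))


if __name__ == "__main__":
    docs = {"d1": "differentiable search index with query generation"}
    for query, doc_id in build_query_docid_pairs(
        docs, toy_generate_queries, toy_score, num_generated=5, num_kept=2
    ):
        print(doc_id, "<-", query)
```

A DSI model trained on these pairs sees short query-like inputs at indexing time, which is the mismatch the abstract targets. For the cross-lingual case, the generator would produce queries in the query language.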