End-to-End Multilingual Information Retrieval with Massively Large Synthetic Datasets
End-to-end neural networks have revolutionized various fields of artificial intelligence. However, advancements in the field of Cross-Lingual Information Retrieval (CLIR) have stalled due to the lack of large-scale labeled data. CLIR is a retrieval task in which search queries and candidate documents are in different languages. CLIR can be very useful in some scenarios: for example, a reporter may want to search foreign-language news to obtain different perspectives for her story; an inventor may explore the patents in another country to understand prior art.
This dissertation addresses the bottleneck in end-to-end neural CLIR research by synthesizing large-scale CLIR training data and examining techniques that can exploit this data in various CLIR tasks. We publicly release the Large-Scale CLIR dataset and CLIRMatrix, two synthetic CLIR datasets covering a large variety of language directions. We explore and evaluate several neural architectures for end-to-end CLIR modeling. Results show that multilingual information retrieval systems trained on these synthetic CLIR datasets are helpful for many language pairs, especially those in low-resource settings. We further show how these systems can be adapted to real-world scenarios.
Multi-Target Prediction: A Unifying View on Problems and Methods
Multi-target prediction (MTP) is concerned with the simultaneous prediction
of multiple target variables of diverse types. Due to its enormous application
potential, it has developed into an active and rapidly expanding research field
that combines several subfields of machine learning, including multivariate
regression, multi-label classification, multi-task learning, dyadic prediction,
zero-shot learning, network inference, and matrix completion. In this paper, we
present a unifying view on MTP problems and methods. First, we formally discuss
commonalities and differences between existing MTP problems. To this end, we
introduce a general framework that covers the above subfields as special cases.
As a second contribution, we provide a structured overview of MTP methods. This
is accomplished by identifying a number of key properties, which distinguish
such methods and determine their suitability for different types of problems.
Finally, we also discuss a few challenges for future research.
CNM: An Interpretable Complex-valued Network for Matching
This paper seeks to model human language using the mathematical framework of
quantum physics. With the well-designed mathematical formulations in quantum
physics, this framework unifies different linguistic units in a single
complex-valued vector space, e.g. words as particles in quantum states and
sentences as mixed systems. A complex-valued network is built to implement this
framework for semantic matching. With well-constrained complex-valued
components, the network admits interpretations in terms of explicit physical meanings.
The proposed complex-valued network for matching (CNM) achieves performance
comparable to strong CNN and RNN baselines on two benchmark question
answering (QA) datasets.