
    End-to-End Multilingual Information Retrieval with Massively Large Synthetic Datasets

    End-to-end neural networks have revolutionized various fields of artificial intelligence. However, progress in Cross-Lingual Information Retrieval (CLIR) has stalled because of the lack of large-scale labeled data. CLIR is a retrieval task in which search queries and candidate documents are in different languages. CLIR can be very useful in some scenarios: for example, a reporter may want to search foreign-language news to obtain different perspectives for her story; an inventor may explore the patents in another country to understand prior art. This dissertation addresses the bottleneck in end-to-end neural CLIR research by synthesizing large-scale CLIR training data and examining techniques that can exploit this data in various CLIR tasks. We publicly release the Large-Scale CLIR dataset and CLIRMatrix, two synthetic CLIR datasets covering a large variety of language directions. We explore and evaluate several neural architectures for end-to-end CLIR modeling. Results show that multilingual information retrieval systems trained on these synthetic CLIR datasets are helpful for many language pairs, especially those in low-resource settings. We further show how these systems can be adapted to real-world scenarios.

    Multi-Target Prediction: A Unifying View on Problems and Methods

    Multi-target prediction (MTP) is concerned with the simultaneous prediction of multiple target variables of diverse types. Due to its enormous application potential, it has developed into an active and rapidly expanding research field that combines several subfields of machine learning, including multivariate regression, multi-label classification, multi-task learning, dyadic prediction, zero-shot learning, network inference, and matrix completion. In this paper, we present a unifying view on MTP problems and methods. First, we formally discuss commonalities and differences between existing MTP problems. To this end, we introduce a general framework that covers the above subfields as special cases. As a second contribution, we provide a structured overview of MTP methods. This is accomplished by identifying a number of key properties, which distinguish such methods and determine their suitability for different types of problems. Finally, we also discuss a few challenges for future research.

    CNM: An Interpretable Complex-valued Network for Matching

    This paper seeks to model human language using the mathematical framework of quantum physics. With the well-designed mathematical formulations in quantum physics, this framework unifies different linguistic units in a single complex-valued vector space, e.g. words as particles in quantum states and sentences as mixed systems. A complex-valued network is built to implement this framework for semantic matching. With well-constrained complex-valued components, the network admits interpretation in terms of explicit physical meanings. The proposed complex-valued network for matching (CNM) achieves performance comparable to strong CNN and RNN baselines on two benchmark question answering (QA) datasets.