
Open-Retrieval Conversational Question Answering

Author: Chen Y.
Chuklin A.
Clark C.
Das R.
Devlin J.
Dhingra B.
Dunn M.
Garg S.
Huang H.-Y.
Johnson J.
Kwiatkowski T.
Lan Z.-Z.
Nguyen T.
Reddy S.
Shrivastava A.
Thomas P.
Trippas J. R.
Trischler A.
Vaswani A.
Voorhees E. M.
Wang M.
Wang S.
Wu Y.
Yang L.
Yang W.
Yatskar M.
Zhang Y.
Zhu C.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 22/05/2020

Conversational search is one of the ultimate goals of information retrieval. Recent research approaches conversational search through the simplified settings of response ranking and conversational question answering, where an answer is either selected from a given candidate set or extracted from a given passage. These simplifications neglect the fundamental role of retrieval in conversational search. To address this limitation, we introduce an open-retrieval conversational question answering (ORConvQA) setting, where we learn to retrieve evidence from a large collection before extracting answers, as a further step towards building functional conversational search systems. We create a dataset, OR-QuAC, to facilitate research on ORConvQA. We build an end-to-end system for ORConvQA, featuring a retriever, a reranker, and a reader that are all based on Transformers. Our extensive experiments on OR-QuAC demonstrate that a learnable retriever is crucial for ORConvQA. We further show that our system can make a substantial improvement when we enable history modeling in all system components. Moreover, we show that the reranker component contributes to the model performance by providing a regularization effect. Finally, further in-depth analyses are performed to provide new insights into ORConvQA. Comment: Accepted to SIGIR'2
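
The retrieve, rerank, read flow with history modeling described in this abstract can be sketched roughly as follows. This is only a toy illustration, not the authors' system: the paper's components are Transformer-based, whereas every function here (`add_history`, `retrieve`, `rerank`, `read`, `answer`) is a hypothetical stand-in that scores by word overlap, and the three-passage collection is invented for the example.

```python
def tokenize(text):
    # Lowercase and drop the punctuation used in this toy corpus.
    return text.lower().replace("?", " ").replace(".", " ").split()


def overlap(query, text):
    # Number of distinct words shared by the query and the text.
    return len(set(tokenize(query)) & set(tokenize(text)))


def add_history(question, history, window=1):
    # History modeling (assumed form): prepend the last `window` questions.
    if window <= 0:
        return question
    return " ".join(history[-window:] + [question])


def retrieve(query, collection, k=2):
    # Stand-in retriever: keep the k passages with the highest raw overlap.
    return sorted(collection, key=lambda p: overlap(query, p), reverse=True)[:k]


def rerank(query, passages):
    # Stand-in reranker: overlap normalised by the passage vocabulary size.
    return sorted(
        passages,
        key=lambda p: overlap(query, p) / len(set(tokenize(p))),
        reverse=True,
    )


def read(query, passage):
    # Stand-in reader: return the sentence that overlaps most with the query.
    sentences = [s.strip() for s in passage.split(".") if s.strip()]
    return max(sentences, key=lambda s: overlap(query, s))


def answer(question, history, collection):
    query = add_history(question, history)
    top = rerank(query, retrieve(query, collection))
    return read(query, top[0])


# Invented toy collection and conversation, for illustration only.
collection = [
    "The Red Lamps play rock music. The band was formed in a garage.",
    "Garages are used to park cars. Some garages have two doors.",
    "Lamps produce light. Red light has a long wavelength.",
]
history = ["Who are the Red Lamps?"]
result = answer("Where was the band formed?", history, collection)
print(result)
```

The same history-augmented query is passed to all three stages, mirroring the abstract's point that history modeling is enabled in every component.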

arXiv.org e-Print Archive

Crossref

Learning language through pictures

Author: Alishahi Afra
Chrupała Grzegorz
Kádár Ákos
Publication date: 01/01/2015

We propose Imaginet, a model of learning visually grounded representations of language from coupled textual and visual input. The model consists of two Gated Recurrent Unit networks with shared word embeddings, and uses a multi-task objective by receiving a textual description of a scene and trying to concurrently predict its visual representation and the next word in the sentence. Mimicking an important aspect of human language learning, it acquires meaning representations for individual words from descriptions of visual scenes. Moreover, it learns to effectively use sequential structure in semantic interpretation of multi-word phrases. Comment: To appear at ACL 201

arXiv.org e-Print Archive

Crossref

From engineering models to knowledge graph : delivering new insights into models

Author: Berquand Audrey
Riccardi Annalisa
Publication date: 30/09/2020

Essential information on the early stages of a mission design is contained in Engineering Models. Yet these models are often difficult to visualise and query, let alone compare. This study demonstrates how Knowledge Graphs can overcome these data silos, interconnect information, provide a big-picture perspective, and infer new knowledge that would otherwise have remained hidden. Following the migration of CubeSat Engineering Models to a Knowledge Graph, two case studies are explored. The first case study illustrates how graph inference can derive implicit knowledge from existing explicit concepts. In the second case study, a Natural Language Processing layer is adjoined to the Knowledge Graph to enhance the analysis of textual content. The Natural Language Processing layer relies on the document embedding method doc2v

University of Strathclyde Institutional Repository

Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT

Author: Dredze Mark
Wu Shijie
Publication date: 01/01/2019

Pretrained contextual representation models (Peters et al., 2018; Devlin et al., 2018) have pushed forward the state-of-the-art on many NLP tasks. A new release of BERT (Devlin, 2018) includes a model simultaneously pretrained on 104 languages with impressive performance for zero-shot cross-lingual transfer on a natural language inference task. This paper explores the broader cross-lingual potential of mBERT (multilingual BERT) as a zero-shot language transfer model on 5 NLP tasks covering a total of 39 languages from various language families: NLI, document classification, NER, POS tagging, and dependency parsing. We compare mBERT with the best-published methods for zero-shot cross-lingual transfer and find mBERT competitive on each task. Additionally, we investigate the most effective strategy for utilizing mBERT in this manner, determine to what extent mBERT generalizes away from language-specific features, and measure factors that influence cross-lingual transfer. Comment: EMNLP 2019 Camera Read

arXiv.org e-Print Archive

Crossref

Dynamic Parameter Allocation in Parameter Servers

Author: Gemulla Rainer
Markl Volker
Renz-Wieland Alexander
Zeuch Steffen
Publication venue: 'VLDB Endowment'
Publication date: 01/01/2020

To keep up with increasing dataset sizes and model complexity, distributed training has become a necessity for large machine learning tasks. Parameter servers ease the implementation of distributed parameter management, a key concern in distributed training, but can induce severe communication overhead. To reduce communication overhead, distributed machine learning algorithms use techniques to increase parameter access locality (PAL), achieving up to linear speed-ups. We found, however, that existing parameter servers provide only limited support for PAL techniques and therefore prevent efficient training. In this paper, we explore whether and to what extent PAL techniques can be supported, and whether such support is beneficial. We propose to integrate dynamic parameter allocation into parameter servers, describe an efficient implementation of such a parameter server called Lapse, and experimentally compare its performance to existing parameter servers across a number of machine learning tasks. We found that Lapse provides near-linear scaling and can be orders of magnitude faster than existing parameter servers.
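
The idea of dynamic parameter allocation can be illustrated with a small single-process model. This is a sketch under stated assumptions, not Lapse's actual interface: the class `ToyParameterServer`, its methods (`register`, `localize`, `pull`, `push`), and the cost model (each access is either local or remote, and relocation costs one transfer) are all hypothetical and chosen only to show why moving a parameter to the node that uses it increases access locality.

```python
from dataclasses import dataclass, field


@dataclass
class ToyParameterServer:
    """Single-process model of a parameter server with relocatable parameters."""

    num_nodes: int
    owner: dict = field(default_factory=dict)   # key -> node holding the parameter
    values: dict = field(default_factory=dict)  # key -> parameter value
    local_accesses: int = 0
    remote_accesses: int = 0
    relocations: int = 0

    def register(self, key, value=0.0):
        # Static initial placement: round-robin by registration order.
        self.owner[key] = len(self.owner) % self.num_nodes
        self.values[key] = value

    def localize(self, key, node):
        # Dynamic allocation: move ownership of `key` to `node`.
        if self.owner[key] != node:
            self.owner[key] = node
            self.relocations += 1

    def pull(self, key, node):
        self._count(key, node)
        return self.values[key]

    def push(self, key, node, delta):
        self._count(key, node)
        self.values[key] += delta

    def _count(self, key, node):
        # An access is local only if the accessing node owns the parameter.
        if self.owner[key] == node:
            self.local_accesses += 1
        else:
            self.remote_accesses += 1


def run_workload(relocate):
    # Node 1 repeatedly reads and updates "w0", which initially lives on node 0.
    ps = ToyParameterServer(num_nodes=2)
    ps.register("w0")  # placed on node 0
    ps.register("w1")  # placed on node 1
    if relocate:
        ps.localize("w0", node=1)
    for _ in range(10):
        ps.pull("w0", node=1)
        ps.push("w0", node=1, delta=0.1)
    return ps.local_accesses, ps.remote_accesses, ps.relocations


print(run_workload(relocate=False))  # (0, 20, 0)
print(run_workload(relocate=True))   # (20, 0, 1)
```

In this model a single relocation turns twenty remote accesses into local ones. A real system would also have to handle concurrent access and routing during relocation, which the sketch ignores.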

arXiv.org e-Print Archive

MAnnheim DOCument Server