Grasping the Finer Point: A Supervised Similarity Network for Metaphor Detection
The ubiquity of metaphor in our everyday communication makes it an important problem for natural language understanding.
Yet, the majority of metaphor processing systems to date rely on hand-engineered features, and there is still no consensus in the field as to which features are optimal for this task. In this paper, we present the first deep learning architecture designed to capture metaphorical composition. Our results demonstrate that it outperforms existing approaches on the metaphor identification task
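The abstract does not describe the network itself, but a minimal sketch can illustrate the general intuition behind similarity-based metaphor detection: a word used metaphorically tends to be semantically distant from its literal context. All vectors and the threshold below are toy values, not the paper's model.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Toy embeddings: "devour" is close to edible objects, far from "book".
vectors = {
    "devour": [0.9, 0.1, 0.2],
    "sandwich": [0.8, 0.2, 0.1],   # literal object of "devour"
    "book": [0.1, 0.9, 0.3],       # metaphorical object ("devour a book")
}

def metaphor_score(verb, obj, threshold=0.5):
    """Flag a verb-object pair as metaphorical if similarity is low."""
    sim = cosine(vectors[verb], vectors[obj])
    return sim, sim < threshold

print(metaphor_score("devour", "sandwich"))  # high similarity -> literal
print(metaphor_score("devour", "book"))      # low similarity  -> metaphorical
```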
Evaluation by association: A systematic study of quantitative word association evaluation
Recent work on evaluating representation learning architectures in NLP has established a need for evaluation protocols based on subconscious cognitive measures rather than manually tailored intrinsic similarity and relatedness tasks. In this work, we propose a novel evaluation framework that enables large-scale evaluation of such architectures in the free word association (WA) task, which is firmly grounded in cognitive theories of human semantic representation. This evaluation is facilitated by the existence of large manually constructed repositories of word association data. In this paper, we (1) present a detailed analysis of the new quantitative WA evaluation protocol, (2) suggest new evaluation metrics for the WA task inspired by its direct analogy with information retrieval problems, (3) evaluate various state-of-the-art representation models on this task, and (4) discuss the relationship between WA and prior evaluations of semantic representation with well-known similarity and relatedness evaluation sets. We have made the WA evaluation toolkit publicly available
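One way to read the "direct analogy with information retrieval" mentioned above: treat the human associates of a cue word as relevant documents and the model's nearest neighbours as a ranked retrieval list, then apply IR metrics such as precision@k. A hedged sketch with invented data (the paper's actual metrics and repositories may differ):

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked neighbours that are human associates."""
    return sum(1 for w in ranked[:k] if w in relevant) / k

# Illustrative data, not from the paper's association repositories.
human_associates = {"cat": {"dog", "mouse", "fur", "pet"}}
model_neighbours = {"cat": ["dog", "kitten", "mouse", "car", "pet"]}

cue = "cat"
p3 = precision_at_k(model_neighbours[cue], human_associates[cue], 3)
print(f"P@3 for '{cue}': {p3:.2f}")  # dog and mouse hit, kitten misses
```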
Vision and Feature Norms: Improving automatic feature norm learning through cross-modal maps
Property norms have the potential to aid a wide range of semantic tasks, provided that they can be obtained for large numbers of concepts. Recent work has focused on text as the main source of information for automatic property extraction. In this paper we examine property norm prediction from visual, rather than textual, data, using cross-modal maps learnt between property norm and visual spaces. We also investigate the importance of having a complete feature norm dataset, for both training and testing. Finally, we evaluate how these datasets and cross-modal maps can be used in an image retrieval task. LB is supported by an EPSRC Doctoral Training Grant. DK is supported by EPSRC grant EP/I037512/1. SC is supported by ERC Starting Grant DisCoTex (306920) and EPSRC grant EP/I037512/1
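A common way to learn such a cross-modal map is a linear transformation fitted by least squares from the visual space into the property-norm space; the paper's exact model may differ, so the following is only an illustrative sketch on synthetic data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 50 concepts, 8-d visual features, 5-d property-norm vectors.
V = rng.normal(size=(50, 8))                       # visual space
W_true = rng.normal(size=(8, 5))                   # hidden ground-truth map
P = V @ W_true + 0.01 * rng.normal(size=(50, 5))   # noisy property norms

# Fit the cross-modal map W minimising ||V @ W - P||^2.
W, *_ = np.linalg.lstsq(V, P, rcond=None)

# Predict a property-norm vector for an unseen visual representation.
v_new = rng.normal(size=(1, 8))
p_pred = v_new @ W
print("reconstruction error:", np.linalg.norm(V @ W - P))
```

With a complete norm dataset (every concept paired with both modalities, as the abstract emphasises), the map can be both trained and evaluated without missing rows.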
Comparing Data Sources and Architectures for Deep Visual Representation Learning in Semantics
Multi-modal distributional models learn grounded representations for improved performance in semantics. Deep visual representations, learned using convolutional neural networks, have been shown to achieve particularly high performance. In this study, we systematically compare deep visual representation learning techniques, experimenting with three well-known network architectures. In addition, we explore the various data sources that can be used for retrieving relevant images, showing that images from search engines perform as well as, or better than, those from manually crafted resources such as ImageNet. Furthermore, we explore the optimal number of images and the multi-lingual applicability of multi-modal semantics. We hope that these findings can serve as a guide for future research in the field. Anita Verõ is supported by the Nuance Foundation Grant: Learning Type-Driven Distributed Representations of Language. Stephen Clark is supported by the ERC Starting Grant: DisCoTex (306920)
Multi-Modal Representations for Improved Bilingual Lexicon Learning
Recent work has revealed the potential of using visual representations for bilingual lexicon learning (BLL). Such image-based BLL methods, however, still fall short of linguistic approaches. In this paper, we propose a simple yet effective multi-modal approach that learns bilingual semantic representations that fuse linguistic and visual input. These new bilingual multi-modal embeddings display significant performance gains in the BLL task for three language pairs on two benchmarking test sets, outperforming linguistic-only BLL models using three different types of state-of-the-art bilingual word embeddings, as well as visual-only BLL models. This work is supported by ERC Consolidator Grant LEXICAL (648909) and KU Leuven Grant PDMK/14/117. SC is supported by ERC Starting Grant DisCoTex (306920)
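A common baseline for fusing linguistic and visual input is weighted concatenation of L2-normalised vectors from each modality; the paper's fusion method may differ, so the sketch below only illustrates this standard middle-fusion recipe with made-up vectors.

```python
import math

def l2_normalise(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def fuse(linguistic, visual, alpha=0.5):
    """Concatenate normalised modalities, weighting linguistic by alpha."""
    lin = [alpha * x for x in l2_normalise(linguistic)]
    vis = [(1 - alpha) * x for x in l2_normalise(visual)]
    return lin + vis

emb = fuse([3.0, 4.0], [1.0, 0.0, 0.0])
print(emb)  # [0.3, 0.4, 0.5, 0.0, 0.0]
```

Normalising each modality first stops the modality with larger raw magnitudes from dominating the fused representation.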
Visually Grounded and Textual Semantic Models Differentially Decode Brain Activity Associated with Concrete and Abstract Nouns
Important advances have recently been made using computational semantic models to decode brain activity patterns associated with concepts; however, this work has almost exclusively focused on concrete nouns. How well these models extend to decoding abstract nouns is largely unknown. We address this question by applying state-of-the-art computational models to decode functional Magnetic Resonance Imaging (fMRI) activity patterns, elicited by participants reading and imagining a diverse set of both concrete and abstract nouns. One of the models we use is linguistic, exploiting the recent word2vec skipgram approach trained on Wikipedia. The second is visually grounded, using deep convolutional neural networks trained on Google Images. Dual coding theory considers concrete concepts to be encoded in the brain both linguistically and visually, and abstract concepts only linguistically. Splitting the fMRI data according to human concreteness ratings, we indeed observe that both models significantly decode the most concrete nouns; however, accuracy is significantly greater using the text-based models for the most abstract nouns. More generally, this confirms that current computational models are sufficiently advanced to assist in investigating the representational structure of abstract concepts in the brain. Stephen Clark is supported by ERC Starting Grant DisCoTex (306920)
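Decoding accuracy in this literature is typically measured with a pairwise test: for two held-out concepts, decoding succeeds if matching each brain pattern to its own model vector gives higher total similarity than the swapped matching. The paper's exact pipeline may differ; this is a minimal sketch with toy vectors.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def pairwise_correct(brain_a, brain_b, model_a, model_b):
    """True if the correct pairing beats the swapped pairing."""
    correct = cosine(brain_a, model_a) + cosine(brain_b, model_b)
    swapped = cosine(brain_a, model_b) + cosine(brain_b, model_a)
    return correct > swapped

# Toy brain patterns and semantic model vectors for two concepts.
print(pairwise_correct([1, 0, 0.1], [0, 1, 0.1],
                       [0.9, 0.1, 0], [0.1, 0.9, 0]))  # True
```

Averaging this binary outcome over all held-out pairs gives the reported decoding accuracy, with 50% as chance level.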
HyperLex: A Large-Scale Evaluation of Graded Lexical Entailment
We introduce HyperLex — a dataset and evaluation resource that quantifies the extent of the semantic category membership relation, that is, the type-of relation (also known as the hyponymy–hypernymy or lexical entailment (LE) relation), between 2,616 concept pairs. Cognitive psychology research has established that typicality and category/class membership are computed in human semantic memory as a gradual rather than binary relation. Nevertheless, most NLP research and existing large-scale inventories of concept category membership (WordNet, DBPedia, etc.) treat category membership and LE as binary. To address this, we asked hundreds of native English speakers to indicate typicality and strength of category membership between a diverse range of concept pairs on a crowdsourcing platform. Our results confirm that category membership and LE are indeed more gradual than binary. We then compare these human judgments with the predictions of automatic systems, which reveals a huge gap between human performance and state-of-the-art LE, distributional, and representation learning models, and substantial differences between the models themselves. We discuss a pathway for improving semantic models to overcome this discrepancy, and indicate future application areas for improved graded LE systems. This work is supported by the ERC Consolidator Grant (no 648909)
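Graded-judgment resources like this are typically scored by Spearman rank correlation between model scores and human ratings over the concept pairs. A self-contained sketch with invented ratings (not HyperLex data):

```python
def rank(values):
    """Average 1-based ranks, handling ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman rho = Pearson correlation of the two rank vectors."""
    rx, ry = rank(xs), rank(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

human = [5.8, 1.2, 4.9, 0.3, 3.5]   # illustrative graded LE ratings
model = [0.9, 0.2, 0.7, 0.1, 0.5]   # illustrative model entailment scores
print(f"Spearman rho: {spearman(human, model):.2f}")  # 1.00 (same ordering)
```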
Non-Compositional Term Dependence for Information Retrieval
Modelling term dependence in IR aims to identify co-occurring terms that are too heavily dependent on each other to be treated as a bag of words, and to adapt the indexing and ranking accordingly. Dependent terms are predominantly identified using lexical frequency statistics, assuming that (a) if terms co-occur often enough in some corpus, they are semantically dependent; (b) the more often they co-occur, the more semantically dependent they are. This assumption is not always correct: the frequency of co-occurring terms can be separate from the strength of their semantic dependence. E.g. "red tape" might be overall less frequent than "tape measure" in some corpus, but this does not mean that "red"+"tape" are less dependent than "tape"+"measure". This is especially the case for non-compositional phrases, i.e. phrases whose meaning cannot be composed from the individual meanings of their terms (such as the phrase "red tape" meaning bureaucracy). Motivated by this lack of distinction between the frequency and strength of term dependence in IR, we present a principled approach for handling term dependence in queries, using both lexical frequency and semantic evidence. We focus on non-compositional phrases, extending a recent unsupervised model for their detection [21] to IR. Our approach, integrated into ranking using Markov Random Fields [31], yields effectiveness gains over competitive TREC baselines, showing that there is still room for improvement in the very well-studied area of term dependence in IR
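The distinction the abstract draws can be sketched with a simple distributional compositionality score: compare a vector for the whole phrase against a composition (here, the average) of its constituents' vectors, and treat a low score as evidence of non-compositionality, independent of raw co-occurrence frequency. The vectors are toy values, not the unsupervised model of [21].

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

vectors = {
    "red": [0.9, 0.1, 0.0],
    "tape": [0.1, 0.9, 0.1],
    "red tape": [0.0, 0.1, 0.9],         # "bureaucracy": unlike its parts
    "measure": [0.2, 0.8, 0.2],
    "tape measure": [0.15, 0.85, 0.15],  # close to its parts
}

def compositionality(phrase):
    """Similarity of the phrase vector to the average of its parts."""
    parts = phrase.split()
    dim = len(vectors[phrase])
    composed = [sum(vectors[w][i] for w in parts) / len(parts)
                for i in range(dim)]
    return cosine(vectors[phrase], composed)

print(f"red tape:     {compositionality('red tape'):.2f}")      # low
print(f"tape measure: {compositionality('tape measure'):.2f}")  # high
```

A phrase scoring below some threshold would then be indexed and ranked as a single unit, regardless of how frequent it is in the corpus.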
Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval
We propose a simple and efficient multi-hop dense retrieval approach for answering complex open-domain questions, which achieves state-of-the-art performance on two multi-hop datasets, HotpotQA and multi-evidence FEVER. Contrary to previous work, our method does not require access to any corpus-specific information, such as inter-document hyperlinks or human-annotated entity markers, and can be applied to any unstructured text corpus. Our system also yields a much better efficiency-accuracy trade-off, matching the best published accuracy on HotpotQA while being 10 times faster at inference time
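The multi-hop loop described above can be sketched in a few lines: retrieve on the question alone, append the retrieved passage to the query, re-encode, and retrieve again, with no hyperlinks or entity markers needed. The "encoder" below is a toy bag-of-words stand-in for the learned dense encoders the paper uses.

```python
corpus = {
    "p1": "alice was born in paris",
    "p2": "paris is the capital of france",
    "p3": "bananas are yellow",
}

def encode(text):
    """Toy bag-of-words 'encoder' standing in for a dense neural encoder."""
    return set(text.split())

def retrieve(query_vec, exclude=()):
    """Return the passage with the highest overlap with the query encoding."""
    return max((p for p in corpus if p not in exclude),
               key=lambda p: len(query_vec & encode(corpus[p])))

question = "which country was alice born in"
hop1 = retrieve(encode(question))                    # first-hop evidence
hop2_query = encode(question + " " + corpus[hop1])   # question + passage
hop2 = retrieve(hop2_query, exclude={hop1})          # second-hop evidence
print(hop1, hop2)
```

The second hop only becomes answerable after conditioning on the first passage: "paris" enters the query representation and pulls in the passage linking Paris to France.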
Post-Translational Loss of Renal TRPV5 Calcium Channel Expression, Ca2+ Wasting, and Bone Loss in Experimental Colitis
Dysregulated Ca2+ homeostasis likely contributes to the etiology of IBD-associated loss of bone mineral density (BMD). Experimental colitis leads to decreased expression of Klotho, a protein which supports renal Ca2+ reabsorption by stabilizing the TRPV5 channel on the apical membrane of distal tubule epithelial cells