4 research outputs found

Scaling Web-based Acquisition of Entailment Relations

Author: Bonaventura Coppola
Hristo Tanev
Idan Szpektor
Ido Kalman Dagan

Paraphrase recognition is a critical step for natural language interpretation. Accordingly, many NLP applications would benefit from high-coverage knowledge bases of paraphrases. However, the scalability of state-of-the-art paraphrase acquisition approaches is still limited. We present a fully unsupervised learning algorithm for Web-based extraction of entailment relations, an extended model of paraphrases. We focus on increased scalability and generality with respect to prior work, eventually aiming at a full-scale knowledge base. Our current implementation of the algorithm takes as its input a verb lexicon and for each verb searches the Web for related syntactic entailment templates. Experiments show promising results with respect to the ultimate goal, achieving much better scalability than prior Web-based methods.

Archivio della ricerca - Fondazione Bruno Kessler

Direct Word Sense Matching for Lexical Substitution

Author: Alfio Massimiliano Gliozzo
Carlo Strapparava
Efrat Marmorshtein
Ido Kalman Dagan
Oren Glickman

This paper investigates, conceptually and empirically, the novel sense matching task, which requires recognizing whether the senses of two synonymous words match in context. We suggest direct approaches to the problem, which avoid the intermediate step of explicit word sense disambiguation, and demonstrate their appealing advantages and stimulating potential for future research.

Archivio della ricerca - Fondazione Bruno Kessler

A Resource for Investigating the Impact of Anaphora and Coreference on Inference

Author: A. Abad
A. Stern
Luisa Bentivogli
Ido Kalman Dagan
Danilo Giampiccolo
Emanuele Pianta
S. Mirkin
Publication date: 01/01/2010

Discourse phenomena play a major role in text processing tasks. However, so far relatively little study has been devoted to the relevance of discourse phenomena for inference. Therefore, an experimental study was carried out to assess the relevance of anaphora and coreference for Textual Entailment (TE), a prominent inference framework. First, the annotation of anaphoric and coreferential links in the RTE-5 Search data set was performed according to a specifically designed annotation scheme. As a result, a new data set was created where all anaphora and coreference instances in the entailing sentences which are relevant to the entailment judgment are resolved and annotated. A by-product of the annotation is a new “augmented” data set, where all the referring expressions which need to be resolved in the entailing sentences are replaced by explicit expressions. Starting from the final output of the annotation, the actual impact of discourse phenomena on inference engines was investigated, identifying the kind of operations that the systems need to apply to address discourse phenomena and trying to find direct mappings between these operations and annotation types.

Archivio della ricerca - Fondazione Bruno Kessler

Building Textual Entailment Specialized Data Sets: a Methodology for Isolating Linguistic Phenomena Relevant to Inference

Author: Luisa Bentivogli
Elena Cabrio
Ido Kalman Dagan
Danilo Giampiccolo
M. Lo Leggio
Bernardo Magnini
Publication venue: European Language Resources Association (ELRA)

This paper proposes a methodology for the creation of specialized data sets for Textual Entailment, made of monothematic Text-Hypothesis pairs (i.e., pairs in which only one linguistic phenomenon relevant to the entailment relation is highlighted and isolated). The expected benefits derive from the intuition that investigating the linguistic phenomena separately, i.e., decomposing the complexity of the TE problem, would yield an improvement in the development of specific strategies to cope with them. The annotation procedure assumes that humans have knowledge about the linguistic phenomena relevant to inference, and a classification of such phenomena into both fine-grained and macro categories is suggested. We experimented with the proposed methodology over a sample of pairs taken from the RTE-5 data set, and investigated critical issues arising when entailment, contradiction or unknown pairs are considered. The result is a new resource, which can be profitably used both to advance the comprehension of the linguistic phenomena relevant to entailment judgments and to make a first step towards the creation of large-scale specialized data sets.

CiteSeerX

Archivio della ricerca - Fondazione Bruno Kessler