
    Automatic learning of textual entailments with cross-pair similarities

    In this paper we define a novel similarity measure between examples of textual entailment and use it as a kernel function in Support Vector Machines (SVMs). This allows us to automatically learn the rewrite rules that describe a non-trivial set of entailment cases. Experiments on the data sets of the RTE 2005 challenge show an improvement of 4.4% over state-of-the-art methods.
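    The kernel-in-SVM setup described above can be sketched in a few lines. This is a toy illustration, not the paper's cross-pair measure: `pair_similarity` is a hypothetical token-overlap stand-in, and the training pairs are invented.

```python
import numpy as np
from sklearn.svm import SVC

def pair_similarity(pair_a, pair_b):
    """Toy cross-pair similarity: average Jaccard overlap between the
    corresponding texts and hypotheses of two entailment examples."""
    def jaccard(x, y):
        xs, ys = set(x.split()), set(y.split())
        return len(xs & ys) / max(len(xs | ys), 1)
    (t1, h1), (t2, h2) = pair_a, pair_b
    return (jaccard(t1, t2) + jaccard(h1, h2)) / 2.0

def gram(pairs_a, pairs_b):
    # SVC(kernel="precomputed") expects a kernel matrix, not feature vectors.
    return np.array([[pair_similarity(a, b) for b in pairs_b] for a in pairs_a])

train = [("a cat sat on the mat", "a cat sat"),            # entailment
         ("the dog barked loudly", "the dog was silent")]  # no entailment
labels = [1, 0]

clf = SVC(kernel="precomputed").fit(gram(train, train), labels)
test = [("a cat sat on the mat", "a cat sat")]
pred = clf.predict(gram(test, train))  # kernel between test and training pairs
```

Any symmetric positive semi-definite similarity over example pairs can be dropped in for `pair_similarity`; the SVM then learns directly from cross-pair comparisons rather than from per-example feature vectors.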

    Linguistic redundancy in Twitter

    In the last few years, the interest of the research community in micro-blogs and social media services such as Twitter has grown rapidly. Yet so far little attention has been paid to a key characteristic of micro-blogs: their high level of information redundancy. The aim of this paper is to approach this problem systematically by providing an operational definition of redundancy. We cast redundancy in the framework of Textual Entailment Recognition. We also provide quantitative evidence of the pervasiveness of redundancy in Twitter, and describe a dataset of redundancy-annotated tweets. Finally, we present a general-purpose system for identifying redundant tweets. An extensive quantitative evaluation shows that our system successfully solves the redundancy detection task, improving over baseline systems with statistical significance.
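    Casting redundancy as entailment can be sketched minimally: two tweets are redundant when entailment holds in both directions. `entails` below is a hypothetical word-coverage placeholder, not the paper's RTE system.

```python
def entails(text, hypothesis):
    """Toy entailment check: every token of the hypothesis
    also appears in the text."""
    return set(hypothesis.lower().split()) <= set(text.lower().split())

def redundant(tweet_a, tweet_b):
    # Bidirectional entailment: each tweet carries no information
    # beyond the other, so one of the two is redundant.
    return entails(tweet_a, tweet_b) and entails(tweet_b, tweet_a)
```

A one-directional check (`entails(a, b)` only) would instead flag tweets that merely repeat part of another tweet's content.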

    A Machine learning approach to textual entailment recognition

    Designing models for learning textual entailment recognizers from annotated examples is not an easy task, as it requires modeling the semantic relations and interactions between the two text fragments of each pair. In this paper, we approach the problem by first introducing the class of pair feature spaces, which allow supervised machine learning algorithms to derive first-order rewrite rules from annotated examples. In particular, we propose syntactic and shallow-semantic feature spaces, and compare them to standard ones. Extensive experiments demonstrate that our proposed spaces learn first-order derivations, while standard ones are not expressive enough to do so.
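    A loose sketch of the pair-feature-space idea, with invented features: each feature is defined over the (T, H) pair jointly, which is what lets a learner pick up rewrite regularities that per-fragment features cannot express. These are simplified illustrations, not the paper's syntactic or shallow-semantic spaces.

```python
def pair_features(text, hypothesis):
    t, h = text.lower().split(), hypothesis.lower().split()
    feats = {}
    # Cross-pair features: a word of H anchored to the same word in T.
    for w in set(h) & set(t):
        feats[f"shared={w}"] = 1.0
    # Pair-level features that neither fragment yields on its own.
    feats["h_is_substring"] = float(hypothesis.lower() in text.lower())
    feats["len_diff"] = float(len(t) - len(h))
    return feats
```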

    DLSITE-1: lexical analysis for solving textual entailment recognition

    This paper discusses the recognition of textual entailment in a text-hypothesis pair by applying a wide variety of lexical measures. We consider that the entailment phenomenon can be tackled at three general levels: lexical, syntactic and semantic. The main goals of this research are to address the phenomenon from a lexical point of view and to achieve high results using only that kind of knowledge. To accomplish this, the information provided by the lexical measures is used as a set of features for a Support Vector Machine that decides whether the entailment relation holds. A study of the most relevant features and a comparison with the best state-of-the-art textual entailment systems are presented throughout the paper. Finally, the system has been evaluated using the Second PASCAL Recognising Textual Entailment Challenge data and evaluation methodology, obtaining an accuracy of 61.88%.
    Funded by the QALL-ME consortium, EU Sixth Framework Programme, project reference FP6-IST-033860, and the Government of Spain, CICyT project number TIN2006-1526-C06-01.
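    The feature-based setup can be sketched as follows. The three lexical measures and the training pairs are simplified stand-ins invented for illustration, not the measures used by DLSITE-1.

```python
from sklearn.svm import SVC

def lexical_features(text, hypothesis):
    t, h = text.lower().split(), hypothesis.lower().split()
    ts, hs = set(t), set(h)
    coverage = len(ts & hs) / max(len(hs), 1)       # share of H found in T
    jaccard = len(ts & hs) / max(len(ts | hs), 1)   # symmetric overlap
    len_ratio = min(len(t), len(h)) / max(len(t), len(h), 1)
    return [coverage, jaccard, len_ratio]

train = [("a cat sat on the mat", "a cat sat", 1),
         ("a cat sat on the mat", "the dog barked", 0),
         ("john bought a new car", "john bought a car", 1),
         ("john bought a new car", "mary sold her bike", 0)]

X = [lexical_features(t, h) for t, h, _ in train]
y = [label for _, _, label in train]
clf = SVC(kernel="linear").fit(X, y)

pred = clf.predict([lexical_features("a cat sat on the mat",
                                     "a cat was on the mat")])
```

Each measure captures a different lexical signal; the SVM weighs them jointly instead of relying on any single threshold.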

    Sociocultural Norm Similarities and Differences via Situational Alignment and Explainable Textual Entailment

    Designing systems that can reason across cultures requires that they are grounded in the norms of the contexts in which they operate. However, current research on developing computational models of social norms has primarily focused on American society. Here, we propose a novel approach to discover and compare descriptive social norms across Chinese and American cultures. We demonstrate our approach by leveraging discussions on a Chinese Q&A platform (Zhihu) and the existing SocialChemistry dataset as proxies for contrasting cultural axes, align social situations cross-culturally, and extract social norms from texts using in-context learning. Embedding Chain-of-Thought prompting in a human-AI collaborative framework, we build a high-quality dataset of 3,069 social norms aligned with social situations across Chinese and American cultures alongside corresponding free-text explanations. To test the ability of models to reason about social norms across cultures, we introduce the task of explainable social norm entailment, showing that existing models under 3B parameters have significant room for improvement in both automatic and human evaluation. Further analysis of cross-cultural norm differences based on our dataset shows empirical alignment with the social orientations framework, revealing several situational and descriptive nuances in norms across these cultures.
    Comment: EMNLP 2023 Main Conference (Long Paper).

    Cross-Lingual Textual Entailment and Applications

    Textual Entailment (TE) has been proposed as a generic framework for modeling language variability. The great potential of integrating (monolingual) TE recognition components into NLP architectures has been reported in several areas, such as question answering, information retrieval, information extraction and document summarization. Mainly due to the absence of cross-lingual TE (CLTE) recognition components, similar improvements have not yet been achieved in any corresponding cross-lingual application. In this thesis, we propose and investigate Cross-Lingual Textual Entailment (CLTE) as a semantic relation between two text portions in different languages. We present different practical solutions to approach this problem by i) bringing CLTE back to the monolingual scenario, translating the two texts into the same language; and ii) integrating machine translation and TE algorithms and techniques. We argue that CLTE can be a core technology for several cross-lingual NLP applications and tasks. Experiments on different datasets and two interesting cross-lingual NLP applications, namely content synchronization and machine translation evaluation, confirm the effectiveness of our approaches, leading to successful results. As a complement to the research on the algorithmic side, we successfully explored the creation of cross-lingual textual entailment corpora by means of crowdsourcing, as a cheap and replicable data collection methodology that minimizes the manual work done by expert annotators.
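    Approach i) above, reducing CLTE to the monolingual case by translation, can be sketched as a thin pivoting wrapper. `translate` and `rte` are hypothetical placeholders for an MT system and a monolingual TE recognizer.

```python
def clte_by_pivoting(text_l1, hypothesis_l2, translate, rte):
    """Translate the hypothesis into the text's language, then run an
    ordinary monolingual entailment check on the resulting pair."""
    return rte(text_l1, translate(hypothesis_l2))
```

Approach ii), by contrast, interleaves MT and TE machinery rather than treating translation as a black-box preprocessing step.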

    Unsupervised Learning of Relational Entailment Graphs from Text

    Recognizing textual entailment and paraphrasing is critical to many core natural language processing applications including question answering and semantic parsing. The surface form of a sentence that answers a question such as “Does Facebook own Instagram?” frequently does not directly correspond to the form of the question, but is rather a paraphrase or an expression such as “Facebook bought Instagram”, that entails the answer. Relational entailments (e.g., buys entails owns) are crucial for bridging the gap between queries and text resources. In this thesis, we describe different unsupervised approaches to construct relational entailment graphs, with typed relations (e.g., company buys company) as nodes and entailment as directed edges. The entailment graphs provide an explainable resource for downstream tasks such as question answering; however, the existing methods suffer from noise and sparsity inherent to the data. We extract predicate-argument structures from large multiple-source news corpora using a fast Combinatory Categorial Grammar parser. We compute entailment scores between relations based on the Distributional Inclusion Hypothesis which states that a word (relation) p entails another word (relation) q if and only if in any context that p can be used, q can be used in its place. The entailment scores are used to build local entailment graphs. We then build global entailment graphs by exploiting the dependencies between the entailment rules. Previous work has used transitivity constraints, but these constraints are intractable on large graphs. We instead propose a scalable method that learns globally consistent similarity scores based on new soft constraints that consider both the structures across typed entailment graphs and inside each graph. We show that our method significantly improves the entailment graphs. Additionally, we show the duality of entailment graph induction with the task of link prediction. 
The link prediction task infers missing relations between entities in an incomplete knowledge graph and discovers new facts. We present a new method in which link prediction on the knowledge graph of assertions extracted from raw text is used to improve entailment graphs learned from the same text. The entailment graphs are in turn used to improve the link prediction task. Finally, we define the contextual link prediction task, which uses both the structure of the knowledge graph of assertions and their textual contexts. We fine-tune pre-trained language models with an unsupervised contextual link prediction objective. We augment the existing assertions with novel predictions of our model and use them to build higher-quality entailment graphs. Similarly, we show that the entailment graphs improve the contextual link prediction task.
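    The Distributional Inclusion Hypothesis scoring described above can be sketched with toy counts. The corpus statistics and the precision-style inclusion measure here are illustrative stand-ins for the thesis's scores.

```python
from collections import Counter

def dih_score(contexts_p, contexts_q):
    """Inclusion of p's contexts in q's contexts: the fraction of p's
    context mass that is also observed with q. High score = p entails q."""
    total = sum(contexts_p.values())
    shared = sum(w for ctx, w in contexts_p.items() if ctx in contexts_q)
    return shared / total if total else 0.0

# Toy counts of (subject, object) argument pairs seen with each relation.
buys = Counter({("facebook", "instagram"): 5, ("google", "youtube"): 3})
owns = Counter({("facebook", "instagram"): 7, ("google", "youtube"): 4,
                ("disney", "pixar"): 6})

score_fwd = dih_score(buys, owns)  # → 1.0: every "buys" context also occurs with "owns"
score_rev = dih_score(owns, buys)  # < 1.0: entailment is directed
```

Edges whose score clears a threshold form a local entailment graph; the global step then revises these scores using soft constraints across and within the typed graphs.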