20 research outputs found

    Automatic Contradiction Detection in Spanish

    This paper addresses the lack of automated contradiction detection systems for the Spanish language. The ES-Contradiction dataset was created for this purpose; each example contains two pieces of information classified as Compatible, Contradiction, or Unrelated. To the author’s knowledge, no Spanish-language contradiction dataset previously existed, so ES-Contradiction fills an important research gap, given that Spanish is one of the most widely spoken languages. Moreover, the dataset includes a fine-grained annotation of the different types of contradictions it contains. A baseline system was designed to validate the effectiveness of the dataset. The BETO transformer model was used to build this baseline system, which obtained good results in detecting the three class labels: Compatible, Contradiction, and Unrelated. This research work has been partially funded by Generalitat Valenciana through project “SIIA: Tecnologias del lenguaje humano para una sociedad inclusiva, igualitaria, y accesible” with grant reference PROMETEU/2018/089, and by the Spanish Government through project RTI2018-094653-B-C22: “Modelang: Modeling the behavior of digital entities by Human Language Technologies”. It was also partially supported by a grant from the Fondo Europeo de Desarrollo Regional (FEDER) and by the LIVING-LANG project (RTI2018-094653-B-C21) from the Spanish Government.

    Analysis of Identifying Linguistic Phenomena for Recognizing Inference in Text

    Recognizing Textual Entailment (RTE) is a task in which a system processes two text fragments to determine whether the meaning of a hypothesis is entailed by another text. Although a considerable number of studies have addressed recognizing textual entailment, little is known about the power of linguistic phenomena for recognizing inference in text. The objective of this paper is to provide a comprehensive analysis of identifying linguistic phenomena for Recognizing Inference in Text (RITE). We focus on the RITE-VAL System Validation subtask and propose a model based on this analysis, using the development dataset of the NTCIR-11 RITE-VAL subtask. The experimental results suggest that a well-identified linguistic phenomenon category could enhance the accuracy of a textual entailment system. Sponsorship: IEEE. Citation index: EI. Conference type: international. Conference dates: 13-15 August 2014. Book type: electronic edition. Call for papers: yes. Conference location: San Francisco, California, US.

    Lexical Opposition in Discourse Contrast

    We investigate the connection between lexical opposition and discourse relations, with a focus on the relation of contrast, in order to evaluate whether opposition participates in discourse relations. Through a corpus-based analysis of Italian documents, we show that the relation between opposition and contrast is not crucial, although it is not insignificant in the case of implicit relations. The correlation is even weaker when other discourse relations are taken into account.

    What is disputed on the web

    We present a method for automatically acquiring a corpus of disputed claims from the web. We consider a factual claim to be disputed if a page on the web suggests both that the claim is false and that other people say it is true. Our tool extracts disputed claims by searching the web for patterns such as "falsely claimed that X" and then using a statistical classifier to select text that appears to be making a disputed claim. We argue that such a corpus of disputed claims is useful for a wide range of applications related to information credibility on the web, and we report what our current corpus reveals about what is being disputed on the web.
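
    The pattern-search step described in this abstract can be illustrated with a minimal sketch. It is not the authors' tool: only the pattern "falsely claimed that" comes from the abstract, while the other patterns, the function name `extract_candidate_claims`, and the rule of cutting the claim at the end of the sentence are assumptions made for illustration. The statistical classifier that the abstract says filters the candidates is not shown.

```python
import re

# Dispute-signalling lead-ins. Only the first comes from the abstract;
# the other two are hypothetical additions for illustration.
DISPUTE_PATTERNS = [
    r"falsely claimed that",
    r"the myth that",
    r"contrary to the claim that",
]

# Capture the text after a lead-in, up to the next sentence-ending mark.
# Cutting at sentence end is an assumption, not the paper's stated rule.
_CLAIM_RE = re.compile(
    r"(?:%s)\s+(?P<claim>[^.!?]+)" % "|".join(DISPUTE_PATTERNS),
    flags=re.IGNORECASE,
)


def extract_candidate_claims(text: str) -> list[str]:
    """Return the clause following each dispute pattern found in text."""
    return [m.group("claim").strip() for m in _CLAIM_RE.finditer(text)]


sample = "He falsely claimed that the Earth is flat. The weather was fine."
print(extract_candidate_claims(sample))
```

    In a pipeline like the one the abstract outlines, candidates of this kind would still need to pass through a classifier, since a surface pattern alone also matches text that does not state a disputed claim.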

    Inferring Group Processes from Computer-Mediated Affective Text Analysis

    Political communications in the form of unstructured text convey rich connotative meaning that can reveal underlying group social processes. Previous research has focused on sentiment analysis at the document level, but we extend this analysis to sub-document levels through a detailed analysis of affective relationships between entities extracted from a document. Instead of pure sentiment analysis, which distinguishes only positive from negative, we explore nuances of affective meaning in 22 affect categories. Starting with seed lists of affect terms, our affect propagation algorithm automatically calculates the extracted affective relationships among entities and displays them in graphical form in our prototype (TEAMSTER). Several useful metrics are defined to infer underlying group processes by aggregating the affective relationships discovered in a text. Our approach has been validated with annotated documents from the MPQA corpus, achieving a performance gain of 74% over comparable random guessers.

    A Neuro Symbolic Approach for Contradiction Detection in Persian Text

    Detecting semantically contradictory sentences is a challenging and fundamental problem for some NLP applications, such as textual entailment recognition. In this study, contradiction covers different types of semantic confrontation, such as negation, antonymy, and numerical contradiction. Because there is not enough data to apply precise machine learning methods, and deep learning methods in particular, to Persian and other low-resource languages, rule-based approaches are of great interest. At the same time, the recent emergence of methods such as transfer learning has opened up the possibility of deep learning for low-resource languages. This paper introduces a hybrid approach for detecting seven categories of contradiction in Persian texts: antonymy, negation, numerical, factive, structural, lexical, and world knowledge. The proposed method consists of 1) a novel data mining method and 2) a transformer-based deep neural method for contradiction detection. A simple baseline is also presented for comparison. The data mining method uses frequent rule mining to extract appropriate contradiction detection rules from a development set. The extracted rules are tested on the different categories of contradictory sentences. In the first step, a classifier checks whether the rules work for an input sentence pair. Then, according to the result, the rules are used for three categories: negation, numerical, and antonymy. For these three categories, the rules achieve their highest F-measure on negation (90%) and an average F-measure of 86%. For the other four categories, where the rules have a lower F-measure of 62%, the transformer-based method achieved 76%. The proposed hybrid approach has an overall F-measure higher than 80%.