
    A Statistical Approach with Syntactic and Semantic Features for Chinese Textual Entailment

    Recognizing Textual Entailment (RTE) is a PASCAL/TAC task in which a system processes two text fragments to determine whether the meaning of a hypothesis is entailed by the other text. In this paper, we propose a textual entailment system using a statistical approach that integrates syntactic and semantic techniques for Recognizing Inference in Text (RITE), evaluated on the NTCIR-9 RITE task, and we compare semantic and syntactic features based on their differences. We thoroughly evaluate our approach using the subtasks of NTCIR-9 RITE. Our system achieved 73.28% accuracy on the Chinese Binary-Class (BC) subtask of NTCIR-9 RITE. Thorough experiments with the text fragments provided by the NTCIR-9 RITE task show that the proposed approach can significantly improve system accuracy. (IEEE conference paper, Las Vegas, Nevada, US, 8-10 August 2012)
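
    As a rough illustration of what a feature-based statistical entailment classifier looks like, the sketch below trains a binary (BC-style) classifier on two surface features: lexical coverage of the hypothesis by the text, and a length ratio. The features, the toy training pairs, and the use of scikit-learn are illustrative assumptions, not the paper's actual syntactic and semantic feature set.

```python
# Minimal sketch of a feature-based binary entailment classifier.
from sklearn.linear_model import LogisticRegression

def features(text, hypothesis):
    t, h = set(text.lower().split()), set(hypothesis.lower().split())
    coverage = len(t & h) / len(h) if h else 0.0   # how much of H is covered by T
    length_ratio = len(h) / len(t) if t else 0.0   # entailed H tends to be shorter
    return [coverage, length_ratio]

# Toy labeled pairs: 1 = entailment, 0 = no entailment.
pairs = [
    ("the cat sat on the mat", "the cat sat", 1),
    ("the cat sat on the mat", "the dog barked loudly", 0),
]
X = [features(t, h) for t, h, _ in pairs]
y = [label for _, _, label in pairs]

clf = LogisticRegression().fit(X, y)
print(clf.predict([features("the cat sat on the mat", "a cat sat")]))  # likely [1]
```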

    Recognizing Textual Entailment Using Description Logic And Semantic Relatedness

    Textual entailment (TE) is a relation that holds between two pieces of text when a reader of the first piece can conclude that the second is most likely true. Accurate approaches to textual entailment can benefit various natural language processing (NLP) applications such as question answering, information extraction, summarization, and even machine translation. For this reason, research on textual entailment has attracted significant attention in recent years. A robust logic-based meaning representation of text is very hard to build, so the majority of textual entailment approaches rely on syntactic methods or shallow semantic alternatives. In addition, approaches that do use a logic-based meaning representation require a large knowledge base of axioms and inference rules that is rarely available. The goal of this thesis is to design an efficient description-logic-based approach to recognizing textual entailment that uses semantic relatedness information as an alternative to a large knowledge base of axioms and inference rules. In this thesis, we propose a description logic and semantic relatedness approach to textual entailment in which the types of semantic relatedness axioms employed in aligning the description logic representations are used as indicators of textual entailment. In our approach, the text and the hypothesis are first represented in description logic. The representations are enriched with additional semantic knowledge acquired by using the web as a corpus. The hypothesis is then merged into the text representation by learning semantic relatedness axioms on demand, and a reasoner reasons over the aligned representation. Finally, the types of axioms employed by the reasoner are used to decide whether the text entails the hypothesis. To validate our approach we implemented an RTE system named AORTE and evaluated its performance on the fourth Recognizing Textual Entailment (RTE-4) challenge. Our approach achieved an accuracy of 68.8% on the two-way task and 61.6% on the three-way task, ranking 2nd among the participating runs in that challenge. These results show that our description-logic-based approach can effectively recognize textual entailment.
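
    The alignment-by-relatedness idea admits a small sketch: each hypothesis term is matched to a text term either directly or through a semantic relatedness "axiom", and the kinds of axioms used then signal entailment. The RELATED table, the threshold, and the align function below are hypothetical stand-ins; AORTE's actual description logic representations and reasoner are considerably richer.

```python
# Toy relatedness scores standing in for knowledge mined from the web.
RELATED = {("purchase", "buy"): 0.9, ("car", "vehicle"): 0.8}  # hypothetical

def relatedness(a, b):
    if a == b:
        return 1.0
    return max(RELATED.get((a, b), 0.0), RELATED.get((b, a), 0.0))

def align(text_terms, hyp_terms, threshold=0.7):
    """Return the kind of axiom used to match each hypothesis term, or None."""
    axioms = []
    for h in hyp_terms:
        best = max(text_terms, key=lambda t: relatedness(h, t))
        score = relatedness(h, best)
        if score == 1.0:
            axioms.append("identity")      # direct match
        elif score >= threshold:
            axioms.append("relatedness")   # bridged by a learned axiom
        else:
            axioms.append(None)            # unaligned hypothesis term
    return axioms

axioms = align(["john", "purchase", "car"], ["john", "buy", "vehicle"])
print(axioms, "entails:", None not in axioms)
```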

    Recherche d'information et fouille de textes (Information retrieval and text mining)

    Introduction. Understanding a text is a goal that Artificial Intelligence (AI) has set itself since its beginnings, and the first work offering answers appeared in the 1970s. The topic has remained current ever since, although the goals and methods it covers have evolved considerably. It is therefore worth looking more closely at what lies behind the general label of "text understanding". The earliest work, carried out from the mid-1970s to the mid-1980s [Charniak 1972; Dyer 1983; Schank et al. 1977], studied texts relating short stories, and understanding them meant bringing out the ins and outs of the story (the subjects treated, the events described, the causal relations linking them) as well as the role of each character, their motivations and their intentions. Understanding was seen as an inference process aimed at making explicit everything left implicit in a text, by recovering it from the semantic and pragmatic knowledge available to the machine. This presupposed a prior modeling of that knowledge, which connects to the work on the various knowledge representation formalisms in AI, describing on the one hand the senses associated with the words of the language (semantic networks vs. logic, and notably conceptual graphs [Sowa 1984]) and on the other hand pragmatic knowledge [Schank 1982]. All this work showed its limits as soon as such knowledge had to be modeled manually for every domain, or learned automatically. The problem of automatic understanding in open domains therefore remained wide open. Since the problem so posed is insoluble in the current state of knowledge, an alternative approach consists in redefining it and decomposing it into subtasks that are potentially easier to solve. Text understanding can thus be redefined according to different points of view on the text, each serving specific needs. Just as a reader does not read a text in the same way depending on whether they want to assess its relevance to a topic of interest (a document retrieval task), classify documents, learn about the events related, or look for a specific piece of information, so the automatic processes will be multiple and will address different aspects of the text according to the target task. Depending on the type of knowledge sought in a document, the reader extracts only the information of interest, relying on the cues and the knowledge that allow them to carry out their reading task, and hence their understanding, without having to assimilate everything. One can then speak of understanding at variable levels, which gives access to different levels of meaning. This approach is well illustrated by the work on information extraction, evaluated in the MUC conferences [Grishman and Sundheim 1996], which took place from the late 1980s until 1998. Information extraction then consisted in modeling an information need as a template, described by a set of typed slots, and in trying to fill those slots from the information contained in the texts. It is along these lines that research notably developed on "named entities" (that is, spotting names of persons, organizations, places, dates, etc.) and on the relations between these entities. It is also in this perspective that document-level approaches developed, whether for information retrieval or for determining document structure.
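
    The MUC-style template filling described above lends itself to a small concrete example. The sketch below, a minimal illustration rather than any system from that period, models an information need as a template of typed slots and fills the slots with regular expressions; the TEMPLATE patterns and the example sentence are hypothetical.

```python
# Minimal sketch of MUC-style template filling with typed slots.
import re

TEMPLATE = {
    "person":   r"(?:Mr\.|Ms\.|Dr\.)\s+([A-Z][a-z]+)",
    "date":     r"\b(\d{1,2} (?:January|February|March) \d{4})\b",
    "location": r"\bin ([A-Z][a-z]+)\b",
}

def fill(text):
    # Fill each typed slot with the first matching span, or None.
    record = {}
    for slot, pattern in TEMPLATE.items():
        m = re.search(pattern, text)
        record[slot] = m.group(1) if m else None
    return record

print(fill("Dr. Smith arrived in Paris on 12 March 1998."))
# {'person': 'Smith', 'date': '12 March 1998', 'location': 'Paris'}
```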

    Finding answers to questions, in text collections or web, in open domain or specialty domains

    This chapter is dedicated to factual question answering, i.e. extracting precise and exact answers from texts to questions given in natural language. A question in natural language carries more information than a bag-of-words query (i.e. a query made of a list of words) and provides clues for finding precise answers. We first focus on the underlying problems, mainly due to the linguistic variations between questions and the pieces of text that answer them, involved in selecting relevant passages and extracting reliable answers. We first present how to answer factual questions in open domains. We also present answering questions in specialty domains, which requires dealing with semi-structured knowledge and specialized terminologies and can lead to different applications, such as information management in corporations. Searching for answers on the Web constitutes another application frame and introduces specificities linked to Web redundancy and collaborative usage. Moreover, the Web is multilingual, and a challenging problem consists in searching for answers in documents in a target language other than the source language of the question. For all these topics, we present the main approaches and the remaining problems.
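
    To make the two-stage pipeline concrete, here is a minimal sketch under the classic decomposition the chapter describes: select the passage sharing the most content words with the question, then extract an answer of the expected type. The capitalized-token heuristic and the toy passages are illustrative assumptions, not the chapter's actual methods.

```python
# Minimal sketch of a factual QA pipeline: passage selection + answer extraction.
def content_words(s):
    return {w.strip("?.,!").lower() for w in s.split()}

def best_passage(question, passages):
    # Select the passage with the largest word overlap with the question.
    q = content_words(question)
    return max(passages, key=lambda p: len(q & content_words(p)))

def extract_answer(question, passage):
    q = content_words(question)
    for token in passage.split():
        word = token.strip("?.,!")
        # Toy typed extraction: a capitalized word not already in the question.
        if word and word[0].isupper() and word.lower() not in q:
            return word
    return None

passages = ["The Eiffel Tower is in Paris.", "Berlin hosts the Bundestag."]
question = "Where is the Eiffel Tower?"
print(extract_answer(question, best_passage(question, passages)))  # Paris
```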

    Cross-Lingual Textual Entailment and Applications

    Textual Entailment (TE) has been proposed as a generic framework for modeling language variability. The great potential of integrating (monolingual) TE recognition components into NLP architectures has been reported in several areas, such as question answering, information retrieval, information extraction and document summarization. Mainly due to the absence of cross-lingual TE (CLTE) recognition components, similar improvements have not yet been achieved in any corresponding cross-lingual application. In this thesis, we propose and investigate Cross-Lingual Textual Entailment (CLTE) as a semantic relation between two text portions in different languages. We present different practical solutions to approach this problem by i) bringing CLTE back to the monolingual scenario, translating the two texts into the same language; and ii) integrating machine translation and TE algorithms and techniques. We argue that CLTE can be a core technology for several cross-lingual NLP applications and tasks. Experiments on different datasets and two interesting cross-lingual NLP applications, namely content synchronization and machine translation evaluation, confirm the effectiveness of our approaches, leading to successful results. As a complement to the research on the algorithmic side, we successfully explored the creation of cross-lingual textual entailment corpora by means of crowdsourcing, as a cheap and replicable data collection methodology that minimizes the manual work done by expert annotators.
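
    As a rough illustration of solution i), the sketch below pivots CLTE back to the monolingual scenario: the hypothesis is first translated into the language of the text, then a monolingual TE check is applied. The word-for-word LEXICON and the overlap-based entails function are toy stand-ins for a real MT system and TE engine, not the thesis's components.

```python
# Hypothetical word-for-word Spanish-to-English lexicon standing in for MT.
LEXICON = {"el": "the", "gato": "cat", "duerme": "sleeps"}

def translate(hypothesis):
    # Pivot step: bring the hypothesis into the language of the text.
    return " ".join(LEXICON.get(w, w) for w in hypothesis.lower().split())

def entails(text, hypothesis, threshold=0.8):
    # Toy monolingual TE check: coverage of hypothesis words by the text.
    t, h = set(text.lower().split()), set(hypothesis.lower().split())
    return bool(h) and len(t & h) / len(h) >= threshold

text = "the cat sleeps on the sofa"
hypothesis_es = "el gato duerme"
print(entails(text, translate(hypothesis_es)))  # True
```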