
    Study of Probabilistic Parsing in Syntactic Analysis

    A statistical parser, like a statistical tagger, requires a corpus of hand-parsed text. Such corpora are available, the most notable being the Penn Treebank. The Penn Treebank is a large corpus of articles from the Wall Street Journal that have been tagged with Penn Treebank part-of-speech tags and then parsed according to a simple set of phrase-structure rules conforming to Chomsky's Government and Binding syntax.
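
    As a rough illustration of the workflow described above, the sketch below induces a probabilistic context-free grammar from hand-parsed trees and parses a sentence with it. It assumes Python with NLTK and its bundled Penn Treebank sample; it is a minimal sketch under those assumptions, not the paper's system.

        # Induce a PCFG from hand-parsed Wall Street Journal trees and parse with it.
        # Assumes: pip install nltk, and nltk.download('treebank') has been run.
        from nltk.corpus import treebank
        from nltk.grammar import Nonterminal, induce_pcfg
        from nltk.parse import ViterbiParser

        # Collect phrase-structure productions from a small slice of the treebank.
        productions = []
        for tree in treebank.parsed_sents()[:200]:
            productions += tree.productions()

        # Estimate rule probabilities by relative frequency over those productions.
        grammar = induce_pcfg(Nonterminal("S"), productions)

        # Parse the shortest sentence from the same slice, so every word is known to
        # the grammar; prints the most likely tree if one rooted at S is found.
        parser = ViterbiParser(grammar)
        sentence = min(treebank.sents()[:200], key=len)
        for parse in parser.parse(sentence):
            print(parse)
            break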

    Logic Programs vs. First-Order Formulas in Textual Inference

    In the problem of recognizing textual entailment, the goal is to decide, given a text and a hypothesis expressed in a natural language, whether a human reasoner would call the hypothesis a consequence of the text. One approach to this problem is to use a first-order reasoning tool to check whether the hypothesis can be derived from the text conjoined with relevant background knowledge, after expressing all of them as first-order formulas. Another possibility is to express the hypothesis, the text, and the background knowledge in a logic programming language, and use a logic programming system. We discuss the relation of these methods to each other and to the class of effectively propositional reasoning problems. This leads us to general conclusions regarding the relationship between classical logic and answer set programming as knowledge representation formalisms.
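
    The logic-programming route mentioned in the abstract can be made concrete with a toy example: text facts and background knowledge become ground Horn clauses, and the hypothesis counts as entailed if forward chaining derives it. The atoms and rules below are invented for illustration only.

        # Toy forward chaining over ground Horn clauses (invented example, not the
        # paper's method): derive everything derivable, then test the hypothesis.
        from typing import FrozenSet, List, Tuple

        Rule = Tuple[FrozenSet[str], str]   # (body atoms, head atom)

        def entails(facts: set, rules: List[Rule], hypothesis: str) -> bool:
            derived = set(facts)
            changed = True
            while changed:                  # iterate to a fixpoint
                changed = False
                for body, head in rules:
                    if body <= derived and head not in derived:
                        derived.add(head)
                        changed = True
            return hypothesis in derived

        # Text: "John bought a car."  Background knowledge: buying implies owning.
        facts = {"bought(john, car)"}
        rules = [(frozenset({"bought(john, car)"}), "owns(john, car)")]
        print(entails(facts, rules, "owns(john, car)"))   # True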

    A Survey of Paraphrasing and Textual Entailment Methods

    Paraphrasing methods recognize, generate, or extract phrases, sentences, or longer natural language expressions that convey almost the same information. Textual entailment methods, on the other hand, recognize, generate, or extract pairs of natural language expressions, such that a human who reads (and trusts) the first element of a pair would most likely infer that the other element is also true. Paraphrasing can be seen as bidirectional textual entailment, and methods from the two areas are often similar. Both kinds of methods are useful, at least in principle, in a wide range of natural language processing applications, including question answering, summarization, text generation, and machine translation. We summarize key ideas from the two areas by considering in turn recognition, generation, and extraction methods, also pointing to prominent articles and resources. Comment: Technical Report, Natural Language Processing Group, Department of Informatics, Athens University of Economics and Business, Greece, 201
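
    The "paraphrasing as bidirectional textual entailment" view can be written down directly: two expressions are paraphrases when each entails the other. In the sketch below, a crude word-overlap score stands in for a real entailment recognizer, purely to make the relationship concrete.

        # Paraphrase as entailment in both directions; the overlap scorer is a crude
        # placeholder for an actual RTE model.
        def entailment_score(text: str, hypothesis: str) -> float:
            t, h = set(text.lower().split()), set(hypothesis.lower().split())
            return len(t & h) / len(h) if h else 0.0   # hypothesis words covered by text

        def entails(text: str, hypothesis: str, threshold: float = 0.8) -> bool:
            return entailment_score(text, hypothesis) >= threshold

        def is_paraphrase(a: str, b: str) -> bool:
            return entails(a, b) and entails(b, a)     # bidirectional entailment

        # The symmetric overlap wrongly accepts this pair, which is exactly the kind
        # of case a real entailment recognizer is needed for.
        print(is_paraphrase("the cat chased the mouse", "the mouse chased the cat"))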

    Techniques for recognizing textual entailment and semantic equivalence

    After defining what is understood by textual entailment and semantic equivalence, the present state and the desirable future of the systems aimed at recognizing them are presented. A compilation of the techniques currently implemented in the main Recognizing Textual Entailment and Semantic Equivalence systems is given.

    Techniques applied to the recognition of textual entailment.

    After establishing what is meant by textual entailment, the current state and the desirable future of the systems aimed at recognizing it are presented. The techniques currently implemented by the main Recognizing Textual Entailment systems are identified.

    Finding answers to questions, in text collections or web, in open domain or specialty domains

    This chapter is dedicated to factual question answering, i.e. extracting precise and exact answers from texts to questions given in natural language. A question in natural language gives more information than a bag-of-words query (i.e. a query made of a list of words), and provides clues for finding precise answers. We first focus on the underlying problems, mainly due to the linguistic variations between questions and the pieces of text that can answer them, when selecting relevant passages and extracting reliable answers. We then present how to answer factual questions in open domains. We also present answering questions in specialty domains, which requires dealing with semi-structured knowledge and specialized terminologies, and can lead to different applications, such as information management in corporations. Searching for answers on the Web constitutes another application frame and introduces specificities linked to Web redundancy and collaborative usage. Besides, the Web is also multilingual, and a challenging problem consists in searching for answers in documents whose target language differs from the source language of the question. For all these topics, we present the main approaches and the remaining problems.
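
    A toy version of the two steps discussed in the chapter, passage selection followed by answer extraction, can look like the sketch below. The overlap-based selection and the regex date extraction are placeholders chosen for brevity, not the chapter's actual techniques.

        import re

        # Step 1: pick the passage sharing the most words with the question.
        def select_passage(question: str, passages: list) -> str:
            q = set(question.lower().split())
            return max(passages, key=lambda p: len(q & set(p.lower().split())))

        # Step 2: extract a short answer matching the expected answer type.
        def extract_answer(question: str, passage: str) -> str:
            if question.lower().startswith("when"):    # expecting a date or year
                match = re.search(r"\b\d{4}\b", passage)
                if match:
                    return match.group()
            return passage                             # fall back to the whole passage

        passages = [
            "The Eiffel Tower was completed in 1889 for the World's Fair.",
            "Gustave Eiffel also designed the framework of the Statue of Liberty.",
        ]
        question = "When was the Eiffel Tower completed?"
        print(extract_answer(question, select_passage(question, passages)))   # 1889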

    Methods for Measuring Semantic Similarity of Texts

    A thesis submitted in partial fulfilment of the requirements of the University of Wolverhampton for the degree of Doctor of Philosophy. Measuring semantic similarity is a task needed in many Natural Language Processing (NLP) applications. For example, in Machine Translation evaluation, semantic similarity is used to assess the quality of the machine translation output by measuring the degree of equivalence between a reference translation and the machine translation output. The problem of semantic similarity (Corley and Mihalcea, 2005) is defined as measuring and recognising semantic relations between two texts. Semantic similarity covers different types of semantic relations, mainly bidirectional and directional. This thesis proposes new methods to address the limitations of existing work on both types of semantic relations. Recognising Textual Entailment (RTE) is a directional relation where a text T entails the hypothesis H (entailment pair) if the meaning of H can be inferred from the meaning of T (Dagan and Glickman, 2005; Dagan et al., 2013). Most RTE methods rely on machine learning algorithms. de Marneffe et al. (2006) propose a multi-stage architecture where a first stage determines an alignment between the T-H pairs, to be followed by an entailment decision stage. A limitation of such approaches is that instead of recognising a non-entailment, an alignment that fits an optimisation criterion will be returned, but the alignment by itself is a poor predictor for non-entailment. We propose an RTE method following a multi-stage architecture, where both stages are based on semantic representations. Furthermore, instead of using simple similarity metrics to predict the entailment decision, we use a Markov Logic Network (MLN). The MLN is based on rich relational features extracted from the output of the predicate-argument alignment structures between T-H pairs. This MLN learns to reward pairs with similar predicates and similar arguments, and to penalise pairs otherwise. The proposed methods show promising results. A source of errors was found to be the alignment step, which has low coverage. However, we show that when an alignment is found, the relational features improve the final entailment decision. The task of Semantic Textual Similarity (STS) (Agirre et al., 2012) is defined as measuring the degree of bidirectional semantic equivalence between a pair of texts. The STS evaluation campaigns use datasets that consist of pairs of texts from NLP tasks such as Paraphrasing and Machine Translation evaluation. Methods for STS are commonly based on computing similarity metrics between the pair of sentences, where the similarity scores are used as features to train regression algorithms. Existing methods for STS achieve high performance on certain tasks but poor results on others, particularly on unknown (surprise) tasks. Our solution to alleviate this unbalanced performance is to model STS in the context of Multi-task Learning using Gaussian Processes (MTL-GP) (Álvarez et al., 2012) and state-of-the-art STS features (Šarić et al., 2012). We show that the MTL-GP outperforms previous work on the same datasets.
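
    The STS setup the thesis starts from (similarity features per sentence pair, fed to a regressor trained on gold scores) can be sketched as follows. Ridge regression and the two hand-made features stand in for the thesis's MTL-GP and its richer feature set, and the tiny training pairs are invented.

        from sklearn.linear_model import Ridge

        # Two simple similarity features for a sentence pair.
        def features(a: str, b: str) -> list:
            ta, tb = set(a.lower().split()), set(b.lower().split())
            overlap = len(ta & tb) / len(ta | tb)                      # Jaccard overlap
            len_ratio = min(len(ta), len(tb)) / max(len(ta), len(tb))  # length ratio
            return [overlap, len_ratio]

        # Gold similarity scores on the usual 0-5 STS scale (invented examples).
        pairs = [("a man is playing a guitar", "a man plays the guitar"),
                 ("a dog runs in the park", "the stock market fell sharply")]
        gold = [4.6, 0.2]

        model = Ridge().fit([features(a, b) for a, b in pairs], gold)
        print(model.predict([features("a woman is cooking", "a woman cooks dinner")]))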

    Recognizing Textual Entailment Using Description Logic And Semantic Relatedness

    Textual entailment (TE) is a relation that holds between two pieces of text where someone reading the first piece can conclude that the second is most likely true. Accurate approaches for textual entailment can be beneficial to various natural language processing (NLP) applications such as question answering, information extraction, summarization, and even machine translation. For this reason, research on textual entailment has attracted a significant amount of attention in recent years. A robust logic-based meaning representation of text is very hard to build, so the majority of textual entailment approaches rely on syntactic methods or shallow semantic alternatives. In addition, approaches that do use a logic-based meaning representation require a large knowledge base of axioms and inference rules that are rarely available. The goal of this thesis is to design an efficient description logic based approach for recognizing textual entailment that uses semantic relatedness information as an alternative to a large knowledge base of axioms and inference rules. In this thesis, we propose a description logic and semantic relatedness approach to textual entailment, where the types of semantic relatedness axioms employed in aligning the description logic representations are used as indicators of textual entailment. In our approach, the text and the hypothesis are first represented in description logic. The representations are enriched with additional semantic knowledge acquired by using the web as a corpus. The hypothesis is then merged into the text representation by learning semantic relatedness axioms on demand, and a reasoner is then used to reason over the aligned representation. Finally, the types of axioms employed by the reasoner are used to learn whether the text entails the hypothesis or not. To validate our approach we implemented an RTE system named AORTE and evaluated its performance on the fourth Recognizing Textual Entailment challenge. Our approach achieved an accuracy of 68.8 on the two-way task and 61.6 on the three-way task, which ranked it 2nd among the participating runs in the same challenge. These results show that our description logic based approach can effectively be used to recognize textual entailment.
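
    The role the relatedness axioms play in the approach can be illustrated with a much-simplified sketch: hypothesis concepts are aligned to text concepts through relatedness links, and the kinds of links needed decide entailment. The concept sets, relatedness table, and decision rule below are all invented; the actual system works over description logic representations with a reasoner.

        # Align hypothesis concepts to text concepts via relatedness "axioms" and use
        # the axiom types as the entailment signal (toy stand-in for the DL pipeline).
        RELATEDNESS = {                    # (hypothesis concept, text concept) -> type
            ("purchase", "buy"): "synonym",
            ("vehicle", "car"): "hypernym",
        }

        def align(hyp_concepts, text_concepts):
            axioms = []
            for h in hyp_concepts:
                if h in text_concepts:
                    axioms.append("identity")
                else:
                    links = [RELATEDNESS[(h, t)] for t in text_concepts if (h, t) in RELATEDNESS]
                    if not links:
                        return None        # an unalignable concept blocks entailment
                    axioms.append(links[0])
            return axioms

        def entails(text_concepts, hyp_concepts):
            axioms = align(hyp_concepts, text_concepts)
            # Entailment only if every link keeps or broadens the meaning.
            return axioms is not None and all(a in {"identity", "synonym", "hypernym"} for a in axioms)

        print(entails({"buy", "car", "john"}, {"purchase", "vehicle", "john"}))   # True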

    Language modeling approaches to question answering

    In today’s environment of information overload, Question Answering (QA) is a critically important research area. QA is the task of automatically extracting a precise answer from one or more data sources to a question posed in natural language. A two-stage strategy is typically adopted when designing a QA system: the first stage is an Information Retrieval (IR) process which returns a set of candidate documents relevant to the question, and the second stage narrows the information contained in those documents down to a single response (sentence or entity) that answers the question, typically using Information Extraction (IE) or Natural Language Processing methods. This research proposes novel techniques for QA by enhancing the user’s original query with latent semantic information from the corpus. This enhanced query is then applied to both the first and second stages of the QA architecture. To build the enhanced query, we propose the Aspect-Based Relevance Language Model as an approach that uses statistical language modeling techniques to measure the likelihood of relevance of a concept (or aspect, as defined by Probabilistic Latent Semantic Analysis) to a question. We then use terms from the aspects that have the highest likelihood of relevance to design a model of a semantic Question Context, which includes sense-disambiguated terms that amplify the user’s query. The Question Context is incorporated into the first stage of QA as query expansion to improve recall. We then derive a novel measure called Answer Credibility from the Question Context. Answer Credibility may be thought of as a statistical measure of the reliability of a candidate answer with respect to a question and the source text from which the candidate answer was derived. We incorporate Answer Credibility in the Answer Validation process; the answer with the highest score after the application of Answer Credibility is returned to the user. Our techniques show performance improvements over state-of-the-art approaches, and have the advantage that they use statistical techniques to derive semantic information to aid the process of QA. Ph.D., Information Science and Technology, Drexel University, 200
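
    The query-expansion side of the approach can be approximated with off-the-shelf tools: learn latent aspects from the corpus, find the aspect most relevant to the question, and add its top terms to the query. LDA is used below as a convenient stand-in for the thesis's PLSA aspects, and the toy corpus and question are invented.

        import numpy as np
        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.decomposition import LatentDirichletAllocation

        corpus = [
            "the eiffel tower in paris was completed in 1889",
            "paris is the capital of france and hosts the eiffel tower",
            "the stock market index fell after the interest rate decision",
            "investors watched the market as interest rates climbed",
        ]
        question = "when was the eiffel tower built"

        # Learn latent aspects from the corpus (LDA here, PLSA in the thesis).
        vectorizer = CountVectorizer(stop_words="english")
        X = vectorizer.fit_transform(corpus)
        lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

        # Pick the aspect most relevant to the question ...
        best = int(np.argmax(lda.transform(vectorizer.transform([question]))[0]))

        # ... and expand the query with that aspect's highest-weighted terms.
        terms = vectorizer.get_feature_names_out()
        expansion = [terms[i] for i in lda.components_[best].argsort()[::-1][:5]]
        print(question.split() + expansion)   # original query plus aspect terms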