2,295 research outputs found

    Learning to distinguish hypernyms and co-hyponyms

    This work is concerned with distinguishing different semantic relations which exist between distributionally similar words. We compare a novel approach based on training a linear Support Vector Machine on pairs of feature vectors with state-of-the-art methods based on distributional similarity. We show that the new supervised approach does better even when there is minimal information about the target words in the training data, giving a 15% reduction in error rate over unsupervised approaches

    The Conception of Anthropological Complementarism. An Introduction

    The aims of 'Anthropological Complementarism' in a nutshell (sect. 1). Against a watered-down conception of psychophysical complementarity (sect. 2). Linguistic and logical problems of identity and non-identity (sect. 3). A 'noematic' approach to consciousness (sect. 4). A plea for a pure noematics (sect. 5). My own consciousness as experienced by myself is not a part of nature (sect. 6). The major ontological tenets of mine (sect. 7). Complementarism proper (sect. 8). Suitable and unsuitable methods in philosophy (sect. 9). How to determine the methods suitable for philosophical inquiries (sect. 10). Linguistic and phenomenological methods (sect. 11). 'Linguistic phenomenology' (sect. 12). A note on philosophical truth (sect. 13)

    A Continuously Growing Dataset of Sentential Paraphrases

    A major challenge in paraphrase research is the lack of parallel corpora. In this paper, we present a new method to collect large-scale sentential paraphrases from Twitter by linking tweets through shared URLs. The main advantage of our method is its simplicity: it removes the classifier or human in the loop that previous work needed to select data before annotation and before applying paraphrase identification algorithms. We present the largest human-labeled paraphrase corpus to date, with 51,524 sentence pairs, and the first cross-domain benchmarking for automatic paraphrase identification. In addition, we show that more than 30,000 new sentential paraphrases can be easily and continuously captured every month at ~70% precision, and demonstrate their utility for downstream NLP tasks through phrasal paraphrase extraction. We make our code and data freely available. Comment: 11 pages, accepted to EMNLP 201
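The URL-linking idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' released code: the function name `candidate_pairs_by_url`, the `(text, url)` input format, and the example tweets are all assumptions made for illustration. It only shows how tweets that share a URL could be grouped into candidate pairs for later human labeling.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs_by_url(tweets):
    """Group (text, url) tuples by URL and return candidate sentence pairs.

    Hypothetical helper: tweets sharing a URL are assumed to be likely
    paraphrase candidates; no classifier is used to pre-select them.
    """
    by_url = defaultdict(list)
    for text, url in tweets:
        # Drop exact duplicate texts under the same URL.
        if text not in by_url[url]:
            by_url[url].append(text)

    pairs = []
    for url, texts in by_url.items():
        # Every unordered pair of distinct tweets under one URL is a candidate.
        for first, second in combinations(texts, 2):
            pairs.append((url, first, second))
    return pairs


# Made-up example data (not from the corpus described in the abstract).
tweets = [
    ("Scientists find water on Mars", "http://example.com/a"),
    ("Water discovered on Mars, researchers say", "http://example.com/a"),
    ("New phone released today", "http://example.com/b"),
    ("Scientists find water on Mars", "http://example.com/a"),
]
pairs = candidate_pairs_by_url(tweets)
```

In this sketch the candidate pairs would still need annotation, since sharing a URL does not guarantee that two tweets are paraphrases.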

    Information fusion for automated question answering

    Until recently, research efforts in automated Question Answering (QA) have mainly focused on getting a good understanding of questions to retrieve correct answers. This includes deep parsing, lookups in ontologies, question typing and machine learning of answer patterns appropriate to question forms. In contrast, I have focused on the analysis of the relationships between answer candidates as provided in open domain QA on multiple documents. I argue that such candidates have intrinsic properties, partly regardless of the question, and those properties can be exploited to provide better quality and more user-oriented answers in QA. Information fusion refers to the technique of merging pieces of information from different sources. In QA over free text, it is motivated by the frequency with which different answer candidates are found in different locations, leading to a multiplicity of answers. The reason for such multiplicity is, in part, the massive amount of data used for answering, and also its unstructured and heterogeneous content: besides ambiguities in user questions leading to heterogeneity in extractions, systems have to deal with redundancy, granularity and possibly contradictory information. Hence the need for answer candidate comparison. While frequency has proved to be a significant characteristic of a correct answer, I evaluate the value of other relationships characterizing answer variability and redundancy. Partially inspired by recent developments in multi-document summarization, I redefine the concept of "answer" within an engineering approach to QA based on the Model-View-Controller (MVC) pattern of user interface design. An "answer model" is a directed graph in which nodes correspond to entities projected from extractions and edges convey relationships between such nodes. The graph represents the fusion of information contained in the set of extractions. Different views of the answer model can be produced, capturing the fact that the same answer can be expressed and presented in various ways: picture, video, sound, written or spoken language, or a formal data structure. Within this framework, an answer is a structured object contained in the model and retrieved by a strategy to build a particular view depending on the end user's (or task's) requirements. I describe shallow techniques to compare entities and enrich the model by discovering four broad categories of relationships between entities in the model: equivalence, inclusion, aggregation and alternative. Quantitatively, answer candidate modeling improves answer extraction accuracy. It also proves to be more robust to incorrect answer candidates than traditional techniques. Qualitatively, models provide meta-information encoded by relationships that allow shallow reasoning to help organize and generate the final output
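The "answer model" described above, a directed graph whose edges carry one of four relationship types, can be sketched as a small data structure. This is an illustrative assumption, not the thesis's implementation: the class name `AnswerModel`, its methods, the edge direction, and the example entities are all hypothetical.

```python
from collections import defaultdict

# The four relationship categories named in the abstract.
RELATIONS = {"equivalence", "inclusion", "aggregation", "alternative"}


class AnswerModel:
    """Hypothetical directed graph of answer entities with typed edges."""

    def __init__(self):
        self.nodes = set()
        # Maps a source entity to a list of (relation, target) edges.
        self.edges = defaultdict(list)

    def add_relation(self, source, relation, target):
        """Add a directed, typed edge; reject unknown relation names."""
        if relation not in RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        self.nodes.update((source, target))
        self.edges[source].append((relation, target))

    def related(self, source, relation):
        """Return targets linked from `source` by the given relation."""
        return [t for r, t in self.edges.get(source, []) if r == relation]


# Made-up answer candidates; the edge direction (source "is included in"
# target) is an assumption of this sketch.
model = AnswerModel()
model.add_relation("Paris", "inclusion", "France")
model.add_relation("Paris", "equivalence", "City of Paris")
model.add_relation("Paris", "alternative", "Lyon")
included_in = model.related("Paris", "inclusion")
```

A "view" in the MVC sense would then be any function that walks this graph and renders the entities, for instance collapsing equivalent nodes before listing alternatives.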

    A Discriminative Analysis of Fine-Grained Semantic Relations including Presupposition: Annotation and Classification

    In contrast to classical lexical semantic relations between verbs, such as antonymy, synonymy or hypernymy, presupposition is a lexically triggered semantic relation that is not well covered in existing lexical resources. It is also understudied in the field of corpus-based methods of learning semantic relations. Yet, presupposition is very important for semantic and discourse analysis tasks, given the implicit information that it conveys. In this paper we present a corpus-based method for acquiring presupposition-triggering verbs along with verbal relata that express their presupposed meaning. We approach this difficult task using a discriminative classification method that jointly determines and distinguishes a broader set of inferential semantic relations between verbs. The present paper focuses on important methodological aspects of our work: (i) a discriminative analysis of the semantic properties of the chosen set of relations, (ii) the selection of features for corpus-based classification and (iii) design decisions for the manual annotation of fine-grained semantic relations between verbs. (iv) We present the results of a practical annotation effort leading to a gold standard resource for our relation inventory, and (v) we report results for automatic classification of our target set of fine-grained semantic relations, including presupposition. We achieve a classification performance of 55% F1-score, a 100% improvement over a best-feature baseline

    Using Tree Kernels for Classifying Temporal Relations between Events

    PACLIC 23 / City University of Hong Kong / 3-5 December 200

    Necessitarianism and Dispositions

    In this paper, I argue in favor of necessitarianism, the view that dispositions, when stimulated, necessitate their manifestations. After introducing and clarifying what necessitarianism does and does not amount to, I provide reasons to support the view that dispositions once stimulated necessitate their manifestations according to the stimulating conditions and the relevant properties at stake. In this framework, I will propose a principle of causal relevance and some conditions for the possibility of interference that allow us to avoid the use of ceteris paribus clauses. I then defend necessitarianism from recent attacks raised by, among others, Mumford and Anjum, noting that the antecedent strengthening test is a test for causal relevance that raises no difficulties for necessitarianism

    A Survey of Paraphrasing and Textual Entailment Methods

    Paraphrasing methods recognize, generate, or extract phrases, sentences, or longer natural language expressions that convey almost the same information. Textual entailment methods, on the other hand, recognize, generate, or extract pairs of natural language expressions, such that a human who reads (and trusts) the first element of a pair would most likely infer that the other element is also true. Paraphrasing can be seen as bidirectional textual entailment and methods from the two areas are often similar. Both kinds of methods are useful, at least in principle, in a wide range of natural language processing applications, including question answering, summarization, text generation, and machine translation. We summarize key ideas from the two areas by considering in turn recognition, generation, and extraction methods, also pointing to prominent articles and resources. Comment: Technical Report, Natural Language Processing Group, Department of Informatics, Athens University of Economics and Business, Greece, 201