Learning to distinguish hypernyms and co-hyponyms
This work is concerned with distinguishing different semantic relations which exist between distributionally similar words. We compare a novel approach based on training a linear Support Vector Machine on pairs of feature vectors with state-of-the-art methods based on distributional similarity. We show that the new supervised approach does better even when there is minimal information about the target words in the training data, giving a 15% reduction in error rate over unsupervised approaches
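As a rough illustration of what "pairs of feature vectors" could look like in practice, the sketch below combines two word vectors into a single feature vector per word pair. The combination modes (`concat`, `diff`), the toy counts, and the function name `pair_features` are assumptions made for illustration; the abstract does not say how the pairs are represented. The resulting `X` and `y` could then be handed to any linear SVM implementation.

```python
def pair_features(u, v, mode="concat"):
    """Combine two word vectors into one feature vector for the ordered pair (u, v)."""
    if len(u) != len(v):
        raise ValueError("word vectors must have the same length")
    if mode == "concat":
        # Keep both vectors side by side.
        return list(u) + list(v)
    if mode == "diff":
        # Element-wise difference v - u.
        return [b - a for a, b in zip(u, v)]
    raise ValueError(f"unknown mode: {mode}")


# Toy co-occurrence counts, made up for illustration only.
vectors = {"cat": [3, 0, 1], "dog": [2, 1, 1], "animal": [4, 2, 3]}

# Labelled pairs: 1 = hypernym relation, 0 = co-hyponyms.
labelled_pairs = [("cat", "animal", 1), ("dog", "animal", 1), ("cat", "dog", 0)]

X = [pair_features(vectors[a], vectors[b], mode="diff") for a, b, _ in labelled_pairs]
y = [label for _, _, label in labelled_pairs]
```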
Identifying lexical relationships and entailments with distributional semantics
Many modern efforts in Natural Language Understanding depend on rich and powerful semantic representations of words. Systems for sophisticated logical and textual reasoning often depend heavily on lexical resources to provide critical information about relationships between words, but these lexical resources are expensive to create and maintain, and are never fully comprehensive. Distributional Semantics has long offered methods for automatically inducing meaning representations from large corpora, with little or no annotation effort. The resulting representations are valuable proxies of semantic similarity, but simply knowing two words are similar cannot tell us their relationship, or whether one entails the other.
In this thesis, we consider how methods from Distributional Semantics may be applied to the difficult task of lexical entailment, where one must predict whether one word implies another. We approach this by showing contributions in areas of hypernymy detection, lexical relationship prediction, lexical substitution, and textual entailment. We propose novel experimental setups, models, analysis, and interpretations, which ultimately provide us with a better understanding of both the nature of lexical entailment and the information available within distributional representations.
Computer Science
The Conception of Anthropological Complementarism. An Introduction
The aims of 'Anthropological Complementarism' in a nutshell (sect. 1). Against a watered-down conception of psychophysical complementarity (sect. 2). Linguistic and logical problems of identity and non-identity (sect. 3). A 'noematic' approach to consciousness (sect. 4). A plea for a pure noematics (sect. 5). My own consciousness as experienced by myself is not a part of nature (sect. 6). The major ontological tenets of mine (sect. 7). Complementarism proper (sect. 8). Suitable and unsuitable methods in philosophy (sect. 9). How to determine the methods suitable for philosophical inquiries (sect. 10). Linguistic and phenomenological methods (sect. 11). 'Linguistic phenomenology' (sect. 12). A note on philosophical truth (sect. 13)
A Continuously Growing Dataset of Sentential Paraphrases
A major challenge in paraphrase research is the lack of parallel corpora. In
this paper, we present a new method to collect large-scale sentential
paraphrases from Twitter by linking tweets through shared URLs. The main
advantage of our method is its simplicity, as it gets rid of the classifier or
human in the loop needed to select data before annotation and subsequent
application of paraphrase identification algorithms in the previous work. We
present the largest human-labeled paraphrase corpus to date of 51,524 sentence
pairs and the first cross-domain benchmarking for automatic paraphrase
identification. In addition, we show that more than 30,000 new sentential
paraphrases can be easily and continuously captured every month at ~70%
precision, and demonstrate their utility for downstream NLP tasks through
phrasal paraphrase extraction. We make our code and data freely available.
Comment: 11 pages, accepted to EMNLP 201
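The idea of linking tweets through shared URLs can be sketched as follows. This is a minimal illustration, not the authors' released code: the `(text, url)` input shape and the function name `candidate_pairs` are assumptions, and the pairs it yields are only candidates that would still need labelling.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(tweets):
    """Group tweets by the URL they share and pair up tweets within each group.

    `tweets` is an iterable of (text, url) tuples (an assumed input shape).
    Returns a list of (text_a, text_b, url) candidate paraphrase pairs.
    """
    by_url = defaultdict(list)
    for text, url in tweets:
        # Skip verbatim duplicates (for example retweets) under the same URL.
        if text not in by_url[url]:
            by_url[url].append(text)

    pairs = []
    for url, texts in by_url.items():
        # Every unordered pair of distinct tweets sharing a URL is a candidate.
        pairs.extend((a, b, url) for a, b in combinations(texts, 2))
    return pairs


sample = [
    ("Storm closes the bridge", "http://example.com/1"),
    ("Bridge shut down because of storm", "http://example.com/1"),
    ("New phone released today", "http://example.com/2"),
    ("Storm closes the bridge", "http://example.com/1"),
]
pairs = candidate_pairs(sample)
```

Here the third tweet has no partner under its URL and the fourth is a verbatim duplicate, so only one candidate pair is produced.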
Information fusion for automated question answering
Until recently, research efforts in automated Question Answering (QA) have mainly
focused on getting a good understanding of questions to retrieve correct answers. This
includes deep parsing, lookups in ontologies, question typing and machine learning
of answer patterns appropriate to question forms. In contrast, I have focused on the
analysis of the relationships between answer candidates as provided in open domain
QA on multiple documents. I argue that such candidates have intrinsic properties,
partly regardless of the question, and those properties can be exploited to provide better
quality and more user-oriented answers in QA.
Information fusion refers to the technique of merging pieces of information from
different sources. In QA over free text, it is motivated by the frequency with which
different answer candidates are found in different locations, leading to a multiplicity
of answers. The reason for such multiplicity is, in part, the massive amount of data
used for answering, and also its unstructured and heterogeneous content: besides
ambiguities in user questions leading to heterogeneity in extractions, systems have to deal
with redundancy, granularity and possibly contradictory information. Hence the need
for answer candidate comparison. While frequency has proved to be a significant
characteristic of a correct answer, I evaluate the value of other relationships characterizing
answer variability and redundancy.
Partially inspired by recent developments in multi-document summarization, I
redefine the concept of "answer" within an engineering approach to QA based on the
Model-View-Controller (MVC) pattern of user interface design. An "answer model"
is a directed graph in which nodes correspond to entities projected from extractions
and edges convey relationships between such nodes. The graph represents the fusion
of information contained in the set of extractions. Different views of the answer model
can be produced, capturing the fact that the same answer can be expressed and
presented in various ways: picture, video, sound, written or spoken language, or a formal
data structure. Within this framework, an answer is a structured object contained in the
model and retrieved by a strategy to build a particular view depending on the end user
(or task)'s requirements.
I describe shallow techniques to compare entities and enrich the model by discovering four broad categories of relationships between entities in the model: equivalence,
inclusion, aggregation and alternative. Quantitatively, answer candidate modeling
improves answer extraction accuracy. It also proves to be more robust to incorrect answer
candidates than traditional techniques. Qualitatively, models provide meta-information
encoded by relationships that allow shallow reasoning to help organize and generate
the final output
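The "answer model" described above, a directed graph whose nodes are entities and whose edges carry one of four relationship categories, might be represented along these lines. This is a minimal sketch under stated assumptions: the class name `AnswerModel`, its methods, the edge direction, and the `text_view` function are all hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass, field

# The four relationship categories named in the abstract.
RELATIONS = {"equivalence", "inclusion", "aggregation", "alternative"}


@dataclass
class AnswerModel:
    """Directed graph: nodes are entities, edges are labelled relationships."""
    nodes: set = field(default_factory=set)
    edges: list = field(default_factory=list)  # (source, relation, target) triples

    def relate(self, source, relation, target):
        """Add a labelled directed edge, registering both endpoints as nodes."""
        if relation not in RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        self.nodes.update((source, target))
        self.edges.append((source, relation, target))

    def related(self, entity, relation):
        """Return the targets reachable from `entity` via edges labelled `relation`."""
        return [t for s, r, t in self.edges if s == entity and r == relation]


def text_view(model):
    """One possible 'view' of the model: a plain-text rendering of its edges."""
    return [f"{s} --{r}--> {t}" for s, r, t in model.edges]


model = AnswerModel()
model.relate("New York City", "equivalence", "NYC")
model.relate("New York City", "inclusion", "Manhattan")
```

Keeping the view as a separate function mirrors the Model-View-Controller split mentioned in the abstract: other views (a ranked list, a spoken answer) could be built from the same model without changing it.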
A Discriminative Analysis of Fine-Grained Semantic Relations including Presupposition: Annotation and Classification
In contrast to classical lexical semantic relations between verbs, such as antonymy, synonymy or hypernymy, presupposition is a lexically triggered semantic relation that is not well covered in existing lexical resources. It is also understudied in the field of corpus-based methods of learning semantic relations. Yet, presupposition is very important for semantic and discourse analysis tasks, given the implicit information that it conveys. In this paper we present a corpus-based method for acquiring presupposition-triggering verbs along with verbal relata that express their presupposed meaning. We approach this difficult task using a discriminative classification method that jointly determines and distinguishes a broader set of inferential semantic relations between verbs.
The present paper focuses on important methodological aspects of our work: (i) a discriminative analysis of the semantic properties of the chosen set of relations, (ii) the selection of features for corpus-based classification and (iii) design decisions for the manual annotation of fine-grained semantic relations between verbs. (iv) We present the results of a practical annotation effort leading to a gold standard resource for our relation inventory, and (v) we report results for automatic classification of our target set of fine-grained semantic relations, including presupposition. We achieve a classification performance of 55% F1-score, a 100% improvement over a best-feature baseline
Using Tree Kernels for Classifying Temporal Relations between Events
PACLIC 23 / City University of Hong Kong / 3-5 December 200
Necessitarianism and Dispositions
In this paper, I argue in favor of necessitarianism, the view that dispositions, when stimulated, necessitate their manifestations. After introducing and clarifying what necessitarianism does and does not amount to, I provide reasons to support the view that dispositions once stimulated necessitate their manifestations according to the stimulating conditions and the relevant properties at stake. In this framework, I will propose a principle of causal relevance and some conditions for the possibility of interference that allow us to avoid the use of ceteris paribus clauses. I then defend necessitarianism from recent attacks raised by, among others, Mumford and Anjum, noting that the antecedent strengthening test is a test for causal relevance that raises no difficulties for necessitarianism
A Survey of Paraphrasing and Textual Entailment Methods
Paraphrasing methods recognize, generate, or extract phrases, sentences, or
longer natural language expressions that convey almost the same information.
Textual entailment methods, on the other hand, recognize, generate, or extract
pairs of natural language expressions, such that a human who reads (and trusts)
the first element of a pair would most likely infer that the other element is
also true. Paraphrasing can be seen as bidirectional textual entailment and
methods from the two areas are often similar. Both kinds of methods are useful,
at least in principle, in a wide range of natural language processing
applications, including question answering, summarization, text generation, and
machine translation. We summarize key ideas from the two areas by considering
in turn recognition, generation, and extraction methods, also pointing to
prominent articles and resources.
Comment: Technical Report, Natural Language Processing Group, Department of
Informatics, Athens University of Economics and Business, Greece, 201