Weakly-supervised appraisal analysis
This article is concerned with the computational treatment of Appraisal, a Systemic Functional Linguistic theory of the types of language employed to communicate opinion in English. The theory considers aspects such as Attitude (how writers communicate their point of view), Engagement (how writers align themselves with respect to the opinions of others) and Graduation (how writers amplify or diminish their attitudes and engagements). To analyse text according to the theory we employ a weakly-supervised approach to text classification, which involves comparing the similarity of words with prototypical examples of classes. We evaluate the method's performance using a collection of book reviews annotated according to the Appraisal theory.
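The weakly-supervised approach described above can be illustrated with a minimal sketch (my own construction, not the authors' implementation): each Appraisal class is seeded with a few prototype words, and a new word is assigned the class whose prototypes it most resembles under cosine similarity. The toy vectors and prototype lists here are hypothetical stand-ins for real word embeddings.

```python
import numpy as np

# Hypothetical toy embeddings; a real system would use corpus-trained vectors.
EMB = {
    "good":      np.array([0.9, 0.1, 0.0]),
    "excellent": np.array([0.8, 0.2, 0.1]),
    "perhaps":   np.array([0.1, 0.9, 0.1]),
    "maybe":     np.array([0.2, 0.8, 0.0]),
    "great":     np.array([0.85, 0.15, 0.05]),
}

# Prototypical seed words per class (illustrative only).
PROTOTYPES = {
    "Attitude":   ["good", "excellent"],
    "Engagement": ["perhaps", "maybe"],
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def classify(word):
    """Assign the class whose prototype words are most similar on average."""
    v = EMB[word]
    scores = {
        label: np.mean([cosine(v, EMB[p]) for p in protos])
        for label, protos in PROTOTYPES.items()
    }
    return max(scores, key=scores.get)
```

The appeal of this setup is that it needs only a handful of seed words per class rather than a fully annotated training corpus, which is what makes the supervision "weak".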
Firearms and Tigers are Dangerous, Kitchen Knives and Zebras are Not: Testing whether Word Embeddings Can Tell
This paper presents an approach for investigating the nature of semantic information captured by word embeddings. We propose a method that extends an existing human-elicited semantic property dataset with gold negative examples using crowd judgments. Our experimental approach tests the ability of supervised classifiers to identify semantic features in word embedding vectors and compares this to a feature-identification method based on full vector cosine similarity. The idea behind this method is that properties identified by classifiers, but not through full vector comparison, are captured by embeddings; properties that cannot be identified by either method are not. Our results provide an initial indication that semantic properties relevant for the way entities interact (e.g. dangerous) are captured, while perceptual information (e.g. colors) is not represented. We conclude that, though preliminary, these results show that our method is suitable for identifying which properties are captured by embeddings.
Comment: Accepted to the EMNLP workshop "Analyzing and interpreting neural networks for NLP"
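The two feature-identification methods the abstract contrasts can be sketched roughly as follows (assumptions and toy data are mine, not the paper's code): a supervised linear classifier is trained to detect a property such as "dangerous" from embedding dimensions, while the baseline compares each word's full vector against the centroid of known positive examples by cosine similarity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy embeddings: dimension 0 encodes the property (e.g. "dangerous");
# the remaining dimensions are random noise.
def make_word(has_property):
    v = rng.normal(size=8)
    v[0] = 2.0 if has_property else -2.0
    return v

pos = [make_word(True) for _ in range(20)]   # e.g. firearm, tiger
neg = [make_word(False) for _ in range(20)]  # gold negatives, e.g. knife, zebra
X = np.vstack(pos + neg)
y = np.array([1] * 20 + [0] * 20)

# Method 1: a supervised linear classifier (simple perceptron) on the vectors.
w = np.zeros(8)
for _ in range(50):
    for xi, yi in zip(X, y):
        pred = int(w @ xi > 0)
        w += (yi - pred) * xi
clf_acc = np.mean([int(w @ xi > 0) == yi for xi, yi in zip(X, y)])

# Method 2: full-vector cosine similarity to the positive centroid.
centroid = np.mean(pos, axis=0)
def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
cos_acc = np.mean([int(cos(xi, centroid) > 0) == yi for xi in X for yi in [y[list(range(len(X))).index(0)]]][:0] or
                  [int(cos(xi, centroid) > 0) == yi for xi, yi in zip(X, y)])
```

Under the paper's logic, a property the classifier finds but the cosine method misses counts as captured (but distributed) in the embeddings, while a property neither method finds counts as absent.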
Designing Semantic Kernels as Implicit Superconcept Expansions
Recently, there has been an increased interest in the exploitation of background knowledge in the context of text mining tasks, especially text classification. At the same time, kernel-based learning algorithms like Support Vector Machines have become a dominant paradigm in the text mining community. Amongst other reasons, this is also due to their capability to achieve more accurate learning results by replacing the standard linear kernel (bag-of-words) with customized kernel functions which incorporate additional a priori knowledge. In this paper we propose a new approach to the design of "semantic smoothing kernels" by means of an implicit superconcept expansion using well-known measures of term similarity. The experimental evaluation on two different datasets indicates that our approach consistently improves performance in situations where (i) training data is scarce or (ii) the bag-of-words representation is too sparse to build stable models when using the linear kernel.
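A semantic smoothing kernel of this kind can be sketched schematically (this is my illustration, not the paper's code): the linear bag-of-words kernel x·z is replaced by (xS)·(zS), where S maps terms to hypothetical superconcepts, so two documents sharing no surface terms can still be similar through a shared superconcept.

```python
import numpy as np

# Term inventory and a hypothetical term-to-superconcept similarity matrix
# (rows: terms; columns: superconcepts, e.g. VEHICLE, FRUIT).
terms = ["car", "automobile", "banana"]
S = np.array([
    [1.0, 0.0],   # car        -> VEHICLE
    [0.9, 0.0],   # automobile -> VEHICLE
    [0.0, 1.0],   # banana     -> FRUIT
])

def semantic_kernel(x, z, S):
    """k(x, z) = (x S) . (z S): implicit expansion into superconcepts."""
    return float((x @ S) @ (z @ S))

d1 = np.array([1.0, 0.0, 0.0])  # document mentioning only "car"
d2 = np.array([0.0, 1.0, 0.0])  # document mentioning only "automobile"

linear = float(d1 @ d2)                 # no shared terms
smoothed = semantic_kernel(d1, d2, S)   # similarity via shared VEHICLE concept
```

Because k(x, z) = (xS)·(zS) is an inner product in the expanded space, it remains a valid (positive semi-definite) kernel and can be plugged directly into an SVM.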
The distractor frequency effect in picture–word interference: evidence for response exclusion
In 3 experiments, subjects named pictures with low- or high-frequency superimposed distractor words. In a first experiment, we replicated the finding that low-frequency words induce more interference in picture naming than high-frequency words (i.e., the distractor frequency effect; Miozzo & Caramazza, 2003). According to the response exclusion hypothesis, this effect has its origin at a postlexical stage and is related to a response buffer. The account predicts that the distractor frequency effect should only be present when a response to the word enters the response buffer. This was tested by masking the distractor (Experiment 2) and by presenting it at various time points before stimulus onset (Experiment 3). Results supported the hypothesis by showing that the effect was only present when distractors were visible, and if they were presented in close proximity to the target picture. These results have implications for models of lexical access and for the tasks that can be used to study this process.
Using distributional similarity to organise biomedical terminology
We investigate an application of distributional similarity techniques to the problem of structural organisation of biomedical terminology. Our application domain is the relatively small GENIA corpus. Using terms that have been accurately marked-up by hand within the corpus, we consider the problem of automatically determining semantic proximity. Terminological units are defined for our purposes as normalised classes of individual terms. Syntactic analysis of the corpus data is carried out using the Pro3Gres parser and provides the data required to calculate distributional similarity using a variety of different measures. Evaluation is performed against a hand-crafted gold standard for this domain in the form of the GENIA ontology. We show that distributional similarity can be used to predict semantic type with a good degree of accuracy.
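Predicting semantic type from distributional similarity can be illustrated with a minimal sketch (the terms, context counts, and type labels below are hypothetical, not taken from the paper): each term is represented by counts of the syntactic contexts it occurs in, as a dependency parser such as Pro3Gres would provide, and an unlabeled term receives the type of its most distributionally similar labeled term.

```python
from collections import Counter

# Hypothetical dependency-context counts per term (feature: relation:word).
CONTEXTS = {
    "IL-2":      Counter({"obj:activate": 5, "mod:gene": 3}),
    "NF-kappaB": Counter({"obj:activate": 4, "mod:factor": 2}),
    "T-cell":    Counter({"subj:express": 6, "mod:line": 4}),
}
# Known GENIA-style types for some terms (illustrative labels).
TYPES = {"NF-kappaB": "protein", "T-cell": "cell_type"}

def cosine(c1, c2):
    """Cosine similarity between two sparse context-count vectors."""
    shared = set(c1) & set(c2)
    num = sum(c1[f] * c2[f] for f in shared)
    den = (sum(v * v for v in c1.values()) ** 0.5 *
           sum(v * v for v in c2.values()) ** 0.5)
    return num / den if den else 0.0

def predict_type(term):
    """Nearest labeled neighbour under distributional cosine similarity."""
    best = max(TYPES, key=lambda t: cosine(CONTEXTS[term], CONTEXTS[t]))
    return TYPES[best]
```

Cosine is only one of the similarity measures the abstract mentions; Jaccard or Lin-style measures slot into `predict_type` the same way.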