The role of HG in the analysis of temporal iteration and interaural correlation
The emotional recall task: juxtaposing recall and recognition-based affect scales
Existing affect scales typically involve recognition of emotions from a predetermined emotion checklist. However, a recognition-based checklist may fail to capture sufficient breadth and specificity of an individual’s recalled emotional experiences and may therefore miss emotions that frequently come to mind. More generally, how do recalled emotions differ from recognized emotions? To address these issues, we present and evaluate an affect scale based on recalled emotions. Participants are asked to produce 10 words that best describe their emotions over the past month and then to rate each emotion for how often it was experienced. We show that the average weighted valence of the words produced in this task, the Emotional Recall Task (ERT), is strongly correlated with scales related to general affect, such as the PANAS, Ryff’s Scales of Psychological Well-being, the Satisfaction with Life Scale, the Depression Anxiety and Stress Scales, and a few other related scales. We further show that the Emotional Recall Task captures a breadth and specificity of emotions not available in other scales but that are nonetheless commonly reported as experienced emotions. We test a general version of the ERT (the ERT general) that is language neutral and can be used across cultures. Finally, we show that the ERT is valid in a test-retest paradigm. In sum, the ERT measures affect based on emotion terms relevant to an individual’s idiosyncratic experience. It is consistent with recognition-based scales, but also offers a new direction towards enriching our understanding of individual differences in recalled and recognized emotions.
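As a rough illustration of the scoring idea in this abstract, the sketch below computes a frequency-weighted average valence over one participant's recalled words. The valence lookup, the rating scale, the example values, and the function name `weighted_valence` are all assumptions made for illustration; the abstract does not specify which valence norms or which exact weighting the authors use.

```python
from typing import Mapping, Sequence, Tuple


def weighted_valence(
    responses: Sequence[Tuple[str, float]],
    valence_norms: Mapping[str, float],
) -> float:
    """Frequency-weighted mean valence of recalled emotion words.

    responses: (word, frequency_rating) pairs from one participant.
    valence_norms: hypothetical lookup from a word to a valence score.
    Words missing from the lookup are skipped.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for word, frequency in responses:
        if word not in valence_norms:
            continue
        weighted_sum += valence_norms[word] * frequency
        total_weight += frequency
    if total_weight == 0:
        raise ValueError("no scorable words in the responses")
    return weighted_sum / total_weight


# Illustrative values only; they are not taken from the paper.
norms = {"happy": 8.0, "anxious": 2.0, "calm": 7.0}
answers = [("happy", 4), ("anxious", 2), ("calm", 2)]
score = weighted_valence(answers, norms)
print(score)
```

Weighting by the frequency rating means that an emotion experienced often contributes more to the score than one mentioned but rarely felt.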
Discriminative Topic Mining via Category-Name Guided Text Embedding
Mining a set of meaningful and distinctive topics automatically from massive
text corpora has broad applications. Existing topic models, however, typically
work in a purely unsupervised way and often generate topics that do not fit
users' particular needs and yield suboptimal performance on downstream tasks.
We propose a new task, discriminative topic mining, which leverages a set of
user-provided category names to mine discriminative topics from text corpora.
This new task not only helps a user understand clearly and distinctively the
topics he/she is most interested in, but also directly benefits keyword-driven
classification tasks. We develop CatE, a novel category-name guided text
embedding method for discriminative topic mining, which effectively leverages
minimal user guidance to learn a discriminative embedding space and discover
category representative terms in an iterative manner. We conduct a
comprehensive set of experiments to show that CatE mines a high-quality set of
topics guided by category names only, and benefits a variety of downstream
applications including weakly-supervised classification and lexical entailment
direction identification.
Comment: WWW 2020. (Code: https://github.com/yumeng5/CatE)
From Frequency to Meaning: Vector Space Models of Semantics
Computers understand very little of the meaning of human language. This
profoundly limits our ability to give instructions to computers, the ability of
computers to explain their actions to us, and the ability of computers to
analyse and process text. Vector space models (VSMs) of semantics are beginning
to address these limits. This paper surveys the use of VSMs for semantic
processing of text. We organize the literature on VSMs according to the
structure of the matrix in a VSM. There are currently three broad classes of
VSMs, based on term-document, word-context, and pair-pattern matrices, yielding
three classes of applications. We survey a broad range of applications in these
three categories and we take a detailed look at a specific open source project
in each category. Our goal in this survey is to show the breadth of
applications of VSMs for semantics, to provide a new perspective on VSMs for
those who are already familiar with the area, and to provide pointers into the
literature for those who are less familiar with the field.
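To make the term-document class of VSMs mentioned in this abstract concrete, here is a minimal sketch that builds a raw-count term-document matrix and compares documents by the cosine of their column vectors. The whitespace tokenizer, the toy documents, and the helper names are assumptions for illustration; the survey itself covers weighting and smoothing schemes that are omitted here.

```python
import math
from collections import Counter
from typing import Dict, List


def term_document_matrix(docs: List[str]) -> Dict[str, List[int]]:
    """Map each term to its row of raw counts, one column per document."""
    counts = [Counter(doc.lower().split()) for doc in docs]
    vocab = sorted(set().union(*counts))
    return {term: [c[term] for c in counts] for term in vocab}


def cosine(u: List[int], v: List[int]) -> float:
    """Cosine of the angle between two count vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def column(matrix: Dict[str, List[int]], j: int) -> List[int]:
    """Extract the vector for document j from the term-document matrix."""
    return [row[j] for row in matrix.values()]


# Toy corpus, for illustration only.
docs = ["cats chase mice", "dogs chase cats", "stocks fell sharply"]
matrix = term_document_matrix(docs)
sim_01 = cosine(column(matrix, 0), column(matrix, 1))  # documents share two terms
sim_02 = cosine(column(matrix, 0), column(matrix, 2))  # documents share no terms
print(sim_01, sim_02)
```

Documents that share vocabulary get a higher cosine score, which is the basic intuition behind using term-document matrices for retrieval.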
Empirical Methodology for Crowdsourcing Ground Truth
The process of gathering ground truth data through human annotation is a
major bottleneck in the use of information extraction methods for populating
the Semantic Web. Crowdsourcing-based approaches are gaining popularity in the
attempt to solve the issues related to volume of data and lack of annotators.
Typically these practices use inter-annotator agreement as a measure of
quality. However, in many domains, such as event detection, there is ambiguity
in the data, as well as a multitude of perspectives of the information
examples. We present an empirically derived methodology for efficiently
gathering ground truth data in a diverse set of use cases covering a variety
of domains and annotation tasks. Central to our approach is the use of
CrowdTruth metrics that capture inter-annotator disagreement. We show that
measuring disagreement is essential for acquiring a high quality ground truth.
We achieve this by comparing the quality of the data aggregated with CrowdTruth
metrics with majority vote, over a set of diverse crowdsourcing tasks: Medical
Relation Extraction, Twitter Event Identification, News Event Extraction and
Sound Interpretation. We also show that an increased number of crowd workers
leads to growth and stabilization in the quality of annotations, going against
the usual practice of employing a small number of annotators.
Comment: in publication at the Semantic Web Journal
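The contrast this abstract draws between majority vote and disagreement-aware aggregation can be sketched as follows. This is not an implementation of the CrowdTruth metrics, which the abstract does not define; it is a simplified stand-in that keeps the full distribution of worker labels for each unit instead of collapsing it to a single winner. The unit names, labels, and function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def majority_vote(labels: List[str]) -> str:
    """Return the most frequent label (ties go to the label seen first)."""
    return Counter(labels).most_common(1)[0][0]


def label_support(labels: List[str]) -> Dict[str, float]:
    """Fraction of workers who chose each label, preserving disagreement."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


# Hypothetical annotations from five crowd workers per unit.
annotations = {
    "clear_unit": ["event", "event", "event", "event", "event"],
    "ambiguous_unit": ["event", "event", "event", "none", "none"],
}

for unit, labels in annotations.items():
    print(unit, majority_vote(labels), label_support(labels))
```

Both units receive the same majority label, but the support scores show that the second one is contested, which is the kind of signal majority vote discards.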
Embedding Web-based Statistical Translation Models in Cross-Language Information Retrieval
Although more and more language pairs are covered by machine translation
services, there are still many pairs that lack translation resources.
Cross-language information retrieval (CLIR) is an application which needs
translation functionality of a relatively low level of sophistication since
current models for information retrieval (IR) are still based on a
bag-of-words. The Web provides a vast resource for the automatic construction
of parallel corpora which can be used to train statistical translation models
automatically. The resulting translation models can be embedded in several ways
in a retrieval model. In this paper, we will investigate the problem of
automatically mining parallel texts from the Web and different ways of
integrating the translation models within the retrieval process. Our
experiments on standard test collections for CLIR show that the Web-based
translation models can surpass commercial MT systems in CLIR tasks. These
results open the perspective of constructing a fully automatic query
translation device for CLIR at a very low cost.
Comment: 37 pages