SemAxis: A Lightweight Framework to Characterize Domain-Specific Word Semantics Beyond Sentiment
Because word semantics can substantially change across communities and
contexts, capturing domain-specific word semantics is an important challenge.
Here, we propose SEMAXIS, a simple yet powerful framework to characterize word
semantics using many semantic axes in word-vector spaces beyond sentiment. We
demonstrate that SEMAXIS can capture nuanced semantic representations in
multiple online communities. We also show that, when the sentiment axis is
examined, SEMAXIS outperforms the state-of-the-art approaches in building
domain-specific sentiment lexicons.
Comment: Accepted in ACL 2018 as a full paper
Machine learning with limited label availability: algorithms and applications
The abstract is in the attachment.
HSAS: Hindi Subjectivity Analysis System
With the development of Web 2.0, there is an abundance of documents expressing users' opinions, attitudes, and sentiments in textual form. This user-generated content is an important source of information that organizations and governments can use to make sound decisions. Textual information can be categorized into two types: facts and opinions. Subjectivity analysis is the automatic extraction of subjective information from opinions posted by users; it divides content into subjective and objective sentences. Most work on subjectivity analysis targets English-language data, but with the introduction of the Unicode standard (UTF-8), Hindi-language content on the web is growing rapidly. In this paper, the Hindi Subjectivity Analysis System (HSAS) is proposed. It explores two methods of generating a subjectivity lexicon using resources available in English, and compares them on the task of sentence-level subjectivity analysis. The first method uses the English-language OpinionFinder subjectivity lexicon. The second method starts from a small seed list of Hindi words and expands it to generate a subjectivity lexicon. Different evaluation strategies are used to validate the lexicon. We achieved 71.4% agreement with human annotators and ~80% classification accuracy on a parallel English-Hindi data set. Extensive simulations conducted on the test dataset confirm the validity of the suggested method.
Zero-Shot Rumor Detection with Propagation Structure via Prompt Learning
The spread of rumors along with breaking events seriously hinders the truth
in the era of social media. Previous studies reveal that due to the lack of
annotated resources, rumors presented in minority languages are hard to
detect. Furthermore, unforeseen breaking events not covered in
yesterday's news exacerbate the scarcity of data resources. In this work, we
propose a novel zero-shot framework based on prompt learning to detect rumors
falling in different domains or presented in different languages. More
specifically, we first represent rumors circulated on social media as diverse
propagation threads, then design a hierarchical prompt encoding mechanism to
learn language-agnostic contextual representations for both prompts and rumor
data. To further enhance domain adaptation, we model the domain-invariant
structural features from the propagation threads, to incorporate structural
position representations of influential community response. In addition, a new
virtual response augmentation method is used to improve model training.
Extensive experiments conducted on three real-world datasets demonstrate that
our proposed model achieves much better performance than state-of-the-art
methods and exhibits a superior capacity for detecting rumors at early stages.
Comment: AAAI 202