Search CORE

2,375 research outputs found

Sentiment Classification in Resource-Scarce Languages by using Label Propagation

Author: Kaji Nobuhiro
Kitsuregawa Masaru
Ren Yong
Toyoda Masashi
Yoshinaga Naoki
Publication venue: Institute of Digital Enhancement of Cognitive Processing, Waseda University
Publication date: 01/01/2011
Field of study

SemAxis: A Lightweight Framework to Characterize Domain-Specific Word Semantics Beyond Sentiment

Author: Ahn Yong-Yeol
An Jisun
Kwak Haewoon
Publication venue
Publication date: 01/01/2018
Field of study

Because word semantics can substantially change across communities and contexts, capturing domain-specific word semantics is an important challenge. Here, we propose SEMAXIS, a simple yet powerful framework to characterize word semantics using many semantic axes in word- vector spaces beyond sentiment. We demonstrate that SEMAXIS can capture nuanced semantic representations in multiple online communities. We also show that, when the sentiment axis is examined, SEMAXIS outperforms the state-of-the-art approaches in building domain-specific sentiment lexicons.Comment: Accepted in ACL 2018 as a full pape

arXiv.org e-Print Archive

Crossref

Machine learning with limited label availability: algorithms and applications

Author: GIOBERGIA FLAVIO
Publication venue: country:Italy
Publication date: 15/02/2023
Field of study

L'abstract è presente nell'allegato / the abstract is in the attachmen

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

HSAS: Hindi Subjectivity Analysis System

Author: Deepa Shenoy P.
Manjunath N.
Vandana Jha .
Venugopal K.R.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015
Field of study

With the development of Web 2.0, we are abundant with the documents expressing user's opinions, attitudes and sentiments in the textual form. This user generated textual content is an important source of information to make sound decisions by the organizations and the government. The textual information can be categorized into two types: facts and opinions. Subjectivity analysis is the automatic extraction of subjective information from the opinions posted by users and divides the content into subjective and objective sentences. Most of the works in subjectivity analysis exists for English language data but with the introduction of unicode standards UTF-8, Hindi language content on the web is growing very rapidly. In this paper, Hindi Subjectivity Analysis System (HSAS) is proposed. It explores two different methods of generating subjectivity lexicon using the available resources in English language and their comparative evaluation in performing the task of subjectivity analysis at the sentence level. The first method uses English language OpinionFinder subjectivity lexicon. The second method uses a small seed word list of Hindi language and expands it to generate subjectivity lexicon. Different evaluation strategies are used to validate the lexicon. We achieved 71.4% agreement with human annotators and ~80% accuracy in classification on a parallel data set in English and Hindi. Extensive simulations conducted on the test dataset confirm the validity of the suggested method

ePrints@Bangalore University

Zero-Shot Rumor Detection with Propagation Structure via Prompt Learning

Author: Jiang Haiyun
Lin Hongzhan
Liu Ruifang
Luo Ziyang
Ma Jing
Shi Shuming
Yi Pengyao
Publication venue
Publication date: 29/03/2023
Field of study

The spread of rumors along with breaking events seriously hinders the truth in the era of social media. Previous studies reveal that due to the lack of annotated resources, rumors presented in minority languages are hard to be detected. Furthermore, the unforeseen breaking events not involved in yesterday's news exacerbate the scarcity of data resources. In this work, we propose a novel zero-shot framework based on prompt learning to detect rumors falling in different domains or presented in different languages. More specifically, we firstly represent rumor circulated on social media as diverse propagation threads, then design a hierarchical prompt encoding mechanism to learn language-agnostic contextual representations for both prompts and rumor data. To further enhance domain adaptation, we model the domain-invariant structural features from the propagation threads, to incorporate structural position representations of influential community response. In addition, a new virtual response augmentation method is used to improve model training. Extensive experiments conducted on three real-world datasets demonstrate that our proposed model achieves much better performance than state-of-the-art methods and exhibits a superior capacity for detecting rumors at early stages.Comment: AAAI 202

arXiv.org e-Print Archive