155 research outputs found

Personalized sentiment classification based on latent individuality of microblog users

Author: FENG Shi
GAO Wei
SONG Kaisong
WANG Daling
WONG Kam-Fai
YU Ge
Publication venue: AAAI Press
Publication date: 01/01/2015

Sentiment expression in microblog posts often reflects user’s specific individuality due to different language habit, personal character, opinion bias and so on. Existing sentiment classification algorithms largely ignore such latent personal distinctions among different microblog users. Meanwhile, sentiment data of microblogs are sparse for individual users, making it infeasible to learn effective personalized classifier. In this paper, we propose a novel, extensible personalized sentiment classification method based on a variant of latent factor model to capture personal sentiment variations by mapping users and posts into a low-dimensional factor space. We alleviate the sparsity of personal texts by decomposing the posts into words which are further represented by the weighted sentiment and topic units based on a set of syntactic units of words obtained from dependency parsing results. To strengthen the representation of users, we leverage users following relation to consolidate the individuality of a user fused from other users with similar interests. Results on real-world microblog datasets confirm that our method outperforms state-of-the-art baseline algorithms with large margins.

CiteSeerX

Institutional Knowledge at Singapore Management University

On the “Easy” Task of Evaluating Chinese Irony Detection

Author: Chersoni Emmanuele
Huang Chu-Ren
Li An-Ran
Lu Qin
Xiang Rong
Publication venue: Waseda Institute for the Study of Language and Information
Publication date: 01/01/2019

Waseda University Repository

Text segmentation techniques: A critical review

Author: Pak Irina
Teh Phoey Lee
Publication venue: Springer Science and Business Media LLC
Publication date: 22/11/2017

Text segmentation is widely used for processing text. It is a method of splitting a document into smaller parts, which are usually called segments. Each segment has its relevant meaning. Those segments are categorized as word, sentence, topic, phrase or any information unit depending on the task of the text analysis. This study presents various reasons of usage of text segmentation for different analyzing approaches. We categorized the types of documents and languages used. The main contribution of this study includes a summarization of 50 research papers and an illustration of the past decade (January 2007 to January 2017) of research that applied text segmentation as their main approach for analysing text. Results revealed the popularity of using text segmentation in different languages. Besides that, the “word” seems to be the most practical and usable segment, as it is a smaller unit than the phrase, sentence or line.

Crossref

Sunway Institutional Repository

Featuring, Detecting, and Visualizing Human Sentiment in Chinese Micro-Blog

Author: Chen Liming (Luke)
Guo Bin
Li Wenjie
Wang Zhitao
Yu Zhiwen
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2016

The Hong Kong Polytechnic University Pao Yue-kong Library

Crossref

PolyU Institutional Repository

De Montfort University Open Research Archive

Ulster University's Research Portal

Automated Social Text Annotation With Joint Multilabel Attention Networks

Author: Coenen Frans
Dong Hang
Huang Kaizhu
Wang Wei
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2020

Automated social text annotation is the task of suggesting a set of tags for shared documents on social media platforms. The automated annotation process can reduce users' cognitive overhead in tagging and improve tag management for better search, browsing, and recommendation of documents. It can be formulated as a multilabel classification problem. We propose a novel deep learning-based method for this problem and design an attention-based neural network with semantic-based regularization, which can mimic users' reading and annotation behavior to formulate better document representation, leveraging the semantic relations among labels. The network separately models the title and the content of each document and injects an explicit, title-guided attention mechanism into each sentence. To exploit the correlation among labels, we propose two semantic-based loss regularizers, i.e., similarity and subsumption, which enforce the output of the network to conform to label semantics. The model with the semantic-based loss regularizers is referred to as the joint multilabel attention network (JMAN). We conducted a comprehensive evaluation study and compared JMAN to the state-of-the-art baseline models, using four large, real-world social media data sets. In terms of F1, JMAN significantly outperformed bidirectional gated recurrent unit (Bi-GRU) relatively by around 12.8%-78.6% and the hierarchical attention network (HAN) by around 3.9%-23.8%. The JMAN model demonstrates advantages in convergence and training speed. Further improvement of performance was observed against latent Dirichlet allocation (LDA) and support vector machine (SVM). When applying the semantic-based loss regularizers, the performance of HAN and Bi-GRU in terms of F1 was also boosted. It is also found that dynamic update of the label semantic matrices (JMAN_d) has the potential to further improve the performance of JMAN but at the cost of substantial memory and warrants further study.

University of Liverpool Repository

Crossref

Edinburgh Research Explorer

Oxford University Research Archive

Blog Style Classification: Refining Affective Blogs

Author: Bielikova Maria
Simko Marian
Virik Martin
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 07/02/2017

In the constantly growing blogosphere with no restrictions on form or topic, a number of writing styles and genres have emerged. Recognition and classification of these styles has become significant for information processing with an aim to improve blog search or sentiment mining. One of the main issues in this field is detection of informative and affective articles. However, such differentiation does not suffice today. In this paper we extend the differentiation and suggest a fine-grained set of subcategories for affective articles. We propose and evaluate a classification method employing novel lexical, morphological, lightweight syntactic and structural features of written text. The results show that our method outperforms the existing approaches.

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)