CASCADE: Contextual Sarcasm Detection in Online Discussion Forums

Author: Cambria Erik
Gorantla Sruthi
Hazarika Devamanyu
Mihalcea Rada
Poria Soujanya
Zimmermann Roger
Publication date: 16/05/2018

The literature in automated sarcasm detection has mainly focused on lexical, syntactic and semantic-level analysis of text. However, a sarcastic sentence can be expressed with contextual presumptions, background and commonsense knowledge. In this paper, we propose CASCADE (a ContextuAl SarCasm DEtector) that adopts a hybrid approach of both content and context-driven modeling for sarcasm detection in online social media discussions. For the latter, CASCADE aims at extracting contextual information from the discourse of a discussion thread. Also, since the sarcastic nature and form of expression can vary from person to person, CASCADE utilizes user embeddings that encode stylometric and personality features of the users. When used along with content-based feature extractors such as Convolutional Neural Networks (CNNs), we see a significant boost in the classification performance on a large Reddit corpus. Comment: Accepted in COLING 201

arXiv.org e-Print Archive

Beneath the Tip of the Iceberg: Current Challenges and New Directions in Sentiment Analysis Research

Author: Hazarika Devamanyu
Majumder Navonil
Mihalcea Rada
Poria Soujanya
Publication date: 16/11/2020

Sentiment analysis as a field has come a long way since it was first introduced as a task nearly 20 years ago. It has widespread commercial applications in various domains like marketing, risk management, market research, and politics, to name a few. Given its saturation in specific subtasks -- such as sentiment polarity classification -- and datasets, there is an underlying perception that this field has reached its maturity. In this article, we discuss this perception by pointing out the shortcomings and under-explored, yet key aspects of this field that are necessary to attain true sentiment understanding. We analyze the significant leaps responsible for its current relevance. Further, we attempt to chart a possible course for this field that covers many overlooked and unanswered questions. Comment: Published in the IEEE Transactions on Affective Computing (TAFFC

arXiv.org e-Print Archive

Reasoning with Sarcasm by Reading In-between

Author: Hui Siu Cheung
Su Jian
Tay Yi
Tuan Luu Anh
Publication date: 08/05/2018

Sarcasm is a sophisticated speech act which commonly manifests on social communities such as Twitter and Reddit. The prevalence of sarcasm on the social web is highly disruptive to opinion mining systems due to not only its tendency of polarity flipping but also usage of figurative language. Sarcasm commonly manifests with a contrastive theme either between positive-negative sentiments or between literal-figurative scenarios. In this paper, we revisit the notion of modeling contrast in order to reason with sarcasm. More specifically, we propose an attention-based neural model that looks in-between instead of across, enabling it to explicitly model contrast and incongruity. We conduct extensive experiments on six benchmark datasets from Twitter, Reddit and the Internet Argument Corpus. Our proposed model not only achieves state-of-the-art performance on all datasets but also enjoys improved interpretability. Comment: Accepted to ACL201

arXiv.org e-Print Archive

Catering to Your Concerns: Automatic Generation of Personalised Security-Centric Descriptions for Android Apps

Author: Grobler Marthie
Nepal Surya
Paris Cecile
Tang Lihong
Wen Sheng
Wu Tingmin
Xiang Yang
Zhang Rongjunchen
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 26/08/2020

Android users are increasingly concerned with the privacy of their data and the security of their devices. To improve the security awareness of users, recent automatic techniques produce security-centric descriptions by performing program analysis. However, the generated text does not always address users' concerns because it is generally too technical for ordinary users to understand. Moreover, different users have varied linguistic preferences, which the generated text does not match. Motivated by this challenge, we develop an innovative scheme to help users avoid malware and privacy-breaching apps by generating security descriptions that explain the privacy and security related aspects of an Android app in clear and understandable terms. We implement a prototype system, PERSCRIPTION, to generate personalised security-centric descriptions that automatically learn users' security concerns and linguistic preferences to produce user-oriented descriptions. We evaluate our scheme through experiments and user studies. The results clearly demonstrate the improvement in readability and users' security awareness of PERSCRIPTION's descriptions compared to existing description generators.

arXiv.org e-Print Archive

Emotion Recognition in Conversation: Research Challenges, Datasets, and Recent Advances

Author: Hovy Eduard
Majumder Navonil
Mihalcea Rada
Poria Soujanya
Publication date: 08/05/2019

Emotion is intrinsic to humans and consequently emotion understanding is a key part of human-like artificial intelligence (AI). Emotion recognition in conversation (ERC) is becoming increasingly popular as a new research frontier in natural language processing (NLP) due to its ability to mine opinions from the plethora of publicly available conversational data on platforms such as Facebook, YouTube, Reddit, Twitter, and others. Moreover, it has potential applications in health-care systems (as a tool for psychological analysis), education (understanding student frustration) and more. Additionally, ERC is extremely important for generating emotion-aware dialogues that require an understanding of the user's emotions. Catering to these needs calls for effective and scalable conversational emotion-recognition algorithms. However, it is a strenuous problem to solve because of several research challenges. In this paper, we discuss these challenges and shed light on the recent research in this field. We also describe the drawbacks of these approaches and discuss the reasons why they fail to successfully overcome the research challenges in ERC.

arXiv.org e-Print Archive

A Convolutional Neural Network for Search Term Detection

Author: Aarabi Parham
Barfett Joseph
Colak Errol
Dowdell Tim
Gray Bruce
Salehinejad Hojjat
Valaee Shahrokh
Publication date: 07/11/2017

Pathfinding in hospitals is challenging for patients, visitors, and even employees. Many people have experienced getting lost due to a lack of clear guidance, the large footprint of hospitals, and a confusing array of hospital wings. In this paper, we propose Halo, an indoor navigation application based on voice-user interaction that provides directions for users without the assistance of a localization system. The main challenge is accurate detection of origin and destination search terms. A custom convolutional neural network (CNN) is proposed to detect origin and destination search terms from the transcription of a submitted speech query. The CNN is trained on a set of queries tailored specifically for hospital and clinic environments. Performance of the proposed model is studied and compared with Levenshtein distance-based word matching. Comment: This paper is accepted for presentation at 2017 IEEE 28th Annual International Symposium on Personal, Indoor, and Mobile Radio Communication

arXiv.org e-Print Archive
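
The abstract above names Levenshtein distance-based word matching as the baseline the CNN is compared against. A minimal sketch of that kind of baseline follows, assuming a plain dynamic-programming edit distance and a small hypothetical vocabulary of hospital locations (`LOCATIONS` and `closest_location` are illustrative names, not taken from the paper, and the paper's actual baseline may differ).

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # cost of deleting the first i characters of a
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


# Hypothetical vocabulary of destination terms (an assumption for illustration).
LOCATIONS = ["cardiology", "radiology", "emergency", "pharmacy"]


def closest_location(word: str, vocabulary=LOCATIONS) -> str:
    """Return the vocabulary term with the smallest edit distance to word."""
    word = word.lower()
    return min(vocabulary, key=lambda term: levenshtein(word, term))
```

Calling `closest_location` on each word of a transcribed query would map misspelled or misrecognized words to the nearest known term; unlike a learned model, this matching does not use the surrounding words of the query.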

Emo2Vec: Learning Generalized Emotion Representation by Multi-task Training

Author: Fung Pascale
Madotto Andrea
Park Ji Ho
Wu Chien-Sheng
Xu Peng
Publication date: 12/09/2018

In this paper, we propose Emo2Vec which encodes emotional semantics into vectors. We train Emo2Vec by multi-task learning six different emotion-related tasks, including emotion/sentiment analysis, sarcasm classification, stress detection, abusive language classification, insult detection, and personality recognition. Our evaluation of Emo2Vec shows that it outperforms existing affect-related representations, such as Sentiment-Specific Word Embedding and DeepMoji embeddings with much smaller training corpora. When concatenated with GloVe, Emo2Vec achieves competitive performances to state-of-the-art results on several tasks using a simple logistic regression classifier. Comment: Accepted by 9th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis (WASSA) in EMNLP 201

arXiv.org e-Print Archive

Social Media-based Substance Use Prediction

Author: Bickel Warren K.
Ding Tao
Pan Shimei
Publication date: 31/05/2017

In this paper, we demonstrate how state-of-the-art machine learning and text mining techniques can be used to build effective social media-based substance use detection systems. Since a substance use ground truth is difficult to obtain on a large scale, to maximize system performance, we explore different feature learning methods to take advantage of a large amount of unsupervised social media data. We also demonstrate the benefit of using multi-view unsupervised feature learning to combine heterogeneous user information such as Facebook "likes" and "status updates" to enhance system performance. Based on our evaluation, our best models achieved 86% AUC for predicting tobacco use, 81% for alcohol use and 84% for drug use, all of which significantly outperformed existing methods. Our investigation has also uncovered interesting relations between a user's social media behavior (e.g., word usage) and substance use.

arXiv.org e-Print Archive

Deep Inference of Personality Traits by Integrating Image and Word Use in Social Networks

Author: Cucurull Guillem
Gonfaus Josep M.
Gonzàlez Jordi
Roca F. Xavier
Rodríguez Pau
Yazici V. Oguz
Publication date: 06/02/2018

Social media, as a major platform for communication and information exchange, is a rich repository of the opinions and sentiments of 2.3 billion users about a vast spectrum of topics. To sense the whys of certain social users' demands and culturally driven interests, however, the knowledge embedded in the 1.8 billion pictures which are uploaded daily in public profiles has only just started to be exploited, since this process has typically been text-based. Following this trend on visual-based social analysis, we present a novel methodology based on Deep Learning to build a combined image-and-text based personality trait model, trained with images posted together with words found highly correlated to specific personality traits. So the key contribution here is to explore whether OCEAN personality trait modeling can be addressed based on images, here called MindPics, appearing with certain tags with psychological insights. We found that there is a correlation between those posted images and their accompanying texts, which can be successfully modeled using deep neural networks for personality estimation. The experimental results are consistent with previous cyber-psychology results based on texts or images. In addition, classification results on some traits show that some patterns emerge in the set of images corresponding to a specific text, in essence to those representing an abstract concept. These results open new avenues of research for further refining the proposed personality model under the supervision of psychology experts.

arXiv.org e-Print Archive

Self-adaptive Privacy Concern Detection for User-generated Content

Author: Jiang Lili
Vu Xuan-Son
Publication date: 19/06/2018

To protect user privacy in data analysis, a state-of-the-art strategy is differential privacy, in which scientific noise is injected into the real analysis output. The noise masks individuals' sensitive information contained in the dataset. However, determining the amount of noise is a key challenge, since too much noise will destroy data utility while too little noise will increase privacy risk. Though previous research works have designed some mechanisms to protect data privacy in different scenarios, most of the existing studies assume uniform privacy concerns for all individuals. Consequently, applying an equal amount of noise to all individuals leads to insufficient privacy protection for some users, while over-protecting others. To address this issue, we propose a self-adaptive approach for privacy concern detection based on user personality. Our experimental studies demonstrate the effectiveness of the approach in providing suitable personalized privacy protection for cold-start users (i.e., without their privacy-concern information in training data).

arXiv.org e-Print Archive
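
The trade-off described in the abstract above, where more noise means more privacy but less utility, can be made concrete with the Laplace mechanism, a standard differential-privacy technique. The abstract does not say which mechanism the authors use, so the sketch below is an assumption for illustration only; `laplace_noise` and `private_release` are hypothetical helper names.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from a zero-mean Laplace distribution with the given scale."""
    # The difference of two i.i.d. exponential variables with rate 1/scale
    # follows a Laplace distribution with that scale.
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_release(value: float, sensitivity: float, epsilon: float,
                    rng: random.Random) -> float:
    """Release value with Laplace noise of scale sensitivity / epsilon.

    A smaller epsilon (stronger privacy) gives a larger noise scale and
    therefore a less accurate released value.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return value + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    true_count = 120.0      # hypothetical query result (assumption)
    for eps in (0.1, 1.0, 10.0):
        print(eps, private_release(true_count, sensitivity=1.0, epsilon=eps, rng=rng))
```

Under a uniform privacy assumption, every individual shares one epsilon. A personalized scheme of the kind the abstract motivates would instead choose the privacy level per user, which this sketch does not attempt.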