
    On negative results when using sentiment analysis tools for software engineering research

    Recent years have seen increasing attention to social aspects of software engineering, including studies of the emotions and sentiments experienced and expressed by software developers. Most of these studies reuse existing sentiment analysis tools such as SentiStrength and NLTK. However, these tools have been trained on product and movie reviews, so their results might not carry over to the software engineering domain. In this paper, we study whether sentiment analysis tools agree with the sentiment recognized by human evaluators (as reported in an earlier study), as well as with each other. Furthermore, we evaluate the impact of the choice of sentiment analysis tool on software engineering studies by conducting a simple study of differences in issue resolution times for positive, negative, and neutral texts. We repeat the study for seven datasets (issue trackers and Stack Overflow questions) and different sentiment analysis tools, and observe that disagreement between the tools can lead to diverging conclusions. Finally, we perform two replications of previously published studies and observe that their results cannot be confirmed when a different sentiment analysis tool is used.
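
    As a rough illustration of the kind of tool-agreement check described above, the sketch below labels a few texts with two off-the-shelf Python tools (NLTK's VADER and TextBlob, standing in for the tools studied in the paper) and measures how often they agree. The example texts and the neutral-band threshold are illustrative assumptions, not the paper's setup.

```python
# Minimal sketch: compare sentiment labels from two off-the-shelf tools.
# Requires: pip install nltk textblob, plus nltk.download("vader_lexicon").
from nltk.sentiment.vader import SentimentIntensityAnalyzer
from textblob import TextBlob

def to_label(score, eps=0.05):
    # Map a polarity score in [-1, 1] to a ternary label; eps is an
    # assumed neutral band around zero.
    return "positive" if score > eps else "negative" if score < -eps else "neutral"

texts = [
    "This build is broken again, I hate flaky tests.",
    "Thanks, the patch works great!",
    "Please rebase onto master before merging.",
]
vader = SentimentIntensityAnalyzer()
pairs = [(to_label(vader.polarity_scores(t)["compound"]),
          to_label(TextBlob(t).sentiment.polarity)) for t in texts]
agreement = sum(a == b for a, b in pairs) / len(pairs)
print(f"Tools agree on {agreement:.0%} of the texts")
```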

    Merging datasets for emotion analysis

    Context. Applying sentiment analysis is, in general, a laborious task, and obtaining a good-quality dataset with a balanced distribution and enough samples makes the job even more complicated. Objective. We want to find out whether merging compatible datasets improves emotion analysis based on machine learning (ML) techniques, compared to the original, individual datasets. Method. We obtained two datasets of Covid-19-related tweets written in Spanish and built from them two new datasets that combine the originals using different balancing strategies. We analyzed the results in terms of precision, recall, F1-score, and accuracy. Results. Merging two datasets can improve the performance of ML models, particularly the F1-score, when the merging process follows a strategy that optimizes the balance of the resulting dataset. Conclusions. Merging two datasets can improve the performance of ML models for emotion analysis while saving resources for labeling training data. This might be especially useful for software engineering activities that leverage ML-based emotion analysis techniques. This paper has been funded by the Spanish Ministerio de Ciencia e Innovación under project/funding scheme PID2020-117191RB.
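
    A minimal sketch of the merge-then-evaluate setup described above, assuming two hypothetical CSV files with 'text' and 'emotion' columns and a plain scikit-learn classifier; the paper's actual datasets (Spanish Covid-19 tweets) and models may differ.

```python
# Sketch: merge two labeled datasets, rebalance, and report precision/recall/F1.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

d1 = pd.read_csv("tweets_dataset_1.csv")  # hypothetical file names
d2 = pd.read_csv("tweets_dataset_2.csv")
merged = pd.concat([d1, d2], ignore_index=True)

# One possible balancing strategy: downsample every emotion class to the
# size of the smallest class in the merged dataset.
n = merged["emotion"].value_counts().min()
balanced = (merged.groupby("emotion", group_keys=False)
                  .apply(lambda g: g.sample(n, random_state=42)))

X_train, X_test, y_train, y_test = train_test_split(
    balanced["text"], balanced["emotion"],
    test_size=0.2, stratify=balanced["emotion"], random_state=42)

vec = TfidfVectorizer()
clf = LogisticRegression(max_iter=1000)
clf.fit(vec.fit_transform(X_train), y_train)
# Per-class precision, recall, and F1-score, plus overall accuracy.
print(classification_report(y_test, clf.predict(vec.transform(X_test))))
```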

    Engineers, Aware! Commercial Tools Disagree on Social Media Sentiment: Analyzing the Sentiment Bias of Four Major Tools

    Large commercial sentiment analysis tools are often deployed in software engineering due to their ease of use. However, it is not known how accurate these tools are, nor whether the sentiment ratings given by one tool agree with those given by another. We use two datasets: (1) NEWS, consisting of 5,880 news stories and 60K comments from four social media platforms (Twitter, Instagram, YouTube, and Facebook); and (2) IMDB, consisting of 7,500 positive and 7,500 negative movie reviews. With these, we investigate the agreement and bias of four widely used sentiment analysis (SA) tools: Microsoft Azure (MS), IBM Watson, Google Cloud, and Amazon Web Services (AWS). We find that the four tools assign the same sentiment to less than half (48.1%) of the analyzed content. We also find that AWS exhibits neutrality bias in both datasets; Google exhibits bi-polarity bias in the NEWS dataset but neutrality bias in the IMDB dataset; and IBM and MS exhibit no clear bias in the NEWS dataset but bi-polarity bias in the IMDB dataset. Overall, IBM has the highest accuracy relative to the known ground truth in the IMDB dataset. Findings indicate that psycholinguistic features, especially affect, tone, and use of adjectives, explain why the tools disagree. Engineers are urged to exercise caution when deploying SA tools in their applications, as the choice of tool affects the obtained sentiment labels. Published in Proceedings of the ACM on Human-Computer Interaction, https://doi.org/10.1145/3532203.
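
    The bookkeeping behind such agreement and bias figures is straightforward; the sketch below computes all-tool agreement, pairwise agreement, and a simple neutrality-bias measure from a table of per-tool labels. The DataFrame contents here are invented placeholders, not the paper's data or pipeline.

```python
# Sketch: agreement and neutrality bias across four tools' sentiment labels.
from itertools import combinations
import pandas as pd

df = pd.DataFrame({  # one row per analyzed text; labels are placeholders
    "MS":     ["positive", "neutral",  "negative", "neutral"],
    "IBM":    ["positive", "negative", "negative", "neutral"],
    "Google": ["negative", "neutral",  "negative", "positive"],
    "AWS":    ["neutral",  "neutral",  "neutral",  "neutral"],
})

# Fraction of items on which all four tools assign the same label.
print(f"All tools agree: {(df.nunique(axis=1) == 1).mean():.1%}")

# Pairwise agreement between tools.
for a, b in combinations(df.columns, 2):
    print(f"{a} vs {b}: {(df[a] == df[b]).mean():.1%}")

# Neutrality bias: how often each tool outputs 'neutral'.
print((df == "neutral").mean().sort_values(ascending=False))
```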

    EmoTxt: A Toolkit for Emotion Recognition from Text

    We present EmoTxt, a toolkit for emotion recognition from text, trained and tested on a gold standard of about 9K questions, answers, and comments from online interactions. We provide empirical evidence of the performance of EmoTxt. To the best of our knowledge, EmoTxt is the first open-source toolkit supporting both emotion recognition from text and the training of custom emotion classification models. In Proc. 7th Affective Computing and Intelligent Interaction (ACII'17), San Antonio, TX, USA, Oct. 23-26, 2017, pp. 79-80, ISBN: 978-1-5386-0563-
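
    EmoTxt itself is distributed as a Java toolkit; purely to illustrate the two capabilities the abstract names (classifying emotions in text and training a custom model), here is a hypothetical scikit-learn equivalent. The file name, columns, and labels are invented, and this is not EmoTxt's actual API.

```python
# Hypothetical sketch of a train-then-predict emotion classification workflow;
# NOT EmoTxt's actual interface.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

gold = pd.read_csv("gold_standard.csv")  # assumed columns: 'text', 'emotion'
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(gold["text"], gold["emotion"])   # train a custom emotion model
print(clf.predict(["This workaround finally fixed my build!"]))
```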

    SemEval-2016 Task 5: Aspect Based Sentiment Analysis

    This paper describes the SemEval-2016 shared task on Aspect Based Sentiment Analysis (ABSA), a continuation of the respective tasks of 2014 and 2015. In its third year, the task provided 19 training and 20 testing datasets for 8 languages and 7 domains, as well as a common evaluation procedure. Of these datasets, 25 were for sentence-level and 14 for text-level ABSA; the latter was introduced as a SemEval subtask for the first time. The task attracted 245 submissions from 29 teams.
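
    To make the task concrete: in sentence-level ABSA, each opinion ties a target expression and an entity#attribute category to a polarity. The official SemEval-2016 data is distributed as XML; the dict below is only an illustrative rendering of one annotated sentence, not the official format.

```python
# Illustrative shape of one sentence-level ABSA annotation.
sentence = "The fish was fresh but the service was painfully slow."
opinions = [
    {"target": "fish",    "category": "FOOD#QUALITY",    "polarity": "positive"},
    {"target": "service", "category": "SERVICE#GENERAL", "polarity": "negative"},
]
```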