Search CORE

11 research outputs found

A Benchmark Study on Sentiment Analysis for Software Engineering Research

Author: Ahmed T.
Barbieri F.
Danescu C.
Serebrenik A.
Snow R.
Viera A.J.
Wilson T.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2018
Field of study

A recent research trend has emerged to identify developers' emotions, by applying sentiment analysis to the content of communication traces left in collaborative development environments. Trying to overcome the limitations posed by using off-the-shelf sentiment analysis tools, researchers recently started to develop their own tools for the software engineering domain. In this paper, we report a benchmark study to assess the performance and reliability of three sentiment analysis tools specifically customized for software engineering. Furthermore, we offer a reflection on the open challenges, as they emerge from a qualitative analysis of misclassified texts.Comment: Proceedings of 15th International Conference on Mining Software Repositories (MSR 2018

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Università di Bari

Can We Use SE-specific Sentiment Analysis Tools in a Cross-Platform Setting?

Author: Araujo Blaz Cássio Castaldi
Calefato Fabio
Gachechiladze Daviti
Islam Md Rakibul
Ortu Marco
Ortu Marco
Socher Richard
Taboada Maite
Tantithamthavorn Chakkrit
Uddin Gias
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/04/2020
Field of study

In this paper, we address the problem of using sentiment analysis tools 'off-the-shelf,' that is when a gold standard is not available for retraining. We evaluate the performance of four SE-specific tools in a cross-platform setting, i.e., on a test set collected from data sources different from the one used for training. We find that (i) the lexicon-based tools outperform the supervised approaches retrained in a cross-platform setting and (ii) retraining can be beneficial in within-platform settings in the presence of robust gold standard datasets, even using a minimal training set. Based on our empirical findings, we derive guidelines for reliable use of sentiment analysis tools in software engineering.Comment: 12 page

arXiv.org e-Print Archive

Crossref

SEntiMoji: An Emoji-Powered Learning Approach for Sentiment Analysis in Software Engineering

Author: Ai Wei
Davidov Dmitry
Hermans M.
Hu Tianran
Liu Kun-Lin
Pennington Jeffrey
Vaswani Ashish
Yosinski Jason
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 03/07/2019
Field of study

Sentiment analysis has various application scenarios in software engineering (SE), such as detecting developers' emotions in commit messages and identifying their opinions on Q&A forums. However, commonly used out-of-the-box sentiment analysis tools cannot obtain reliable results on SE tasks and the misunderstanding of technical jargon is demonstrated to be the main reason. Then, researchers have to utilize labeled SE-related texts to customize sentiment analysis for SE tasks via a variety of algorithms. However, the scarce labeled data can cover only very limited expressions and thus cannot guarantee the analysis quality. To address such a problem, we turn to the easily available emoji usage data for help. More specifically, we employ emotional emojis as noisy labels of sentiments and propose a representation learning approach that uses both Tweets and GitHub posts containing emojis to learn sentiment-aware representations for SE-related texts. These emoji-labeled posts can not only supply the technical jargon, but also incorporate more general sentiment patterns shared across domains. They as well as labeled data are used to learn the final sentiment classifier. Compared to the existing sentiment analysis methods used in SE, the proposed approach can achieve significant improvement on representative benchmark datasets. By further contrast experiments, we find that the Tweets make a key contribution to the power of our approach. This finding informs future research not to unilaterally pursue the domain-specific resource, but try to transform knowledge from the open domain through ubiquitous signals such as emojis.Comment: Accepted by the 2019 ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2019). Please include ESEC/FSE in any citation

arXiv.org e-Print Archive

Crossref

Use and misuse of the term "Experiment" in mining software repositories research

Author: Ayala Martínez Claudia Patricia
Franch Gutiérrez Javier
Juristo Juzgado Natalia
Turhan Burak
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/11/2022
Field of study

The significant momentum and importance of Mining Software Repositories (MSR) in Software Engineering (SE) has fostered new opportunities and challenges for extensive empirical research. However, MSR researchers seem to struggle to characterize the empirical methods they use into the existing empirical SE body of knowledge. This is especially the case of MSR experiments. To provide evidence on the special characteristics of MSR experiments and their differences with experiments traditionally acknowledged in SE so far, we elicited the hallmarks that differentiate an experiment from other types of empirical studies and characterized the hallmarks and types of experiments in MSR. We analyzed MSR literature obtained from a small-scale systematic mapping study to assess the use of the term experiment in MSR. We found that 19% of the papers claiming to be an experiment are indeed not an experiment at all but also observational studies, so they use the term in a misleading way. From the remaining 81% of the papers, only one of them refers to a genuine controlled experiment while the others stand for experiments with limited control. MSR researchers tend to overlook such limitations, compromising the interpretation of the results of their studies. We provide recommendations and insights to support the improvement of MSR experiments.This work has been partially supported by the Spanish project: MCI PID2020-117191RB-I00.Peer ReviewedPostprint (author's final draft

UPCommons. Portal del coneixement obert de la UPC

Opinion Mining for Software Development: A Systematic Literature Review

Author: Alexander Serebrenik
Bin Lin
Gabriele Bavota
Michele Lanza
Nathan Cassee
Nicole Novielli
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2022
Field of study

Opinion mining, sometimes referred to as sentiment analysis, has gained increasing attention in software engineering (SE) studies. SE researchers have applied opinion mining techniques in various contexts, such as identifying developers’ emotions expressed in code comments and extracting users’ critics toward mobile apps. Given the large amount of relevant studies available, it can take considerable time for researchers and developers to figure out which approaches they can adopt in their own studies and what perils these approaches entail. We conducted a systematic literature review involving 185 papers. More specifically, we present 1) well-defined categories of opinion mining-related software development activities, 2) available opinion mining approaches, whether they are evaluated when adopted in other studies, and how their performance is compared, 3) available datasets for performance evaluation and tool customization, and 4) concerns or limitations SE researchers might need to take into account when applying/customizing these opinion mining techniques. The results of our study serve as references to choose suitable opinion mining tools for software development activities, and provide critical insights for the further development of opinion mining techniques in the SE domain

Pure OAI Repository

Archivio istituzionale della ricerca - Università di Bari