Search CORE

4 research outputs found

SEntiMoji: An Emoji-Powered Learning Approach for Sentiment Analysis in Software Engineering

Author: Ai Wei
Davidov Dmitry
Hermans M.
Hu Tianran
Liu Kun-Lin
Pennington Jeffrey
Vaswani Ashish
Yosinski Jason
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 03/07/2019
Field of study

Sentiment analysis has various application scenarios in software engineering (SE), such as detecting developers' emotions in commit messages and identifying their opinions on Q&A forums. However, commonly used out-of-the-box sentiment analysis tools cannot obtain reliable results on SE tasks and the misunderstanding of technical jargon is demonstrated to be the main reason. Then, researchers have to utilize labeled SE-related texts to customize sentiment analysis for SE tasks via a variety of algorithms. However, the scarce labeled data can cover only very limited expressions and thus cannot guarantee the analysis quality. To address such a problem, we turn to the easily available emoji usage data for help. More specifically, we employ emotional emojis as noisy labels of sentiments and propose a representation learning approach that uses both Tweets and GitHub posts containing emojis to learn sentiment-aware representations for SE-related texts. These emoji-labeled posts can not only supply the technical jargon, but also incorporate more general sentiment patterns shared across domains. They as well as labeled data are used to learn the final sentiment classifier. Compared to the existing sentiment analysis methods used in SE, the proposed approach can achieve significant improvement on representative benchmark datasets. By further contrast experiments, we find that the Tweets make a key contribution to the power of our approach. This finding informs future research not to unilaterally pursue the domain-specific resource, but try to transform knowledge from the open domain through ubiquitous signals such as emojis.Comment: Accepted by the 2019 ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2019). Please include ESEC/FSE in any citation

arXiv.org e-Print Archive

Crossref

Individual differences limit predicting well-being and productivity using software repositories : a longitudinal industrial study

Author: Adams Bram
Claes Maelick
Elovainio Marko
Kuutila Miikka
Mantyla Mika
Publication venue
Publication date: 01/09/2021
Field of study

Reports of poor work well-being and fluctuating productivity in software engineering have been reported in both academic and popular sources. Understanding and predicting these issues through repository analysis might help manage software developers' well-being. Our objective is to link data from software repositories, that is commit activity, communication, expressed sentiments, and job events, with measures of well-being obtained with a daily experience sampling questionnaire. To achieve our objective, we studied a single software project team for eight months in the software industry. Additionally, we performed semi-structured interviews to explain our results. The acquired quantitative data are analyzed with generalized linear mixed-effects models with autocorrelation structure. We find that individual variance accounts for most of the R-2 values in models predicting developers' experienced well-being and productivity. In other words, using software repository variables to predict developers' well-being or productivity is challenging due to individual differences. Prediction models developed for each developer individually work better, with fixed effects R-2 value of up to 0.24. The semi-structured interviews give insights into the well-being of software developers and the benefits of chat interaction. Our study suggests that individualized prediction models are needed for well-being and productivity prediction in software development.Peer reviewe

Helsingin yliopiston digitaalinen arkisto

Opinion Mining for Software Development: A Systematic Literature Review

Author: Alexander Serebrenik
Bin Lin
Gabriele Bavota
Michele Lanza
Nathan Cassee
Nicole Novielli
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2022
Field of study

Opinion mining, sometimes referred to as sentiment analysis, has gained increasing attention in software engineering (SE) studies. SE researchers have applied opinion mining techniques in various contexts, such as identifying developers’ emotions expressed in code comments and extracting users’ critics toward mobile apps. Given the large amount of relevant studies available, it can take considerable time for researchers and developers to figure out which approaches they can adopt in their own studies and what perils these approaches entail. We conducted a systematic literature review involving 185 papers. More specifically, we present 1) well-defined categories of opinion mining-related software development activities, 2) available opinion mining approaches, whether they are evaluated when adopted in other studies, and how their performance is compared, 3) available datasets for performance evaluation and tool customization, and 4) concerns or limitations SE researchers might need to take into account when applying/customizing these opinion mining techniques. The results of our study serve as references to choose suitable opinion mining tools for software development activities, and provide critical insights for the further development of opinion mining techniques in the SE domain

Pure OAI Repository

Archivio istituzionale della ricerca - Università di Bari

On the use of emoticons in open source software development

Author: Davidov D.
Hogenboom A.
Marvin L.-E.
Novak P. K.
Read J.
Witmer D. F.
Publication venue
Publication date: 01/01/2018
Field of study

Abstract Background: Using sentiment analysis to study software developers’ behavior comes with challenges such as the presence of a large amount of technical discussion unlikely to express any positive or negative sentiment. However, emoticons provide information about developer sentiments that can easily be extracted from software repositories. Aim: We investigate how software developers use emoticons differently in issue trackers in order to better understand the differences between developers and determine to which extent emoticons can be used as in place of sentiment analysis. Method: We extract emoticons from 1.3M comments from Apache’s issue tracker and 4.5M from Mozilla’s issue tracker using regular expressions built from a list of emoticons used by SentiStrength and Wikipedia. We check for statistical differences using Mann-Whitney U tests and determine the effect size with Cliff’s δ. Results: Overall Mozilla developers rely more on emoticons than Apache developers. While the overall rate of comments with emoticons is of 1% and 3% for Apache and Mozilla, some individual developers can have a rate up to 21%. Looking specifically at Mozilla developers, we find that western developers use significantly more emoticons (with medium size effect) than eastern developers. While the majority of emoticons are used to express joy, we find that Mozilla developers use emoticons more frequently to express sadness and surprise than Apache developers. Finally, we find that Apache developers use overall more emoticons during weekends than during weekdays, with the share of sad and surprised emoticons increasing during weekends. Conclusions: While emoticons are primarily used to express joy, the more occasional use of sad and surprised emoticons can potentially be utilized to detect frustration in place of sentiment analysis among developers using emoticons frequently enough

arXiv.org e-Print Archive

Crossref

University of Oulu Repository - Jultika