20 research outputs found

    SEntiMoji: An Emoji-Powered Learning Approach for Sentiment Analysis in Software Engineering

    Sentiment analysis has various application scenarios in software engineering (SE), such as detecting developers' emotions in commit messages and identifying their opinions on Q&A forums. However, commonly used out-of-the-box sentiment analysis tools cannot obtain reliable results on SE tasks and the misunderstanding of technical jargon is demonstrated to be the main reason. Then, researchers have to utilize labeled SE-related texts to customize sentiment analysis for SE tasks via a variety of algorithms. However, the scarce labeled data can cover only very limited expressions and thus cannot guarantee the analysis quality. To address such a problem, we turn to the easily available emoji usage data for help. More specifically, we employ emotional emojis as noisy labels of sentiments and propose a representation learning approach that uses both Tweets and GitHub posts containing emojis to learn sentiment-aware representations for SE-related texts. These emoji-labeled posts can not only supply the technical jargon, but also incorporate more general sentiment patterns shared across domains. They as well as labeled data are used to learn the final sentiment classifier. Compared to the existing sentiment analysis methods used in SE, the proposed approach can achieve significant improvement on representative benchmark datasets. By further contrast experiments, we find that the Tweets make a key contribution to the power of our approach. This finding informs future research not to unilaterally pursue the domain-specific resource, but try to transform knowledge from the open domain through ubiquitous signals such as emojis.
    Comment: Accepted by the 2019 ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2019). Please include ESEC/FSE in any citation

    Emotion-based analysis of programming languages on Stack Overflow

    When developing a software engineering project, selecting the most appropriate programming language is a crucial step. Most often, feeling at ease with the possible options becomes almost as relevant as the technical features of the language. Therefore, it appears to be worth analyzing the role that the emotional component plays in this process. In this article, we analyze the trend of the emotions expressed by developers in 2018 on the Stack Overflow platform in posts concerning 26 programming languages. To do so, we propose a learning model trained by distant supervision and the comparison of two different classifier architectures

    Analysis of Human Affect and Bug Patterns to Improve Software Quality and Security

    The impact of software is ever increasing as more and more systems are being software operated. Despite the usefulness of software, many software failures have caused tremendous losses in lives and dollars. Software failures take place because of bugs (i.e., faults) in the software systems. These bugs cause the program to malfunction or crash and expose security vulnerabilities exploitable by malicious hackers. Studies confirm that software defects and vulnerabilities appear in source code largely due to the human mistakes and errors of the developers. Human performance is impacted by the underlying development process and human affects, such as sentiment and emotion. This thesis examines these human affects of software developers, which have drawn recent interest in the community. For capturing developers’ sentimental and emotional states, we have developed several software tools (i.e., SentiStrength-SE, DEVA, and MarValous). These are novel tools facilitating automatic detection of sentiments and emotions from the software engineering textual artifacts. Using such an automated tool, the developers’ sentimental variations are studied with respect to the underlying development tasks (e.g., bug-fixing, bug-introducing), development periods (i.e., days and times), team sizes and project sizes. We expose opportunities for exploiting developers’ sentiments for higher productivity and improved software quality. While developers’ sentiments and emotions can be leveraged for proactive and active safeguard in identifying and minimizing software bugs, this dissertation also includes in-depth studies of the relationship among various bug patterns, such as software defects, security vulnerabilities, and code smells to find actionable insights in minimizing software bugs and improving software quality and security. Bug patterns are exposed through mining software repositories and bug databases.
These bug patterns are crucial in localizing bugs and security vulnerabilities in software codebase for fixing them, predicting portions of software susceptible to failure or exploitation by hackers, devising techniques for automated program repair, and avoiding code constructs and coding idioms that are bug-prone. The software tools produced from this thesis are empirically evaluated using standard measurement metrics (e.g., precision, recall). The findings of all the studies are validated with appropriate tests for statistical significance. Finally, based on our experience and in-depth analysis of the present state of the art, we expose avenues for further research and development towards a holistic approach for developing improved and secure software systems

    Mining software repositories: measuring effectiveness and affectiveness in software systems.

    The field of Software Engineering has many goals, among them monitoring and controlling the development process in order to meet the business requirements of the released software artifact. Software engineers need empirical evidence that the development process and the overall quality of software artifacts are converging to the required features. Improving the development process's Effectiveness leads to higher productivity, meaning shorter time to market, but understanding or even measuring the software development process is a hard challenge. Modern software is the result of a complex process involving many stakeholders such as product owners, quality assurance teams, project managers and, above all, developers. All these stakeholders use complex software systems for managing the development process, issue tracking, code versioning, release scheduling and many other aspects of software development. Tools for project management and issue/bug tracking are becoming useful for governing the development process of Open Source software. Such tools simplify the communication process among developers and ensure the scalability of a project. The more information developers are able to exchange, the clearer the goals are, and the higher the number of developers keen on joining and actively collaborating on a project. By analyzing data stored in such systems, researchers are able to study and address questions such as: Which factors impact software productivity? Is it possible to improve software productivity by shortening the time to market? The present work addresses two major aspects of the software development process: Effectiveness and Affectiveness. By analyzing data stored in the project management and issue tracking systems of Open Source communities, we measured Effectiveness as the time required to resolve an issue and analyzed the factors able to impact it

    Learning to Detect Human Emotions in Digital World by Integrating Ensemble Voting Classifiers

    Due to the expansion of the internet and the rapid adoption of social media platforms, information can now be exchanged in ways never previously imagined in human history. A social networking site like Twitter offers a forum where people may interact, discuss, and respond to specific issues via short entries, such as tweets of 140 characters or fewer. Users may engage by using the comment, like, and share features on texts, videos, images, and other content. Because social media platforms are now so extensively used, individuals are creating and sharing more information than ever before, some of which can be incorrect or unconnected to reality. It is difficult to automatically identify erroneous or inaccurate statements in textual content and to detect people's emotions. In this paper, we propose an ensemble method for sentiment and emotion analysis. Different textual features of emotion and sentiment have been utilized. We used a publicly accessible Twitter sentiment analysis dataset containing a total of 48,247 authenticated tweets, of which 23,947 were authentic positive texts labeled as binary 0s and 24,300 were negative texts labeled as binary 1s. To assess our approach, we used well-known machine learning (ML) techniques: Logistic Regression (LR), AdaBoost, Decision Tree (DT), SGD, XG-Boost, and Naive Bayes. To obtain more accurate results, we built a multi-model sentiment and emotion analysis system using the ensemble approach and the classifiers listed above. According to an experimental study, our proposed ensemble learner outperforms the individual learners

    Effects of Personality Traits and Emotional Factors in Pull Request Acceptance.

    Social interactions in the form of discussion are an indispensable part of collaborative software development. The discussions are essential for developers to share their views and to form a strong relationship with other teammates. These discussions invoke both positive and negative emotions such as joy, love, aggression, and disgust. Additionally, developers also exhibit hidden behaviours that dictate their personality. Some developers can be supportive and open to new ideas, whereas others can be conservative. Past research has shown that the personality of the developers has a significant role in determining the success of the task they collaboratively perform. Additionally, previous research has also shown that in online collaborative environments, the developers use signals from comments such as rudeness to determine if they are compatible to work together. Most of these studies use traditional small-scale surveys for their experiments. The transparent nature of online collaborative environments makes it easier to conduct empirical experiments by mining pull request comments. In this thesis, first, we investigate the effect of different personality traits on pull request acceptance. The results of this experiment will provide us with a valuable understanding of the personality traits of developers and help us develop tools to assist developers. We follow it with a second experiment to understand the influence of different emotional factors on pull request decisions. The emotion expressed by a developer on their teammates can be influenced by social statuses, such as the number of followers. Moreover, the teammate's team status, such as team member or outside contributor too, can influence the emotional effect. To understand moderation, we investigate different interaction effects. 
We start the experiment by replicating Tsay et al.'s work that examined the influence of social factors (e.g., 'social distance') and technical factors (e.g., test file inclusion) for evaluating contributions. We extend their work by augmenting it with personality traits of developers and examining their influence on the pull request evaluation process in GitHub. In particular, we extract the 'Big Five' personality traits (Openness, Conscientiousness, Extraversion, Agreeableness, and Neuroticism) of developers from their online digital footprints, such as pull request comments. We analyze the personality traits of 16,935 active developers from 1,860 projects and compare their relative importance to other non-personality factors from past research, in the pull request evaluation process. We find that pull requests from authors (requesters) who are more open and conscientious, but less extroverted, have a higher chance of approval. Furthermore, pull requests that are closed by developers (closers) who are more conscientious, extroverted, and neurotic, have a higher likelihood of acceptance. The larger the difference in personality traits between the requester and the closer, the more positive effect it has on pull request acceptance. Although the effect of personality traits is significant and comparable to technical factors, we find that social factors are still more influential when it comes to the likelihood of pull request acceptance. We perform a second experiment to analyze the effect of emotions on pull request decisions. To predict emotions in the comments, we develop a generalised, software engineering specific language model that outperforms previous machine learning algorithms on four different standard datasets. We find that the percentage of positive comments from both requester and closer has a positive association with pull request acceptance, whereas the percentage of negative comments has a negative association.
Also, the polarity of the emotion associated with the first comment of both requester and closer had a positive association with pull request acceptance, i.e., the more positive the emotion, the higher the likelihood of acceptance. Finally, we find that social factors moderate the effects of emotions