7 research outputs found

    Arsonists or firefighters? Affectiveness in agile software development

    In this paper, we present an analysis of more than 500K comments from open-source repositories of software systems developed using agile methodologies. Our aim is to empirically determine how developers interact with each other under certain psychological conditions generated by politeness, sentiment and emotion expressed within developers' comments. Developers involved in open-source projects do not usually know each other; they mainly communicate through mailing lists, chat, and tools such as issue tracking systems. The way in which they communicate affects the development process and the productivity of the people involved in the project. We evaluated politeness, sentiment and emotions of comments posted by agile developers and studied the communication flow to understand how they interacted in the presence of impolite and negative comments (and vice versa). Our analysis shows that "firefighters" prevail. When in the presence of impolite or negative comments, the probability of the next comment being impolite or negative is 13% and 25%, respectively; ANGER, however, has a probability of 40% of being followed by a further ANGER comment. The result could help managers take control of the development phases of a system, since social aspects can seriously affect a developer's productivity. In a distributed agile environment this may have a particular resonance.

    How do you propose your code changes? Empirical analysis of affect metrics of pull requests on GitHub

    Software engineering methodologies rely on version control systems such as git to store source code artifacts and manage changes to the codebase. Pull requests include chunks of source code, the history of changes, log messages around a proposed change to the mainstream codebase, and much discussion on whether to integrate such changes or not. A better understanding of what contributes to a pull request's fate and latency will allow us to build predictive models of what is going to happen and when. Several factors can influence the acceptance of pull requests, many of which are related to the individual aspects of software developers. In this study, we aim to understand how the affect (e.g., sentiment, discrete emotions, and valence-arousal-dominance dimensions) expressed in the discussion of pull request issues influences the acceptance of pull requests. We conducted a mining study of large git software repositories and analyzed more than 150,000 issues with more than 1,000,000 comments in them. We built a model to understand whether affect and politeness have an impact on the chance of issues and pull requests being merged, i.e., the code which fixes the issue being integrated into the codebase. We built two logistic classifiers, one without affect metrics and one with them. By comparing the two classifiers, we show that the affect metrics improve the prediction performance. Our results show that valence (expressed in comments received and posted by a reporter) and joy expressed in the comments written by a reporter are linked to a higher likelihood of issues being merged. On the contrary, sadness, anger, and arousal expressed in the comments written by a reporter, and anger, arousal, and dominance expressed in the comments received by a reporter, are linked to a lower likelihood of a pull request being merged.

    SEntiMoji: An Emoji-Powered Learning Approach for Sentiment Analysis in Software Engineering

    Sentiment analysis has various application scenarios in software engineering (SE), such as detecting developers' emotions in commit messages and identifying their opinions on Q&A forums. However, commonly used out-of-the-box sentiment analysis tools cannot obtain reliable results on SE tasks, and the misunderstanding of technical jargon is demonstrated to be the main reason. Researchers therefore have to use labeled SE-related texts to customize sentiment analysis for SE tasks via a variety of algorithms. However, the scarce labeled data can cover only very limited expressions and thus cannot guarantee the analysis quality. To address this problem, we turn to the easily available emoji usage data for help. More specifically, we employ emotional emojis as noisy labels of sentiments and propose a representation learning approach that uses both Tweets and GitHub posts containing emojis to learn sentiment-aware representations for SE-related texts. These emoji-labeled posts not only supply the technical jargon, but also incorporate more general sentiment patterns shared across domains. Together with labeled data, they are used to learn the final sentiment classifier. Compared to the existing sentiment analysis methods used in SE, the proposed approach can achieve significant improvement on representative benchmark datasets. By further contrast experiments, we find that the Tweets make a key contribution to the power of our approach. This finding informs future research not to unilaterally pursue the domain-specific resource, but try to transform knowledge from the open domain through ubiquitous signals such as emojis.
    Comment: Accepted by the 2019 ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2019). Please include ESEC/FSE in any citation.
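
The idea of using emotional emojis as noisy sentiment labels, mentioned in the abstract above, can be illustrated with a minimal sketch. The emoji sets, the `noisy_label` function, and the rule of discarding posts with conflicting emojis are all assumptions made for illustration; the paper's representation learning step is not reproduced here.

```python
# Hypothetical emoji sets; the abstract does not list which emojis were used.
POSITIVE = {"😀", "😊", "👍", "🎉"}
NEGATIVE = {"😠", "😢", "👎", "😡"}


def noisy_label(text):
    """Return (text_without_emojis, label), or None if no usable emoji signal."""
    has_pos = any(e in text for e in POSITIVE)
    has_neg = any(e in text for e in NEGATIVE)
    if has_pos == has_neg:
        # Either no emoji at all or conflicting emojis: skip the post.
        return None
    label = "positive" if has_pos else "negative"
    cleaned = text
    for e in POSITIVE | NEGATIVE:
        cleaned = cleaned.replace(e, "")
    # Collapse whitespace left behind by the removed emojis.
    return " ".join(cleaned.split()), label


# Made-up example posts, not data from the paper.
posts = ["Fixed the build 🎉", "crashes again 😡", "merged into main"]
labeled = [pair for pair in map(noisy_label, posts) if pair is not None]
```

Posts labeled this way could then serve as weakly supervised training data, which is the role the abstract assigns to emoji-bearing Tweets and GitHub posts.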

    Describing software developers affectiveness through Markov chain models

    In this paper, we present an analysis of more than 500K comments from open-source repositories of software systems. Our aim is to empirically determine how developers interact with each other under certain psychological conditions generated by politeness, sentiment and emotion expressed within developers' comments. Developers involved in open-source projects do not usually know each other; they mainly communicate through mailing lists, chat rooms, and tools such as issue tracking systems. The way in which they communicate affects the development process and the productivity of the people involved in the project. We evaluated politeness, sentiment and emotions of comments posted by developers and studied the communication flow to understand how they interacted in the presence of impolite and negative comments (and vice versa). Our analysis shows that when in the presence of impolite or negative comments, the probability of the next comment being impolite or negative is 14% and 25%, respectively; anger, however, has a probability of 40% of being followed by a further anger comment. The result could help managers take control of the development phases of a system, since social aspects can seriously affect a developer's productivity. In a distributed environment this may have a particular resonance.
    Engineering and Physical Sciences Research Council (EPSRC)
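
The "probability of the next comment" figures in the abstract above are first-order transition probabilities. A minimal sketch of how such probabilities can be estimated from labeled comment threads follows; the function name, the labels, and the toy threads are hypothetical, and the paper's actual pipeline may differ.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequences):
    """Estimate first-order Markov transition probabilities from label sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Count each (current label -> next label) pair within a thread.
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    # Normalize each row of counts into probabilities.
    return {
        state: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for state, followers in counts.items()
    }


# Hypothetical toy threads; each list is the label sequence of one discussion.
threads = [
    ["polite", "polite", "impolite", "polite"],
    ["impolite", "impolite", "polite", "polite"],
]
probs = transition_probabilities(threads)
# probs["impolite"]["impolite"] is the estimated chance that an impolite
# comment is followed by another impolite one.
```

Each row of the resulting dictionary sums to 1, so it can be read as one row of a Markov chain transition matrix.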

    Project risk analysis using text mining of comments in the Jira project management system

    Master's thesis: 125 pp., 21 figures, 29 tables, 5 appendices, 24 sources. The object of the research is project risks. The subject of the research is methods for analyzing project risks and comments in the Jira project management system. The aims of the study are: 1) to investigate the risks of IT projects and the methods for detecting them; 2) to study existing text mining methods and algorithms with respect to risk triggers; 3) to develop a methodology for using text mining to identify and analyze project risks; 4) to develop software for running experiments with this methodology; 5) to analyze the results and give recommendations for further research. The theoretical and methodological basis of the research is the work of foreign scholars in project management, project risk management, text mining, and sentiment and emotion analysis of text, as well as models for topic construction and graphical presentation of results. In the course of the thesis, a methodology was developed and a software product was created for identifying project risks from developer communication, and the results of running the program on data from a real project, CASSANDRA of the Apache Software Foundation, are presented. The methodology builds on known VAD algorithms for determining the emotional components of text and on matrix methods of project risk analysis, with the author's own developments making it possible to combine these different approaches. The software is implemented in Python with the text-processing packages gensim and spacy (the thesis's English abstract instead names the Apache Spark big-data framework). Recommendations for further research are given at the end of the thesis.