315 research outputs found

    How do you propose your code changes? Empirical analysis of affect metrics of pull requests on GitHub

    Get PDF
    Software engineering methodologies rely on version control systems such as git to store source code artifacts and manage changes to the codebase. Pull requests include chunks of source code, history of changes, log messages around a proposed change of the mainstream codebase, and much discussion on whether to integrate such changes or not. A better understanding of what contributes to a pull request fate and latency will allow us to build predictive models of what is going to happen and when. Several factors can influence the acceptance of pull requests, many of which are related to the individual aspects of software developers. In this study, we aim to understand how the affect (e.g., sentiment, discrete emotions, and valence-arousal-dominance dimensions) expressed in the discussion of pull request issues influence the acceptance of pull requests. We conducted a mining study of large git software repositories and analyzed more than 150,000 issues with more than 1,000,000 comments in them. We built a model to understand whether the affect and the politeness have an impact on the chance of issues and pull requests to be merged - i.e., the code which fixes the issue is integrated in the codebase. We built two logistic classifiers, one without affect metrics and one with them. By comparing the two classifiers, we show that the affect metrics improve the prediction performance. Our results show that valence (expressed in comments received and posted by a reporter) and joy expressed in the comments written by a reporter are linked to a higher likelihood of issues to be merged. On the contrary, sadness, anger, and arousal expressed in the comments written by a reporter, and anger, arousal, and dominance expressed in the comments received by a reporter, are linked to a lower likelihood of a pull request to be merged

    Mining software repositories: measuring effectiveness and affectiveness in software systems.

    Get PDF
    Software Engineering field has many goals, among them we can certainly deal with monitoring and controlling the development process in order to meet the business requirements of the released software artifact. Software engineers need to have empirical evidence that the development process and the overall quality of software artifacts is converging to the required features. Improving the development process's Effectiveness leads to higher productivity, meaning shorter time to market, but understanding or even measuring the software de- velopment process is an hard challenge. Modern software is the result of a complex process involving many stakeholders such as product owners, quality assurance teams, project manager and, above all, developers. All these stake- holders use complex software systems for managing development process, issue tracking, code versioning, release scheduling and many other aspect concerning software development. Tools for project management and issues/bugs tracking are becoming useful for governing the development process of Open Source soft- ware. Such tools simplify the communications process among developers and ensure the scalability of a project. The more information developers are able to exchange, the clearer are the goals, and the higher is the number of developers keen on joining and actively collaborating on a project. By analyzing data stored in such systems, researchers are able to study and address questions such as: Which are the factors able to impact the software productivity? Is it possible to improve software productivity shortening the time to market?. The present work addresses two major aspect of software development pro- cess: Effectiveness and Affectiveness. By analyzing data stored in project man- agement and in issue tracking system of Open Source Communities, we mea- sured the Effectiveness as the time required to resolve an issue and analyzed factors able to impact it

    Mining software repositories: measuring effectiveness and affectiveness in software systems.

    Get PDF
    Software Engineering field has many goals, among them we can certainly deal with monitoring and controlling the development process in order to meet the business requirements of the released software artifact. Software engineers need to have empirical evidence that the development process and the overall quality of software artifacts is converging to the required features. Improving the development process's Effectiveness leads to higher productivity, meaning shorter time to market, but understanding or even measuring the software de- velopment process is an hard challenge. Modern software is the result of a complex process involving many stakeholders such as product owners, quality assurance teams, project manager and, above all, developers. All these stake- holders use complex software systems for managing development process, issue tracking, code versioning, release scheduling and many other aspect concerning software development. Tools for project management and issues/bugs tracking are becoming useful for governing the development process of Open Source soft- ware. Such tools simplify the communications process among developers and ensure the scalability of a project. The more information developers are able to exchange, the clearer are the goals, and the higher is the number of developers keen on joining and actively collaborating on a project. By analyzing data stored in such systems, researchers are able to study and address questions such as: Which are the factors able to impact the software productivity? Is it possible to improve software productivity shortening the time to market?. The present work addresses two major aspect of software development pro- cess: Effectiveness and Affectiveness. By analyzing data stored in project man- agement and in issue tracking system of Open Source Communities, we mea- sured the Effectiveness as the time required to resolve an issue and analyzed factors able to impact it

    How diverse is your team? Investigating gender and nationality diversity in GitHub teams

    Get PDF
    Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.Background Building an effective team of developers is a complex task faced by both software companies and open source communities. The problem of forming a “dream” team involves many variables, including consideration of human factors and it is not a dilemma solvable in a mathematical way. Empirical studies might provide interesting insights to explain which factors need to be taken into account in building a team of developers and which levers act to optimise productivity among developers. Aim In this paper, we present the results of an empirical study aimed at investigating the link between team diversity (i.e., gender, nationality) and productivity (issue fixing time). Method We consider issues solved from the GHTorrent dataset inferring gender and nationality of each team’s members. We also evaluate the politeness of all comments involved in issue resolution. Results Results show that higher gender diversity is linked with a lower team average issue fixing time (higher productivity), that nationality diversity is linked with lower team politeness and that gender diversity is linked with higher sentiment.Peer reviewedFinal Published versio

    Software Process Evaluation from User Perceptions and Log Data

    Get PDF
    Companies often claim to follow specific software development methodologies (SDM) when performing their software development process. These methodologies are often supported by dedicated tools that keep track of work activities carried out by developers. The purpose of this paper is to provide a novel approach that integrates analytical insights from both the perceptions of SDM stakeholders and software development tools logs to provide SDM improvement recommendations. This paper develops a new process improvement approach that combines two significantly different sources of data on the same phenomenon. First, it uses a questionnaire to gather software development stakeholder SDM perceptions (managers and developers). Second, it leverages process mining to analyze software development tools logs to obtain additional information on software development activities. Finally, it develops recommendations based on concurrent analysis of both sources. Our novel process improvement approach is evaluated in three directions: Does the presented approach (RQ1) enable managers to gain additional insights into employees' performance, (RQ2) deliver additional insights into project performance, and (RQ3) enable development of additional SDM improvement recommendations? We find that integrated analysis of software development perception data and software development tools logs opens new possibilities to more precisely identify and improve specific SDM elements. The evaluation of our novel process improvement approach follows a single case study design. Our approach can only be used in enterprises in which software development tools logs are available. The study should be repeated in different cultural settings. We practically show how concurrently analyzing data about developer SDM perceptions and event log data from software development tools enables management to gain additional insights in the software development process regarding the performance of individual developers. The main theoretical contribution of our paper is a novel process improvement approach that effectively integrates data from management and developer perspectives and software development tools logs.Einstein Foundation Berlin http://dx.doi.org/10.13039/501100006188Peer Reviewe

    In Pursuit of Optimal Workflow Within The Apache Software Foundation

    Get PDF
    abstract: The following is a case study composed of three workflow investigations at the open source software development (OSSD) based Apache Software Foundation (Apache). I start with an examination of the workload inequality within the Apache, particularly with regard to requirements writing. I established that the stronger a participant's experience indicators are, the more likely they are to propose a requirement that is not a defect and the more likely the requirement is eventually implemented. Requirements at Apache are divided into work tickets (tickets). In our second investigation, I reported many insights into the distribution patterns of these tickets. The participants that create the tickets often had the best track records for determining who should participate in that ticket. Tickets that were at one point volunteered for (self-assigned) had a lower incident of neglect but in some cases were also associated with severe delay. When a participant claims a ticket but postpones the work involved, these tickets exist without a solution for five to ten times as long, depending on the circumstances. I make recommendations that may reduce the incidence of tickets that are claimed but not implemented in a timely manner. After giving an in-depth explanation of how I obtained this data set through web crawlers, I describe the pattern mining platform I developed to make my data mining efforts highly scalable and repeatable. Lastly, I used process mining techniques to show that workflow patterns vary greatly within teams at Apache. I investigated a variety of process choices and how they might be influencing the outcomes of OSSD projects. I report a moderately negative association between how often a team updates the specifics of a requirement and how often requirements are completed. I also verified that the prevalence of volunteerism indicators is positively associated with work completion but what was surprising is that this correlation is stronger if I exclude the very large projects. I suggest the largest projects at Apache may benefit from some level of traditional delegation in addition to the phenomenon of volunteerism that OSSD is normally associated with.Dissertation/ThesisDoctoral Dissertation Industrial Engineering 201

    Impact of stakeholder type and collaboration on issue resolution time in OSS Projects

    Get PDF
    Born as a movement of collective contribution of volunteer developers, Open source software (OSS) attracts an increasing involvement of commercial firms. Many OSS projects are composed of a mix group of firm- paid and volunteer developers, with different motivation, collaboration practices and working styles. As OSS is collaborative work in nature, it is important to know whether these differences have an impact on project outcomes. In this paper, we empirically investigate the firm-paid participation in resolving OSS evolution issues, the stakeholder collaboration and its impact on OSS issue resolution time. The results suggest that though a firm-paid developer resolves much more issues than a volunteer developer does, there is no difference in issue resolution time between firm-paid and volunteer developers. Besides, the more important factor that influences the issue resolution time comes from the collaboration among stakeholders rather than from measures of individual characteristics.Peer ReviewedPostprint (author’s final draft

    Continuous Rationale Management

    Get PDF
    Continuous Software Engineering (CSE) is a software life cycle model open to frequent changes in requirements or technology. During CSE, software developers continuously make decisions on the requirements and design of the software or the development process. They establish essential decision knowledge, which they need to document and share so that it supports the evolution and changes of the software. The management of decision knowledge is called rationale management. Rationale management provides an opportunity to support the change process during CSE. However, rationale management is not well integrated into CSE. The overall goal of this dissertation is to provide workflows and tool support for continuous rationale management. The dissertation contributes an interview study with practitioners from the industry, which investigates rationale management problems, current practices, and features to support continuous rationale management beneficial for practitioners. Problems of rationale management in practice are threefold: First, documenting decision knowledge is intrusive in the development process and an additional effort. Second, the high amount of distributed decision knowledge documentation is difficult to access and use. Third, the documented knowledge can be of low quality, e.g., outdated, which impedes its use. The dissertation contributes a systematic mapping study on recommendation and classification approaches to treat the rationale management problems. The major contribution of this dissertation is a validated approach for continuous rationale management consisting of the ConRat life cycle model extension and the comprehensive ConDec tool support. To reduce intrusiveness and additional effort, ConRat integrates rationale management activities into existing workflows, such as requirements elicitation, development, and meetings. ConDec integrates into standard development tools instead of providing a separate tool. ConDec enables lightweight capturing and use of decision knowledge from various artifacts and reduces the developers' effort through automatic text classification, recommendation, and nudging mechanisms for rationale management. To enable access and use of distributed decision knowledge documentation, ConRat defines a knowledge model of decision knowledge and other artifacts. ConDec instantiates the model as a knowledge graph and offers interactive knowledge views with useful tailoring, e.g., transitive linking. To operationalize high quality, ConRat introduces the rationale backlog, the definition of done for knowledge documentation, and metrics for intra-rationale completeness and decision coverage of requirements and code. ConDec implements these agile concepts for rationale management and a knowledge dashboard. ConDec also supports consistent changes through change impact analysis. The dissertation shows the feasibility, effectiveness, and user acceptance of ConRat and ConDec in six case study projects in an industrial setting. Besides, it comprehensively analyses the rationale documentation created in the projects. The validation indicates that ConRat and ConDec benefit CSE projects. Based on the dissertation, continuous rationale management should become a standard part of CSE, like automated testing or continuous integration
    corecore