9 research outputs found

    Variance of source code quality change caused by version control operations

    Get PDF
    Software maintenance consumes huge efforts. Its cost strongly depends on the quality of the source code: an easy-to-maintain code needs much less effort than the maintenance of a more problematic one. Based on experiences, the maintainability of the source code tends to decrease during its lifetime. However, in most of the cases, this decrease is not a smooth linear one, but there are larger and smaller ups and downs, and the net root of these changes generally results in a negative tendency. Detecting common development patterns which similarly influence the maintainability could help to stop or even turn back source code erosion. In this research the scale of the ups and downs are investigated, namely that which version control operations cause bigger and which smaller changes in the maintainability. We calculated the maintainability and collected the cardinality of each version control operation for every revision of four inspected software systems. With the help of these data we checked which version control operation causes higher absolute code quality changes and which lower. We found clear connection between version control operations and the variance of the maintainability changes. File Additions and file Deletions caused significantly higher variance in maintainability changes compared to file Updates. Commits containing higher number of operations - regardless of the type of the operation - caused higher variance in maintainability changes than those commits containing lower number of operations. As a practical conclusion, it is recommended to pay special attention to the quality of commits containing new file additions, e.g. with the help of a mandatory code review

    Refining code ownership with synchronous changes

    Get PDF
    When mining software repositories, two distinct sources of information are usually explored: the history log and snapshots of the system. Results of analyses derived from these two sources are biased by the frequency with which developers commit their changes. We argue that the usage of mainstream SCM (software configuration management) systems influences the way that developers work. For example, since it is tedious to resolve conflicts due to parallel commits, developers tend to minimize conflicts by not contemporarily modifying the same file. This however defeats one of the purposes of such systems. We mine repositories created by our tool Syde, which records changes in a central repository whenever a file is compiled locally in the IDE (integrated development environment) by any developer in a multi-developer project. This new source of information can augment the accuracy of analyses and breaks new ground in terms of how such information can assist developers. We illustrate how the information we mine provides a refined notion of code ownership with respect to the one inferred by SCM system data. We demonstrate our approach on three case studies, including an industrial one. Ownership models suffer from the assumption that developers have a perfect memory. To account for their imperfect memory, we integrate into our ownership measurement a model of memory retention, to simulate the effect of memory loss over time. We evaluate the characteristics of this model for several strengths of memor

    What the Smell? An Empirical Investigation on the Distribution and Severity of Test Smells in Open Source Android Applications

    Get PDF
    The widespread adoption of mobile devices, coupled with the ease of developing mobile-based applications (apps) has created a lucrative and competitive environment for app developers. Solely focusing on app functionality and time-to-market is not enough for developers to ensure the success of their app. Quality attributes exhibited by the app must also be a key focus point; not just at the onset of app development, but throughout its lifetime. The impact analysis of bad programming practices, or code smells, in production code has been the focus of numerous studies in software maintenance. Similar to production code, unit tests are also susceptible to bad programming practices which can have a negative impact not only on the quality of the software system but also on maintenance activities. With the present corpus of studies on test smells primarily on traditional applications, there is a need to fill the void in understanding the deviation of testing guidelines in the mobile environment. Furthermore, there is a need to understand the degree to which test smells are prevalent in mobile apps and the impact of such smells on app maintenance. Hence, the purpose of this research is to: (1) extend the existing set of bad test-code practices by introducing new test smells, (2) provide the software engineering community with an open-source test smell detection tool, and (3) perform a large-scale empirical study on test smell occurrence, distribution, and impact on the maintenance of open-source Android apps. Through multiple experiments, our findings indicate that most Android apps lack an automated verification of their testing mechanisms. As for the apps with existing test suites, they exhibit test smells early on in their lifetime with varying degrees of co-occurrences with different smell types. Our exploration of the relationship between test smells and technical debt proves that test smells are a strong measurement of technical debt. Furthermore, we observed positive correlations between specific smell types and highly changed/buggy test files. Hence, this research demonstrates that test smells can be used as indicators for necessary preventive software maintenance for test suites

    Pitfalls and Guidelines for Using Time-Based Git Data

    Get PDF
    Many software engineering research papers rely on time-based data (e.g., commit timestamps, issue report creation/update/close dates, release dates). Like most real-world data however, time-based data is often dirty. To date, there are no studies that quantify how frequently such data is used by the software engineering research community, or investigate sources of and quantify how often such data is dirty. Depending on the research task and method used, including such dirty data could aect the research results. This paper presents an extended survey of papers that utilize time-based data, published in the Mining Software Repositories (MSR) conference series. Out of the 754 technical track and data papers published in MSR 2004{2021, we saw at least 290 (38%) papers utilized time-based data. We also observed that most time-based data used in research papers comes in the form of Git commits, often from GitHub. Based on those results, we then used the Boa and Software Heritage infrastructures to help identify and quantify several sources of dirty Git timestamp data. Finally we provide guidelines/best practices for researchers utilizing time-based data from Git repositories

    Change-centric improvement of team collaboration

    Get PDF
    In software development, teamwork is essential to the successful delivery of a final product. The software industry has historically built software utilizing development teams that share the workplace. Process models, tools, and methodologies have been enhanced to support the development of software in a collocated setting. However, since the dawn of the 21st century, this scenario has begun to change: an increasing number of software companies are adopting global software development to cut costs and speed up the development process. Global software development introduces several challenges for the creation of quality software, from the adaptation of current methods, tools, techniques, etc., to new challenges imposed by the distributed setting, including physical and cultural distance between teams, communication problems, and coordination breakdowns. A particular challenge for distributed teams is the maintenance of a level of collaboration naturally present in collocated teams. Collaboration in this situation naturally d r ops due to low awareness of the activity of the team. Awareness is intrinsic to a collocated team, being obtained through human interaction such as informal conversation or meetings. For a distributed team, however, geographical distance and a subsequent lack of human interaction negatively impact this awareness. This dissertation focuses on the improvement of collaboration, especially within geographically dispersed teams. Our thesis is that by modeling the evolution of a software system in terms of fine-grained changes, we can produce a detailed history that may be leveraged to help developers collaborate. To validate this claim, we first c r eate a model to accurately represent the evolution of a system as sequences of fine- grained changes. We proceed to build a tool infrastructure able to capture and store fine-grained changes for both immediate and later use. Upon this foundation, we devise and evaluate a number of applications for our work with two distinct goals: 1. To assist developers with real-time information about the activity of the team. These applications aim to improve developers’ awareness of team member activity that can impact their work. We propose visualizations to notify developers of ongoing change activity, as well as a new technique for detecting and informing developers about potential emerging conflicts. 2. To help developers satisfy their needs for information related to the evolution of the software system. These applications aim to exploit the detailed change history generated by our approach in order to help developers find answers to questions arising during their work. To this end, we present two new measurements of code expertise, and a novel approach to replaying past changes according to user-defined criteria. We evaluate the approach and applications by adopting appropriate empirical methods for each case. A total of two case studies – one controlled experiment, and one qualitative user study – are reported. The results provide evidence that applications leveraging a fine-grained change history of a software system can effectively help developers collaborate in a distributed setting

    Aide à l'Intégration de Branches Grâce à la Réification des Changements

    Get PDF
    Developers typically change codebases in parallel from each other, which results in diverging codebases. Such diverging codebases must be integrated when finished. Integrating diverging codebases involves difficult activities. For example, two changes that are correct independently can introduce subtle bugs when integrated together. Integration can be difficult with existing tools, which, instead of dealing with the evolution of the actual program entities being changed, handle code changes as lines of text in files. Tools are important: software development tools have greatly improved from generic text editors to IDEs by providing high-level code manipulation such as automatic refactorings and code completion. This improvement was possible by the reification of program entities. Nevertheless, integration tools did not benefit from a similarreification of change entities to improve productivity in integration.In this work we first conducted a study to learn which integration activities are important and have little tool support. We discovered that one of such activities is the detection of tangled commits (that contain unrelated tasks such as a bug fix and a refactoring). Then we proposed Epicea, a reified change model and associated IDE tools, and EpiceaUntangler, an approach to help developers share untangled commits based on Epicea. The results of our evaluations with real-world studies show the usefulness of our approaches.Les développeurs changent le code source en parallèle les uns des autres, ce qui fait diverger les bases de code. Ces divergences se doivent d'être réintégrées.L'intégration de bases de code divergentes est une activité complexe. Par exemple, réunir deux bases de code indépendamment correctes peut générer des problèmes. L'intégration peut être difficile avec les outils existants, qui, au lieu de gérer l'évolution des entités réelles du programme modifié, gère les changements de code au niveau des lignes de texte dans les fichiers sources.Les outils sont importants: les outils de développement de logiciels se sont grandement améliorés en passant par exemple d'éditeurs de textegénériques à des IDEs qui fournissent de la manipulation de code de haut niveau tels que la refactorisation automatique et la complétion de code. Cette amélioration a été possible grâce à la réification des entités de programme. Néanmoins, les outils d'intégration n'ont pas profité d'une réification similaire des entités de changement pour améliorer l'intégration.Dans cette thèse nous avons d'abord conduit une étude auprès de développeurs pourcomprendre quelles sont les activités menées durant une intégration quisont peu supportées par les outils. L'une d'elle est la détection de commits mêlés (qui contiennent des tâches non liées telles qu'une correction de bug et une refactorisation).Ensuite, nous proposons Epicea, un modèle de changement réifié et des outils d'IDE associés, et EpiceaUntangler, une approche pour aider les développeurs à démêler les commits en se basant sur Epicea.Les résultats de nos évaluations avec des études de cas issues du monde réel montrent l’utilité de nos approches

    Mining a Change-Based Software Repository

    No full text
    corecore