4 research outputs found

    Improving software engineering processes using machine learning and data mining techniques

    Get PDF
    The availability of large amounts of data from software development has created an area of research called mining software repositories. Researchers mine data from software repositories both to improve understanding of software development and evolution, and to empirically validate novel ideas and techniques. The large amount of data collected from software processes can then be leveraged for machine learning applications. Indeed, machine learning can have a large impact in software engineering, just like it has had in other fields, supporting developers, and other actors involved in the software development process, in automating or improving parts of their work. The automation can not only make some phases of the development process less tedious or cheaper, but also more efficient and less prone to errors. Moreover, employing machine learning can reduce the complexity of difficult problems, enabling engineers to focus on more interesting problems rather than the basics of development. The aim of this dissertation is to show how the development and the use of machine learning and data mining techniques can support several software engineering phases, ranging from crash handling, to code review, to patch uplifting, to software ecosystem management. To validate our thesis we conducted several studies tackling different problems in an industrial open-source context, focusing on the case of Mozilla

    Understanding the Impact of Release Processes and Practices on Software Quality

    Get PDF
    L’ingénierie de production (release engineering) englobe toutes les activités visant à «construire un pipeline qui transforme le code source en un produit intégré, compilé, empaqueté, testé et signé prêt à être publier». La stratégie des production et les pratiques de publication peuvent avoir un impact sur la qualité d’un produit logiciel. Bien que cet impact ait été longuement discuté et étudié dans la communauté du génie logiciel, il reste encore plusieurs problèmes à résoudre. Cette thèse s’attaque à quelque-uns de ces problèmes non résoulus de l’ingénierie de production en vue de proposer des solutions. En particulier, nous investigons : 1) pourquoi les activités de révision de code (code review) peuvent rater des erreurs de code susceptibles de causer des plantages (crashs); (2) comment prévenir les bogues lors de l’approbation et l’intégration des patches urgents; 3) dans un écosystème logiciel, comment atténuer le risque de bogues dus à des injections de DLL. Nous avons choisi d’étudier ces problèmes car ils correspondent à trois phases importantes des processus de production de logiciels, c’est-à-dire la révision de code, les patches urgents, et la publication de logiciels dans un écosystème. Les solutions à ces problèmes peuvent aider les entreprises de logiciels à améliorer leur stratégie de production et de publication. Ce qui augmentera leur productivité de développement et la qualité générale de leurs produits logiciels.----------ABSTRACT: Release engineering encompasses all the activities aimed at “building a pipeline that transforms source code into an integrated, compiled, packaged, tested, and signed product that is ready for release”. The strategy of the release processes and practices can impact the quality of a software artefact. Although such impact has been extensively discussed and studied in the software engineering community, there are still many pending issues to resolve. The goal of this thesis is to study and solve some of these pending issues. More specifically, we examine 1) why code review practices can miss crash-prone code; 2) how urgent patches (also called patch uplift) are approved to release and how to prevent regressions due to urgent patches; 3) in a software ecosystem, how to mitigate the risk of defects due to DLL injections. We chose to study these problems because they correspond to three important phases of software release processes, i.e., code review, patch uplift, and releasing software in an ecosystem. The solutions of these problems can help software organizations improve their release strategy; increasing their development productivity and the overall user-perceived quality of their products

    Automated Software Transplantation

    Get PDF
    Automated program repair has excited researchers for more than a decade, yet it has yet to find full scale deployment in industry. We report our experience with SAPFIX: the first deployment of automated end-to-end fault fixing, from test case design through to deployed repairs in production code. We have used SAPFIX at Facebook to repair 6 production systems, each consisting of tens of millions of lines of code, and which are collectively used by hundreds of millions of people worldwide. In its first three months of operation, SAPFIX produced 55 repair candidates for 57 crashes reported to SAPFIX, of which 27 have been deem as correct by developers and 14 have been landed into production automatically by SAPFIX. SAPFIX has thus demonstrated the potential of the search-based repair research agenda by deploying, to hundreds of millions of users worldwide, software systems that have been automatically tested and repaired. Automated software transplantation (autotransplantation) is a form of automated software engineering, where we use search based software engineering to be able to automatically move a functionality of interest from a ‘donor‘ program that implements it into a ‘host‘ program that lacks it. Autotransplantation is a kind of automated program repair where we repair the ‘host‘ program by augmenting it with the missing functionality. Automated software transplantation would open many exciting avenues for software development: suppose we could autotransplant code from one system into another, entirely unrelated, system, potentially written in a different programming language. Being able to do so might greatly enhance the software engineering practice, while reducing the costs. Automated software transplantation manifests in two different flavors: monolingual, when the languages of the host and donor programs is the same, or multilingual when the languages differ. This thesis introduces a theory of automated software transplantation, and two algorithms implemented in two tools that achieve this: µSCALPEL for monolingual software transplantation and τSCALPEL for multilingual software transplantation. Leveraging lightweight annotation, program analysis identifies an organ (interesting behavior to transplant); testing validates that the organ exhibits the desired behavior during its extraction and after its implantation into a host. We report encouraging results: in 14 of 17 monolingual transplantation experiments involving 6 donors and 4 hosts, popular real-world systems, we successfully autotransplanted 6 new functionalities; and in 10 out of 10 multlingual transplantation experiments involving 10 donors and 10 hosts, popular real-world systems written in 4 different programming languages, we successfully autotransplanted 10 new functionalities. That is, we have passed all the test suites that validates the new functionalities behaviour and the fact that the initial program behaviour is preserved. Additionally, we have manually checked the behaviour exercised by the organ. Autotransplantation is also very useful: in just 26 hours computation time we successfully autotransplanted the H.264 video encoding functionality from the x264 system to the VLC media player, a task that is currently done manually by the developers of VLC, since 12 years ago. We autotransplanted call graph generation and indentation for C programs into Kate, (a popular KDE based test editor used as an IDE by a lot of C developers) two features currently missing from Kate, but requested by the users of Kate. Autotransplantation is also efficient: the total runtime across 15 monolingual transplants is 5 hours and a half; the total runtime across 10 multilingual transplants is 33 hours

    Estado de la práctica de la Ingeniería de Requisitos en proyectos de Software Open Source

    Get PDF
    El objetivo de este trabajo es realizar un estudio exploratorio sobre los procesos utilizados por las comunidades OSS para la gestión de requisitos con el fin de obtener una base de conocimientos para la sólida planificación de un estudio internacional a gran escala que se llevará a cabo posteriormente. Así pues, el objetivo de esta tesis de máster es: “Comprender y caracterizar los procesos e infraestructuras utilizadas por comunidades OSS para la elicitación y gestión de requisitos”
    corecore