13 research outputs found

    Search-based composed refactorings

    Refactorings are commonly applied to source code to improve its structure and maintainability. Integrated development environments (IDEs) such as Eclipse or NetBeans offer refactoring support for various programming languages. Usually, the developer makes a particular selection in the source code, and chooses to apply one of the refactorings, which is then executed (with suitable pre-condition checks) by the IDE. Here, we study how we can reuse two existing refactorings to implement a more complex refactoring, and use heuristics to derive suitable input arguments for the new refactoring. We show that our combination of the Extract Method and Move Method refactoring can automatically improve the code quality on a large Java code base.
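
    As a concrete illustration of such a composed refactoring (not an example from the paper; the class and method names below are invented), chaining Extract Method and Move Method might turn an inline computation in one class into a method on the class that owns the data:

// Hypothetical before/after of a composed Extract Method + Move Method refactoring.
// Before: Order.printInvoice() computes the tax inline, using Customer's data.
class Order {
    double net;
    Customer customer;

    void printInvoice() {
        // tax computation tangled into the printing logic
        double tax = net * (customer.isExempt() ? 0.0 : 0.19);
        System.out.println("total: " + (net + tax));
    }
}

// After: step 1 (Extract Method) isolates the computation as taxFor(net);
// step 2 (Move Method) moves it to Customer, where the data it reads lives.
class Customer {
    private final boolean exempt;
    Customer(boolean exempt) { this.exempt = exempt; }
    boolean isExempt() { return exempt; }

    double taxFor(double net) {
        return net * (isExempt() ? 0.0 : 0.19);
    }
}

class OrderAfter {
    double net;
    Customer customer;

    void printInvoice() {
        double tax = customer.taxFor(net);
        System.out.println("total: " + (net + tax));
    }
}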

    SapFix: Automated End-To-End Repair at Scale

    We report our experience with SapFix: the first deployment of automated end-to-end fault fixing, from test case design through to deployed repairs in production code. We have used SapFix at Facebook to repair 6 production systems, each consisting of tens of millions of lines of code and collectively used by hundreds of millions of people worldwide.

    30 Years of Software Refactoring Research: A Systematic Literature Review

    Peer reviewed. https://deepblue.lib.umich.edu/bitstream/2027.42/155872/4/30YRefactoring.pd

    Large-scale semi-automated migration of legacy C/C++ test code

    This is an industrial experience report on a large semi-automated migration of legacy test code in C and C++. The particular migration was enabled by automating most of the maintenance steps; without automation, this large-scale migration would not have been conducted, due to the risks involved in manual maintenance (the risk of introducing errors, the risk of unexpected rework, and the loss of productivity). We describe and evaluate the method of automation we used on this real-world case. The benefits were that, by automating analysis, we could make sure we understood all the relevant details for the envisioned maintenance without having to manually read the code and check our theories, and that, by automating transformations, we could iterate on and improve complex, large-scale source code updates until they were “just right.” The drawbacks were that, first, we had to learn new metaprogramming skills, and second, our automation scripts are not readily reusable in other contexts, since they were necessarily developed for this ad hoc maintenance task. Our analysis shows that automated software maintenance, compared to the (hypothetical) manual alternative, seems better both at avoiding mistakes and at avoiding the rework those mistakes would cause. It seems that necessary and beneficial source code maintenance need not be avoided if software engineers are enabled to create bespoke (and ad hoc) analysis and transformation tools to support it.
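
    The report's own scripts are not shown here; as a minimal sketch of what "automating transformations" can mean in practice, the hypothetical program below applies one mechanical rewrite (renaming an assertion macro) uniformly across a tree of legacy C/C++ test files, so the change can be reviewed, adjusted, and re-run until it is "just right". The directory layout and the CHECK_EQUAL/ASSERT_EQ rename are assumptions, not the transformations used in the reported migration.

// Hypothetical sketch: batch-rewrite a deprecated assertion macro in legacy
// C/C++ test files, so the change is applied uniformly instead of by hand.
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.*;
import java.util.stream.Stream;

public class MigrateAsserts {
    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : "tests");
        try (Stream<Path> files = Files.walk(root)) {
            files.filter(p -> p.toString().endsWith(".cpp") || p.toString().endsWith(".c"))
                 .forEach(MigrateAsserts::rewrite);
        }
    }

    static void rewrite(Path file) {
        try {
            String src = Files.readString(file);
            // Purely illustrative transformation: rename an old macro and
            // report what changed, so the update can be reviewed and re-run.
            String out = src.replaceAll("\\bCHECK_EQUAL\\(", "ASSERT_EQ(");
            if (!out.equals(src)) {
                Files.writeString(file, out);
                System.out.println("rewrote " + file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}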

    30 Years of Software Refactoring Research: A Systematic Literature Review

    Due to the growing complexity of software systems, the last ten years have seen a dramatic increase in research on, and industry demand for, tools and techniques for software refactoring, traditionally defined as a set of program transformations intended to improve the system design while preserving the behavior. Refactoring studies have expanded beyond code-level restructuring to be applied at different levels (architecture, model, requirements, etc.), adopted in many domains beyond the object-oriented paradigm (cloud computing, mobile, web, etc.), used in industrial settings, and aimed at objectives beyond improving the design, including other non-functional requirements (e.g., performance and security). Thus, the challenges addressed by refactoring work nowadays go beyond code transformation to include, but are not limited to, scheduling the opportune time to carry out refactoring, recommending specific refactoring activities, detecting refactoring opportunities, and testing the correctness of applied refactorings. The refactoring research effort is therefore fragmented over several research communities, various domains, and objectives. To structure the field and existing research results, this paper provides a systematic literature review and analyzes the results of 3183 research papers on refactoring covering the last three decades, offering the most scalable and comprehensive literature review of existing refactoring research studies. Based on this survey, we created a taxonomy to classify the existing research, identified research trends, and highlighted gaps in the literature and avenues for further research.

    Improving regression testing efficiency and reliability via test-suite transformations

    As software becomes more important and ubiquitous, high-quality software also becomes crucial. Developers constantly make changes to improve software, and they rely on regression testing—the process of running tests after every change—to ensure that changes do not break existing functionality. Regression testing is widely used both in industry and in open source, but it suffers from two main challenges. (1) Regression testing is costly. Developers run a large number of tests in the test suite after every change, and changes happen very frequently. The cost lies both in the time developers spend waiting for the tests to finish so that they know whether the changes break existing functionality, and in the monetary cost of running the tests on machines. (2) Regression test suites contain flaky tests, which nondeterministically pass or fail when run on the same version of code, regardless of any changes. Flaky test failures can mislead developers into believing that their changes break existing functionality, even though those tests can fail without any changes; developers therefore waste time trying to debug nonexistent faults in their changes. This dissertation proposes three lines of work that address these challenges through test-suite transformations that modify test suites to make them more efficient or more reliable: two lines of work explore how to reduce the cost of regression testing, and one explores how to fix existing flaky tests.

First, this dissertation investigates the effectiveness of test-suite reduction (TSR), a traditional test-suite transformation that removes tests deemed redundant with respect to other tests in the test suite based on heuristics. TSR outputs a smaller, reduced test suite to be run in the future. However, TSR risks removing tests that could detect faults in future changes. While TSR was proposed over two decades ago, it was always evaluated using program versions with seeded faults; such evaluations do not precisely predict the effectiveness of the reduced test suite on future changes. This dissertation evaluates TSR in a real-world setting, using real software evolution with real test failures. The results show that TSR techniques proposed in the past are not as effective as suggested by traditional TSR metrics, and those same metrics do not predict how effective a reduced test suite will be in the future. Researchers need to propose either new TSR techniques that produce more effective reduced test suites or better metrics for predicting the effectiveness of reduced test suites.

Second, this dissertation proposes a new transformation, implemented in a technique called TestOptimizer, that reduces regression testing cost under a modern build system by optimizing the placement of tests. Modern build systems treat a software project as a group of inter-dependent modules, including test modules that contain only tests. When developers make a change, the build system can use a developer-specified dependency graph among modules to determine which test modules are affected by the changed modules, and run only the tests in the affected test modules. However, suboptimal placements of tests, where developers place tests in a module that has more dependencies than the tests actually need, lead to wasteful test executions: more tests run than necessary after a change. TestOptimizer analyzes a project and proposes moving tests to reduce the number of test executions triggered over time by developer changes. Evaluation of TestOptimizer on five large proprietary projects at Microsoft shows that the suggested test movements can reduce test executions by 21.7 million (17.1%) across all evaluation projects. Developers accepted and intend to implement 84.4% of the reported suggestions.

Third, to make regression testing more reliable, this dissertation proposes iFixFlakies, a framework for fixing a prominent kind of flaky test: order-dependent tests. Order-dependent tests pass or fail depending on the order in which the tests are run. Intuitively, order-dependent tests fail either because they need another test to set up the state for them to pass, or because some other test pollutes the state before they are run, and the polluted state makes them fail. The key insight behind iFixFlakies is that test suites often already have tests, which we call helpers, that contain the logic for setting or resetting the state needed for order-dependent tests to pass. iFixFlakies searches a test suite for these helpers and then recommends patches for order-dependent tests using code from the helpers. Evaluation of iFixFlakies on 137 truly order-dependent tests from a public dataset shows that 81 of them have helpers, and iFixFlakies can fix all 81. Furthermore, among our GitHub pull requests for 78 of these order-dependent tests (3 of the 81 had already been fixed), developers accepted 38; the remaining ones are still pending, and none have been rejected so far.
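
    As an invented illustration of the order-dependent tests and helpers described above (not taken from the dissertation's dataset), the JUnit class below contains a polluter, a victim that fails when run after it, and a helper whose cleanup logic a fix in the spirit of iFixFlakies could reuse:

// Hypothetical JUnit 4 example of an order-dependent test and a helper.
import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class CacheTests {
    static final java.util.Map<String, String> CACHE = new java.util.HashMap<>();

    // Polluter: leaves shared state behind after it runs.
    @Test
    public void testPut() {
        CACHE.put("k", "v");
        assertEquals(1, CACHE.size());
    }

    // Order-dependent victim: fails if testPut ran first and polluted CACHE.
    @Test
    public void testStartsEmpty() {
        assertEquals(0, CACHE.size());
    }

    // Helper: already contains the state-resetting logic. A fix in the spirit
    // of iFixFlakies would copy this cleanup into a setup/teardown method (or
    // into testStartsEmpty itself) so the victim passes in any test order.
    @Test
    public void testClear() {
        CACHE.clear();
        assertEquals(0, CACHE.size());
    }
}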

    Automated Software Transplantation

    Automated program repair has excited researchers for more than a decade, yet it has yet to find full-scale deployment in industry. We report our experience with SAPFIX: the first deployment of automated end-to-end fault fixing, from test case design through to deployed repairs in production code. We have used SAPFIX at Facebook to repair 6 production systems, each consisting of tens of millions of lines of code and collectively used by hundreds of millions of people worldwide. In its first three months of operation, SAPFIX produced 55 repair candidates for 57 crashes reported to it, of which 27 have been deemed correct by developers and 14 have been landed into production automatically by SAPFIX. SAPFIX has thus demonstrated the potential of the search-based repair research agenda by deploying, to hundreds of millions of users worldwide, software systems that have been automatically tested and repaired.

Automated software transplantation (autotransplantation) is a form of automated software engineering in which we use search-based software engineering to automatically move a functionality of interest from a 'donor' program that implements it into a 'host' program that lacks it. Autotransplantation is a kind of automated program repair in which we repair the 'host' program by augmenting it with the missing functionality. Automated software transplantation would open many exciting avenues for software development: suppose we could autotransplant code from one system into another, entirely unrelated, system, potentially written in a different programming language. Being able to do so could greatly enhance software engineering practice while reducing costs. Automated software transplantation comes in two flavours: monolingual, when the languages of the host and donor programs are the same, and multilingual, when the languages differ. This thesis introduces a theory of automated software transplantation and two algorithms, implemented in two tools: ÎŒSCALPEL for monolingual software transplantation and τSCALPEL for multilingual software transplantation. Leveraging lightweight annotation, program analysis identifies an organ (interesting behaviour to transplant); testing validates that the organ exhibits the desired behaviour during its extraction and after its implantation into a host. We report encouraging results: in 14 of 17 monolingual transplantation experiments involving 6 donors and 4 hosts (popular real-world systems), we successfully autotransplanted 6 new functionalities; and in 10 out of 10 multilingual transplantation experiments involving 10 donors and 10 hosts (popular real-world systems written in 4 different programming languages), we successfully autotransplanted 10 new functionalities. That is, the transplanted systems passed all the test suites that validate the new functionality's behaviour and confirm that the initial program behaviour is preserved; additionally, we manually checked the behaviour exercised by the organ. Autotransplantation is also very useful: in just 26 hours of computation time we successfully autotransplanted the H.264 video encoding functionality from the x264 system into the VLC media player, a task that the developers of VLC have been carrying out manually for the past 12 years. We also autotransplanted call-graph generation and indentation for C programs into Kate (a popular KDE-based text editor used as an IDE by many C developers), two features missing from Kate but requested by its users. Autotransplantation is also efficient: the total runtime across 15 monolingual transplants is five and a half hours, and the total runtime across 10 multilingual transplants is 33 hours.
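
    Purely as an illustrative sketch of the transplantation setup (the marker comments, names, and program are invented and are not ÎŒSCALPEL or τSCALPEL syntax), a donor's organ entry point and a host's implantation point might be identified like this; in a real transplant the organ and the code it depends on are extracted into the host rather than called across programs:

// Invented sketch of the autotransplantation setup: donor organ, host
// implantation point, and test-based validation. Not the tools' real syntax.
class Donor {
    // ORGAN_ENTRY: the behaviour we want to transplant starts here.
    static String slugify(String title) {
        return title.trim().toLowerCase()
                    .replaceAll("[^a-z0-9]+", "-")
                    .replaceAll("(^-+)|(-+$)", "");
    }
}

class Host {
    static void publish(String title, String body) {
        // IMPLANTATION_POINT: the transplanted organ is wired in here.
        String url = "/posts/" + Donor.slugify(title);
        System.out.println("published at " + url + " (" + body.length() + " bytes)");
    }
}

// Validation: organ tests exercise the transplanted behaviour, while the
// host's own tests check that its original behaviour is preserved.
class TransplantCheck {
    public static void main(String[] args) {
        assert Donor.slugify("Hello, World!").equals("hello-world");
        Host.publish("Hello, World!", "...");
    }
}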

    Explainable, Security-Aware and Dependency-Aware Framework for Intelligent Software Refactoring

    As software systems continue to grow in size and complexity, their maintenance becomes more challenging and costly. Even for the most technologically sophisticated and competent organizations, building and maintaining high-performing software applications with high-quality code is an extremely challenging and expensive endeavor. Software refactoring is widely recognized as a key component of maintaining high-quality software by restructuring existing code and reducing technical debt. However, refactoring is difficult to achieve and often neglected due to several limitations of existing refactoring techniques that reduce their effectiveness. These limitations include, but are not limited to, detecting refactoring opportunities, recommending specific refactoring activities, and explaining the recommended changes. Existing techniques mainly focus on the use of quality metrics such as coupling, cohesion, and the Quality Model for Object-Oriented Design (QMOOD). However, this work identifies many other factors that assist and facilitate different maintenance activities for developers:

1. To structure the refactoring field and existing research results, this dissertation provides the most scalable and comprehensive systematic literature review to date, analyzing the results of 3183 research papers on refactoring covering the last three decades. Based on this survey, we created a taxonomy to classify the existing research, identified research trends, and highlighted gaps in the literature for further research.

2. To draw attention to what the current refactoring research focus should be from the developers' perspective, we carried out the first large-scale refactoring study on the most popular online Q&A forum for developers, Stack Overflow. We collected and analyzed posts to identify what developers ask about refactoring and the challenges practitioners face when refactoring software systems.

3. To improve the detection of refactoring opportunities in terms of quality and security in the context of mobile apps, we designed a framework that recommends the files to be refactored based on user reviews. We also considered the detection of refactoring opportunities in the context of web services, proposing a machine-learning-based approach that helps service providers and subscribers predict the quality of service at the least cost. Furthermore, to help developers accurately assess the quality of their software systems and decide whether the code should be refactored, we propose a clustering-based approach to automatically identify the preferred benchmark to use for the quality assessment of a project.

4. Regarding the refactoring generation process, we proposed different techniques to enhance the change operators and seeding mechanism by using the history of applied refactorings and incorporating refactoring dependencies, in order to improve the quality of the refactoring solutions. We also introduced the security aspect when generating refactoring recommendations, by investigating the possible impact of improving different quality attributes on a set of security metrics and finding the best trade-off between them. In another approach, we recommend refactorings that prioritize fixing quality issues in security-critical files, improve quality attributes, and remove code smells.

All the above contributions were validated at large scale on thousands of open-source and industry projects, in collaboration with industry partners and the open-source community. The contributions of this dissertation are integrated into a cloud-based refactoring framework that is currently used by practitioners.

Ph.D. dissertation, College of Engineering & Computer Science, University of Michigan-Dearborn. http://deepblue.lib.umich.edu/bitstream/2027.42/171082/1/Chaima Abid Final Dissertation.pdf
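
    The dissertation abstract does not give a formula, but as a hedged illustration of the quality/security trade-off mentioned in point 4 above, a search-based recommender might score a candidate refactoring with a quality term and a security penalty; the metric names, weights, and values below are assumptions, not the dissertation's actual fitness function:

// Hypothetical fitness sketch for weighing quality gain against security impact
// when ranking candidate refactoring sequences; all names and weights invented.
class RefactoringCandidate {
    double couplingBefore, couplingAfter;   // lower coupling is better
    double cohesionBefore, cohesionAfter;   // higher cohesion is better
    double criticalClassesTouched;          // security-sensitive classes modified

    RefactoringCandidate(double cb, double ca, double hb, double ha, double crit) {
        couplingBefore = cb; couplingAfter = ca;
        cohesionBefore = hb; cohesionAfter = ha;
        criticalClassesTouched = crit;
    }

    // Quality improvement minus a penalty for churn in security-critical code.
    double fitness(double wQuality, double wSecurity) {
        double qualityGain = (couplingBefore - couplingAfter)
                           + (cohesionAfter - cohesionBefore);
        return wQuality * qualityGain - wSecurity * criticalClassesTouched;
    }

    public static void main(String[] args) {
        RefactoringCandidate moveMethod =
            new RefactoringCandidate(0.8, 0.5, 0.4, 0.6, 1.0);
        System.out.println("score = " + moveMethod.fitness(1.0, 0.5));
    }
}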

    Integration of an assisted C code translation tool into a system-level design flow for the creation of hardware coprocessors

    Since the beginning of electronic system design, one of the main goals pursued in research has been to provide smarter and faster tools and workflows. This is clearly shown in the gap between the growth in the number of transistors per chip and the number of transistors an engineer can integrate in a month. Over the years, methods have emerged that allow designers to work at ever higher levels of abstraction, easing the integration and distribution of hardware components. In addition to these abstractions, tools now automate some of the more tedious and repetitive design tasks; high-level synthesis, for instance, turns a high-level specification into a working register-transfer-level design in a guided way. Numerous workflows have also appeared to provide a more definite framework for electronic system design. The flow we adopt is the electronic system level (ESL) flow, an incremental method that starts from the most abstract specification possible and progressively relaxes abstractions until a hardware description is obtained. However, no comprehensive tool covers all stages of the ESL flow. We therefore propose an assisted translation tool for C code to complete an ESL design flow for the creation of hardware coprocessors. The proposed tool, C2Space, converts sequential C code into code organized as SpaceStudio modules in a fast and configurable way. This translation tool, based on the Clang transpiler, performs static analysis of the code to make the appropriate translation choices. C2Space also allows the integration, upstream of the flow, of a dynamic analysis solution for C code called Pareon Profile. Pareon provides a first analysis of the sequential code to quickly identify the sections of code that could benefit from being moved to a coprocessor (because of their parallelism potential and their weight in the sequential execution). Once the translation is done, the rest of the flow is supported by the SpaceStudio co-design environment, which enables architectural exploration of the translated code: different mappings of modules to hardware resources (processors, coprocessors) can be tested, the modules allocated to hardware are synthesized with a high-level synthesis tool, and the complete system can be exported to one of the supported hardware platforms (e.g., Xilinx FPGAs). We exercised this flow on an edge detection algorithm (a Canny filter) to test its effectiveness. The results show that the proposed approach enables rapid refinement of the algorithm, from its logical definition down to its implementation. We were not able to accelerate the execution of the algorithm as originally envisioned, but we offer several possible explanations for this result. We conclude with a series of recommendations to improve both C2Space and the proposed flow. In particular, we describe a new metric to be used early in the design flow that relates (1) the parallelism potential of a code segment, (2) that segment's share of the total execution time, (3) the processor/coprocessor communication cost for that segment, and (4) the software execution time of the same piece of code.