13 research outputs found

    Search-based composed refactorings

    Refactorings are commonly applied to source code to improve its structure and maintainability. Integrated development environments (IDEs) such as Eclipse or NetBeans offer refactoring support for various programming languages. Usually, the developer makes a particular selection in the source code, and chooses to apply one of the refactorings, which is then executed (with suitable pre-condition checks) by the IDE. Here, we study how we can reuse two existing refactorings to implement a more complex refactoring, and use heuristics to derive suitable input arguments for the new refactoring. We show that our combination of the Extract Method and Move Method refactoring can automatically improve the code quality on a large Java code base.
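
    As a concrete illustration of such a composed refactoring (not an example from the paper; the class and method names below are invented), chaining Extract Method and Move Method might turn an inline computation in one class into a method on the class that owns the data:

// Hypothetical before/after of a composed Extract Method + Move Method refactoring.
// Before: Order.printInvoice() computes the tax inline, using Customer's data.
class Order {
    double net;
    Customer customer;

    void printInvoice() {
        // tax computation tangled into the printing logic
        double tax = net * (customer.isExempt() ? 0.0 : 0.19);
        System.out.println("total: " + (net + tax));
    }
}

// After: step 1 (Extract Method) isolates the computation as taxFor(net);
// step 2 (Move Method) moves it to Customer, where the data it reads lives.
class Customer {
    private final boolean exempt;
    Customer(boolean exempt) { this.exempt = exempt; }
    boolean isExempt() { return exempt; }

    double taxFor(double net) {
        return net * (isExempt() ? 0.0 : 0.19);
    }
}

class OrderAfter {
    double net;
    Customer customer;

    void printInvoice() {
        double tax = customer.taxFor(net);
        System.out.println("total: " + (net + tax));
    }
}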

    SapFix: Automated End-To-End Repair at Scale

    We report our experience with SapFix: the first deployment of automated end-to-end fault fixing, from test case design through to deployed repairs in production code. We have used SapFix at Facebook to repair 6 production systems, each consisting of tens of millions of lines of code and collectively used by hundreds of millions of people worldwide.

    30 Years of Software Refactoring Research: A Systematic Literature Review

    Peer reviewed. https://deepblue.lib.umich.edu/bitstream/2027.42/155872/4/30YRefactoring.pd

    Large-scale semi-automated migration of legacy C/C++ test code

    This is an industrial experience report on a large semi-automated migration of legacy test code in C and C++. The particular migration was enabled by automating most of the maintenance steps; without automation, this large-scale migration would not have been conducted, due to the risks involved in manual maintenance (the risk of introducing errors, the risk of unexpected rework, and the loss of productivity). We describe and evaluate the method of automation we used on this real-world case. The benefits were that, by automating analysis, we could make sure we understood all the relevant details for the envisioned maintenance without having to manually read the code and check our theories, and that, by automating transformations, we could iterate on and improve complex, large-scale source code updates until they were “just right.” The drawbacks were that, first, we had to learn new metaprogramming skills, and second, our automation scripts are not readily reusable in other contexts, since they were necessarily developed for this ad hoc maintenance task. Our analysis shows that automated software maintenance, compared to the (hypothetical) manual alternative, seems better both at avoiding mistakes and at avoiding the rework those mistakes would cause. It seems that necessary and beneficial source code maintenance need not be avoided if software engineers are enabled to create bespoke (and ad hoc) analysis and transformation tools to support it.
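
    The report's own scripts are not shown here; as a minimal sketch of what "automating transformations" can mean in practice, the hypothetical program below applies one mechanical rewrite (renaming an assertion macro) uniformly across a tree of legacy C/C++ test files, so the change can be reviewed, adjusted, and re-run until it is "just right". The directory layout and the CHECK_EQUAL/ASSERT_EQ rename are assumptions, not the transformations used in the reported migration.

// Hypothetical sketch: batch-rewrite a deprecated assertion macro in legacy
// C/C++ test files, so the change is applied uniformly instead of by hand.
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.*;
import java.util.stream.Stream;

public class MigrateAsserts {
    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : "tests");
        try (Stream<Path> files = Files.walk(root)) {
            files.filter(p -> p.toString().endsWith(".cpp") || p.toString().endsWith(".c"))
                 .forEach(MigrateAsserts::rewrite);
        }
    }

    static void rewrite(Path file) {
        try {
            String src = Files.readString(file);
            // Purely illustrative transformation: rename an old macro and
            // report what changed, so the update can be reviewed and re-run.
            String out = src.replaceAll("\\bCHECK_EQUAL\\(", "ASSERT_EQ(");
            if (!out.equals(src)) {
                Files.writeString(file, out);
                System.out.println("rewrote " + file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}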

    30 Years of Software Refactoring Research: A Systematic Literature Review

    Due to the growing complexity of software systems, the last ten years have seen a dramatic increase in research on, and industry demand for, tools and techniques for software refactoring, traditionally defined as a set of program transformations intended to improve the system design while preserving the behavior. Refactoring studies have expanded beyond code-level restructuring to be applied at different levels (architecture, model, requirements, etc.), adopted in many domains beyond the object-oriented paradigm (cloud computing, mobile, web, etc.), used in industrial settings, and aimed at objectives beyond improving the design, including other non-functional requirements (e.g., performance and security). Thus, the challenges addressed by refactoring work nowadays go beyond code transformation to include, but are not limited to, scheduling the opportune time to carry out refactoring, recommending specific refactoring activities, detecting refactoring opportunities, and testing the correctness of applied refactorings. The refactoring research effort is therefore fragmented over several research communities, various domains, and objectives. To structure the field and existing research results, this paper provides a systematic literature review and analyzes the results of 3183 research papers on refactoring covering the last three decades, offering the most scalable and comprehensive literature review of existing refactoring research studies. Based on this survey, we created a taxonomy to classify the existing research, identified research trends, and highlighted gaps in the literature and avenues for further research.

    Improving regression testing efficiency and reliability via test-suite transformations

    As software becomes more important and ubiquitous, high-quality software also becomes crucial. Developers constantly make changes to improve software, and they rely on regression testing—the process of running tests after every change—to ensure that changes do not break existing functionality. Regression testing is widely used both in industry and in open source, but it suffers from two main challenges. (1) Regression testing is costly. Developers run a large number of tests in the test suite after every change, and changes happen very frequently. The cost lies both in the time developers spend waiting for the tests to finish so that they know whether the changes break existing functionality, and in the monetary cost of running the tests on machines. (2) Regression test suites contain flaky tests, which nondeterministically pass or fail when run on the same version of code, regardless of any changes. Flaky test failures can mislead developers into believing that their changes break existing functionality, even though those tests can fail without any changes; developers therefore waste time trying to debug nonexistent faults in their changes. This dissertation proposes three lines of work that address these challenges through test-suite transformations that modify test suites to make them more efficient or more reliable: two lines of work explore how to reduce the cost of regression testing, and one explores how to fix existing flaky tests.

First, this dissertation investigates the effectiveness of test-suite reduction (TSR), a traditional test-suite transformation that removes tests deemed redundant with respect to other tests in the test suite based on heuristics. TSR outputs a smaller, reduced test suite to be run in the future. However, TSR risks removing tests that could detect faults in future changes. While TSR was proposed over two decades ago, it was always evaluated using program versions with seeded faults; such evaluations do not precisely predict the effectiveness of the reduced test suite on future changes. This dissertation evaluates TSR in a real-world setting, using real software evolution with real test failures. The results show that TSR techniques proposed in the past are not as effective as suggested by traditional TSR metrics, and those same metrics do not predict how effective a reduced test suite will be in the future. Researchers need to propose either new TSR techniques that produce more effective reduced test suites or better metrics for predicting the effectiveness of reduced test suites.

Second, this dissertation proposes a new transformation, implemented in a technique called TestOptimizer, that reduces regression testing cost under a modern build system by optimizing the placement of tests. Modern build systems treat a software project as a group of inter-dependent modules, including test modules that contain only tests. When developers make a change, the build system can use a developer-specified dependency graph among modules to determine which test modules are affected by the changed modules, and run only the tests in the affected test modules. However, suboptimal placements of tests, where developers place tests in a module that has more dependencies than the tests actually need, lead to wasteful test executions: more tests run than necessary after a change. TestOptimizer analyzes a project and proposes moving tests to reduce the number of test executions triggered over time by developer changes. Evaluation of TestOptimizer on five large proprietary projects at Microsoft shows that the suggested test movements can reduce test executions by 21.7 million (17.1%) across all evaluation projects. Developers accepted and intend to implement 84.4% of the reported suggestions.

Third, to make regression testing more reliable, this dissertation proposes iFixFlakies, a framework for fixing a prominent kind of flaky test: order-dependent tests. Order-dependent tests pass or fail depending on the order in which the tests are run. Intuitively, order-dependent tests fail either because they need another test to set up the state for them to pass, or because some other test pollutes the state before they are run, and the polluted state makes them fail. The key insight behind iFixFlakies is that test suites often already have tests, which we call helpers, that contain the logic for setting or resetting the state needed for order-dependent tests to pass. iFixFlakies searches a test suite for these helpers and then recommends patches for order-dependent tests using code from the helpers. Evaluation of iFixFlakies on 137 truly order-dependent tests from a public dataset shows that 81 of them have helpers, and iFixFlakies can fix all 81. Furthermore, among our GitHub pull requests for 78 of these order-dependent tests (3 of the 81 had already been fixed), developers accepted 38; the remaining ones are still pending, and none have been rejected so far.
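
    As an invented illustration of the order-dependent tests and helpers described above (not taken from the dissertation's dataset), the JUnit class below contains a polluter, a victim that fails when run after it, and a helper whose cleanup logic a fix in the spirit of iFixFlakies could reuse:

// Hypothetical JUnit 4 example of an order-dependent test and a helper.
import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class CacheTests {
    static final java.util.Map<String, String> CACHE = new java.util.HashMap<>();

    // Polluter: leaves shared state behind after it runs.
    @Test
    public void testPut() {
        CACHE.put("k", "v");
        assertEquals(1, CACHE.size());
    }

    // Order-dependent victim: fails if testPut ran first and polluted CACHE.
    @Test
    public void testStartsEmpty() {
        assertEquals(0, CACHE.size());
    }

    // Helper: already contains the state-resetting logic. A fix in the spirit
    // of iFixFlakies would copy this cleanup into a setup/teardown method (or
    // into testStartsEmpty itself) so the victim passes in any test order.
    @Test
    public void testClear() {
        CACHE.clear();
        assertEquals(0, CACHE.size());
    }
}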

    Automated Software Transplantation

    Automated program repair has excited researchers for more than a decade, yet it has yet to find full-scale deployment in industry. We report our experience with SAPFIX: the first deployment of automated end-to-end fault fixing, from test case design through to deployed repairs in production code. We have used SAPFIX at Facebook to repair 6 production systems, each consisting of tens of millions of lines of code and collectively used by hundreds of millions of people worldwide. In its first three months of operation, SAPFIX produced 55 repair candidates for 57 crashes reported to it, of which 27 have been deemed correct by developers and 14 have been landed into production automatically by SAPFIX. SAPFIX has thus demonstrated the potential of the search-based repair research agenda by deploying, to hundreds of millions of users worldwide, software systems that have been automatically tested and repaired.

Automated software transplantation (autotransplantation) is a form of automated software engineering in which we use search-based software engineering to automatically move a functionality of interest from a 'donor' program that implements it into a 'host' program that lacks it. Autotransplantation is a kind of automated program repair in which we repair the 'host' program by augmenting it with the missing functionality. Automated software transplantation would open many exciting avenues for software development: suppose we could autotransplant code from one system into another, entirely unrelated, system, potentially written in a different programming language. Being able to do so could greatly enhance software engineering practice while reducing costs. Automated software transplantation comes in two flavours: monolingual, when the languages of the host and donor programs are the same, and multilingual, when the languages differ. This thesis introduces a theory of automated software transplantation and two algorithms, implemented in two tools: ÎŒSCALPEL for monolingual software transplantation and τSCALPEL for multilingual software transplantation. Leveraging lightweight annotation, program analysis identifies an organ (interesting behaviour to transplant); testing validates that the organ exhibits the desired behaviour during its extraction and after its implantation into a host. We report encouraging results: in 14 of 17 monolingual transplantation experiments involving 6 donors and 4 hosts (popular real-world systems), we successfully autotransplanted 6 new functionalities; and in 10 out of 10 multilingual transplantation experiments involving 10 donors and 10 hosts (popular real-world systems written in 4 different programming languages), we successfully autotransplanted 10 new functionalities. That is, the transplanted systems passed all the test suites that validate the new functionality's behaviour and confirm that the initial program behaviour is preserved; additionally, we manually checked the behaviour exercised by the organ. Autotransplantation is also very useful: in just 26 hours of computation time we successfully autotransplanted the H.264 video encoding functionality from the x264 system into the VLC media player, a task that the developers of VLC have been carrying out manually for the past 12 years. We also autotransplanted call-graph generation and indentation for C programs into Kate (a popular KDE-based text editor used as an IDE by many C developers), two features missing from Kate but requested by its users. Autotransplantation is also efficient: the total runtime across 15 monolingual transplants is five and a half hours, and the total runtime across 10 multilingual transplants is 33 hours.
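
    Purely as an illustrative sketch of the transplantation setup (the marker comments, names, and program are invented and are not ÎŒSCALPEL or τSCALPEL syntax), a donor's organ entry point and a host's implantation point might be identified like this; in a real transplant the organ and the code it depends on are extracted into the host rather than called across programs:

// Invented sketch of the autotransplantation setup: donor organ, host
// implantation point, and test-based validation. Not the tools' real syntax.
class Donor {
    // ORGAN_ENTRY: the behaviour we want to transplant starts here.
    static String slugify(String title) {
        return title.trim().toLowerCase()
                    .replaceAll("[^a-z0-9]+", "-")
                    .replaceAll("(^-+)|(-+$)", "");
    }
}

class Host {
    static void publish(String title, String body) {
        // IMPLANTATION_POINT: the transplanted organ is wired in here.
        String url = "/posts/" + Donor.slugify(title);
        System.out.println("published at " + url + " (" + body.length() + " bytes)");
    }
}

// Validation: organ tests exercise the transplanted behaviour, while the
// host's own tests check that its original behaviour is preserved.
class TransplantCheck {
    public static void main(String[] args) {
        assert Donor.slugify("Hello, World!").equals("hello-world");
        Host.publish("Hello, World!", "...");
    }
}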

    Explainable, Security-Aware and Dependency-Aware Framework for Intelligent Software Refactoring

    As software systems continue to grow in size and complexity, their maintenance becomes more challenging and costly. Even for the most technologically sophisticated and competent organizations, building and maintaining high-performing software applications with high-quality code is an extremely challenging and expensive endeavor. Software refactoring is widely recognized as a key component of maintaining high-quality software by restructuring existing code and reducing technical debt. However, refactoring is difficult to achieve and often neglected due to several limitations of existing refactoring techniques that reduce their effectiveness. These limitations include, but are not limited to, detecting refactoring opportunities, recommending specific refactoring activities, and explaining the recommended changes. Existing techniques mainly focus on the use of quality metrics such as coupling, cohesion, and the Quality Model for Object-Oriented Design (QMOOD). However, this work identifies many other factors that assist and facilitate different maintenance activities for developers:

1. To structure the refactoring field and existing research results, this dissertation provides the most scalable and comprehensive systematic literature review to date, analyzing the results of 3183 research papers on refactoring covering the last three decades. Based on this survey, we created a taxonomy to classify the existing research, identified research trends, and highlighted gaps in the literature for further research.

2. To draw attention to what the current refactoring research focus should be from the developers' perspective, we carried out the first large-scale refactoring study on the most popular online Q&A forum for developers, Stack Overflow. We collected and analyzed posts to identify what developers ask about refactoring and the challenges practitioners face when refactoring software systems.

3. To improve the detection of refactoring opportunities in terms of quality and security in the context of mobile apps, we designed a framework that recommends the files to be refactored based on user reviews. We also considered the detection of refactoring opportunities in the context of web services, proposing a machine-learning-based approach that helps service providers and subscribers predict the quality of service at the least cost. Furthermore, to help developers accurately assess the quality of their software systems and decide whether the code should be refactored, we propose a clustering-based approach to automatically identify the preferred benchmark to use for the quality assessment of a project.

4. Regarding the refactoring generation process, we proposed different techniques to enhance the change operators and seeding mechanism by using the history of applied refactorings and incorporating refactoring dependencies, in order to improve the quality of the refactoring solutions. We also introduced the security aspect when generating refactoring recommendations, by investigating the possible impact of improving different quality attributes on a set of security metrics and finding the best trade-off between them. In another approach, we recommend refactorings that prioritize fixing quality issues in security-critical files, improve quality attributes, and remove code smells.

All the above contributions were validated at large scale on thousands of open-source and industry projects, in collaboration with industry partners and the open-source community. The contributions of this dissertation are integrated into a cloud-based refactoring framework that is currently used by practitioners.

Ph.D. dissertation, College of Engineering & Computer Science, University of Michigan-Dearborn. http://deepblue.lib.umich.edu/bitstream/2027.42/171082/1/Chaima Abid Final Dissertation.pdf
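
    The dissertation abstract does not give a formula, but as a hedged illustration of the quality/security trade-off mentioned in point 4 above, a search-based recommender might score a candidate refactoring with a quality term and a security penalty; the metric names, weights, and values below are assumptions, not the dissertation's actual fitness function:

// Hypothetical fitness sketch for weighing quality gain against security impact
// when ranking candidate refactoring sequences; all names and weights invented.
class RefactoringCandidate {
    double couplingBefore, couplingAfter;   // lower coupling is better
    double cohesionBefore, cohesionAfter;   // higher cohesion is better
    double criticalClassesTouched;          // security-sensitive classes modified

    RefactoringCandidate(double cb, double ca, double hb, double ha, double crit) {
        couplingBefore = cb; couplingAfter = ca;
        cohesionBefore = hb; cohesionAfter = ha;
        criticalClassesTouched = crit;
    }

    // Quality improvement minus a penalty for churn in security-critical code.
    double fitness(double wQuality, double wSecurity) {
        double qualityGain = (couplingBefore - couplingAfter)
                           + (cohesionAfter - cohesionBefore);
        return wQuality * qualityGain - wSecurity * criticalClassesTouched;
    }

    public static void main(String[] args) {
        RefactoringCandidate moveMethod =
            new RefactoringCandidate(0.8, 0.5, 0.4, 0.6, 1.0);
        System.out.println("score = " + moveMethod.fitness(1.0, 0.5));
    }
}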

    Integration of an assisted C code translation tool into a system-level design flow for the creation of hardware coprocessors

    Since the beginning of electronic system design, one of the main goals pursued in research has been to provide smarter and faster tools and workflows. This is clearly shown in the gap between the growth in the number of transistors per chip and the number of transistors an engineer can integrate in a month. Over the years, methods have emerged that allow designers to work at ever higher levels of abstraction, easing the integration and distribution of hardware components. In addition to these abstractions, tools now automate some of the more tedious and repetitive design tasks; high-level synthesis, for instance, turns a high-level specification into a working register-transfer-level design in a guided way. Numerous workflows have also appeared to provide a more definite framework for electronic system design. The flow we adopt is the electronic system level (ESL) flow, an incremental method that starts from the most abstract specification possible and progressively relaxes abstractions until a hardware description is obtained. However, no comprehensive tool covers all stages of the ESL flow. We therefore propose an assisted translation tool for C code to complete an ESL design flow for the creation of hardware coprocessors. The proposed tool, C2Space, converts sequential C code into code organized as SpaceStudio modules in a fast and configurable way. This translation tool, based on the Clang transpiler, performs static analysis of the code to make the appropriate translation choices. C2Space also allows the integration, upstream of the flow, of a dynamic analysis solution for C code called Pareon Profile. Pareon provides a first analysis of the sequential code to quickly identify the sections of code that could benefit from being moved to a coprocessor (because of their parallelism potential and their weight in the sequential execution). Once the translation is done, the rest of the flow is supported by the SpaceStudio co-design environment, which enables architectural exploration of the translated code: different mappings of modules to hardware resources (processors, coprocessors) can be tested, the modules allocated to hardware are synthesized with a high-level synthesis tool, and the complete system can be exported to one of the supported hardware platforms (e.g., Xilinx FPGAs). We exercised this flow on an edge detection algorithm (a Canny filter) to test its effectiveness. The results show that the proposed approach enables rapid refinement of the algorithm, from its logical definition down to its implementation. We were not able to accelerate the execution of the algorithm as originally envisioned, but we offer several possible explanations for this result. We conclude with a series of recommendations to improve both C2Space and the proposed flow. In particular, we describe a new metric to be used early in the design flow that relates (1) the parallelism potential of a code segment, (2) that segment's share of the total execution time, (3) the processor/coprocessor communication cost for that segment, and (4) the software execution time of the same piece of code.