Genetic improvement of programs
Genetic programming can optimise software, including: evolving test benchmarks, generating hyper-heuristics by searching meta-heuristics, generating communication protocols, composing telephony systems and web services, generating improved hashing and C++ heap managers, redundant programming and even automatic bug fixing. Particularly in embedded real-time or mobile systems, there may be many ways to trade off expenses (such as time, memory, energy, power consumption) vs. functionality. Human programmers cannot try them all. Also, the best multi-objective Pareto trade-off may change with time, underlying hardware, network connection or user behaviour. GP may be able to suggest different trade-offs automatically for each new market. Recent results include substantial speed-up by evolving a new version of a program customised for a special case.
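The multi-objective Pareto trade-off mentioned above can be illustrated with a minimal sketch. The program variants, their names and their runtime and energy figures are invented for illustration and do not come from the paper; both objectives are assumed to be minimised.

```python
from typing import List, Tuple

# Hypothetical program variants: (name, runtime_ms, energy_mj).
# All values are made up for illustration; both objectives are minimised.
Variant = Tuple[str, float, float]


def dominates(a: Variant, b: Variant) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(variants: List[Variant]) -> List[Variant]:
    """Keep only the variants that no other variant dominates."""
    return [v for v in variants
            if not any(dominates(other, v) for other in variants)]


variants = [
    ("original", 100.0, 50.0),
    ("fast", 60.0, 70.0),     # quicker, but uses more energy
    ("frugal", 120.0, 30.0),  # slower, but uses less energy
    ("bad", 130.0, 80.0),     # worse than "original" on both objectives
]
front_names = sorted(name for name, _, _ in pareto_front(variants))
print(front_names)
```

A search such as GP would generate the candidate variants; the filter only shows which of them are worth offering as alternative trade-offs.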
Genetic improvement of GPU software
We survey genetic improvement (GI) of general purpose computing on graphics cards. We summarise several experiments which demonstrate four themes. Experiments with the gzip program show that genetic programming can automatically port sequential C code to parallel code. Experiments with the StereoCamera program show that GI can upgrade legacy parallel code for new hardware and software. Experiments with NiftyReg and BarraCUDA show that GI can make substantial improvements to current parallel CUDA applications. Finally, experiments with the pknotsRG program show that with semi-automated approaches, enormous speed ups can sometimes be had by growing and grafting new code with genetic programming in combination with human input
Genetic improvement: A key challenge for evolutionary computation
Automatic Programming has long been a sub-goal of Artificial Intelligence (AI). It is feasible in limited domains. Genetic Improvement (GI) has expanded these dramatically to more than 100 000 lines of code by building on human written applications. Further scaling may need key advances in both Search Based Software Engineering (SBSE) and Evolutionary Computation (EC) research, particularly on representations, genetic operations, fitness landscapes, fitness surrogates, multi objective search and co-evolution
A mapping study on documentation in Continuous Software Development
Context: With an increase in Agile, Lean, and DevOps software methodologies over the last years (collectively referred to as Continuous Software Development (CSD)), we have observed that documentation is often poor. Objective: This work aims at collecting studies on documentation challenges, documentation practices, and tools that can support documentation in CSD. Method: A systematic mapping study was conducted to identify and analyze research on documentation in CSD, covering publications between 2001 and 2019. Results: A total of 63 studies were selected. We found 40 studies related to documentation practices and challenges, and 23 studies related to tools used in CSD. The challenges include: informal documentation is hard to understand, documentation is considered as waste, productivity is measured by working software only, documentation is out-of-sync with the software and there is a short-term focus. The practices include: non-written and informal communication, the usage of development artifacts for documentation, and the use of architecture frameworks. We also made an inventory of numerous tools that can be used for documentation purposes in CSD. Overall, we recommend the usage of executable documentation, modern tools and technologies to retrieve information and transform it into documentation, and the practice of minimal documentation upfront combined with detailed design for knowledge transfer afterwards. Conclusion: It is of paramount importance to increase the quantity and quality of documentation in CSD. While this remains challenging, practitioners will benefit from applying the identified practices and tools in order to mitigate the stated challenges
Genetic Improvement of Software: From Program Landscapes to the Automatic Improvement of a Live System
In today’s technology-driven society, software is becoming increasingly important in more
areas of our lives. The domain of software extends beyond the obvious domain of computers,
tablets, and mobile phones. Smart devices and the internet-of-things have inspired the
integration of digital and computational technology into objects that some of us would never
have guessed could be possible or even necessary. Examples include fridges and freezers
connected to social media sites, a toaster activated with a mobile phone, physical buttons for
shopping, and smart speakers that take a spoken order for a meal to be delivered. This is the
world we live in and it is an exciting time for software engineers and computer scientists.
The sheer volume of code that is currently in use has long since grown beyond the point of
any hope for proper manual maintenance. The rate at which mobile application stores such as
Google’s and Apple’s have expanded is astounding.
The research presented here aims to shed light on an emerging field of research called
Genetic Improvement (GI) of software. It is a methodology for changing program code to improve
existing software. This thesis details a framework for GI that is then applied to explore the
fitness landscape of bug fixing in Python software, to reduce execution time in a C++ program,
and is integrated into a live system.
We show that software is generally not fragile and, although fitness landscapes for GI are
flat, they are not impossible to search. This conclusion applies both to bug fixing in small
programs and to execution time improvements. The framework’s application is shown to
be transportable between programming languages with minimal effort. Additionally, it can be
easily integrated into a system that runs a live web service.
The work within this thesis was funded by EPSRC grant EP/J017515/1 through the DAASE
project
Exact analysis for requirements selection and optimisation
Requirements engineering is the prerequisite of software engineering, and plays a critically strategic role in the success of software development. Insufficient management of uncertainty in the requirements engineering process has been recognised as a key reason for software project failure. The essence of uncertainty may arise from partially observable, stochastic environments, or ignorance. To ease the impact of uncertainty in the software development process, it is important to provide techniques that explicitly manage uncertainty in requirements selection and optimisation. This thesis presents a decision support framework to exactly address the uncertainty in requirements selection and optimisation. Three types of uncertainty are managed. They are requirements uncertainty, algorithmic uncertainty, and uncertainty of resource constraints. Firstly, a probabilistic robust optimisation model is introduced to enable the manageability of requirements uncertainty. Requirements uncertainty is probabilistically simulated by Monte-Carlo Simulation and then formulated as one of the optimisation objectives. Secondly, a probabilistic uncertainty analysis and a quantitative analysis sub-framework METRO is designed to cater for requirements selection decision support under uncertainty. An exact Non-dominated Sorting Conflict Graph based Dynamic Programming algorithm lies at the heart of METRO to guarantee the elimination of algorithmic uncertainty and the discovery of guaranteed optimal solutions. Consequently, any information loss due to algorithmic uncertainty can be completely avoided. Moreover, a data analytic approach is integrated in METRO to help the decision maker to understand the remaining requirements uncertainty propagation throughout the requirements selection process, and to interpret the analysis results.
Finally, a more generic exact multi-objective integrated release and schedule planning approach iRASPA is introduced to holistically manage the uncertainty of resource constraints for requirements selection and optimisation. Software release and schedule plans are integrated into a single activity and solved simultaneously. Accordingly, a more advanced globally optimal result can be produced by accommodating and managing the inherent additional uncertainty due to resource constraints as well as that due to requirements. To settle the algorithmic uncertainty problem and guarantee the exactness of results, an ε-constraint Quadratic Programming approach is used in iRASPA
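The general idea of an ε-constraint formulation can be sketched on a toy requirements selection problem: one objective (value) is maximised while the other (cost) is bounded by ε, and ε is swept to trace the trade-off. This is not the thesis's Quadratic Programming formulation or the iRASPA tool. The requirement names, costs and values are invented, and exhaustive enumeration stands in for an exact solver, which is only practical for tiny instances.

```python
from itertools import combinations
from typing import Dict, Iterable, Tuple

# Hypothetical requirements: name -> (cost, value). Values are illustrative only.
REQS: Dict[str, Tuple[int, int]] = {"r1": (4, 10), "r2": (3, 7), "r3": (2, 4)}


def best_under_budget(eps: int) -> Tuple[int, Tuple[str, ...]]:
    """Exactly maximise total value subject to total cost <= eps.

    Every subset is enumerated, so the returned value is optimal.
    Returns (best value, chosen requirement names).
    """
    names = sorted(REQS)
    best: Tuple[int, Tuple[str, ...]] = (0, ())
    for k in range(len(names) + 1):
        for subset in combinations(names, k):
            cost = sum(REQS[n][0] for n in subset)
            value = sum(REQS[n][1] for n in subset)
            if cost <= eps and value > best[0]:
                best = (value, subset)
    return best


def epsilon_sweep(budgets: Iterable[int]) -> Dict[int, Tuple[int, Tuple[str, ...]]]:
    """Solve one constrained problem per epsilon to trace the cost/value trade-off."""
    return {eps: best_under_budget(eps) for eps in budgets}


for eps, (value, chosen) in epsilon_sweep([0, 5, 6, 9]).items():
    print(eps, value, chosen)
```

Because each sub-problem is solved exactly, the sweep introduces no approximation error of its own, which is the property the abstract refers to as avoiding algorithmic uncertainty.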
Many-Objective Optimization of Non-Functional Attributes based on Refactoring of Software Models
Software quality estimation is a challenging and time-consuming activity, and
models are crucial to face the complexity of such activity on modern software
applications. In this context, software refactoring is a crucial activity
within development life-cycles where requirements and functionalities rapidly
evolve. One main challenge is that the improvement of distinctive quality
attributes may require contrasting refactoring actions on software, as for
trade-off between performance and reliability (or other non-functional
attributes). In such cases, multi-objective optimization can provide the
designer with a wider view on these trade-offs and, consequently, can lead to
identify suitable refactoring actions that take into account independent or
even competing objectives. In this paper, we present an approach that exploits
NSGA-II as the genetic algorithm to search optimal Pareto frontiers for
software refactoring while considering many objectives. We consider performance
and reliability variations of a model alternative with respect to an initial
model, the amount of performance antipatterns detected on the model
alternative, and the architectural distance, which quantifies the effort to
obtain a model alternative from the initial one. We applied our approach on two
case studies: a Train Ticket Booking Service, and CoCoME. We observed that our
approach is able to improve performance (by up to 42%) while preserving or
even improving the reliability (by up to 32%) of generated model alternatives.
We also observed that there exists an order of preference of refactoring
actions among model alternatives. We can state that performance antipatterns
confirmed their ability to improve performance of a subject model in the
context of many-objective optimization. In addition, the metric that we adopted
for the architectural distance seems to be suitable for estimating the
refactoring effort.
Comment: Accepted for publication in Information and Software Technologies.
arXiv admin note: substantial text overlap with arXiv:2107.0612
Automated Software Transplantation
Automated program repair has excited researchers for more than a decade, yet it has still to find full scale deployment in industry. We report our experience with SAPFIX: the first deployment of automated end-to-end fault fixing, from test case design through to deployed repairs in production code. We have used SAPFIX at Facebook to repair 6 production systems, each consisting of tens of millions of lines of code, and which are collectively used by hundreds of millions of people worldwide. In its first three months of operation, SAPFIX produced 55 repair candidates for 57 crashes reported to SAPFIX, of which 27 have been deemed correct by developers and 14 have been landed into production automatically by SAPFIX. SAPFIX has thus demonstrated the potential of the search-based repair research agenda by deploying, to hundreds of millions of users worldwide, software systems that have been automatically tested and repaired. Automated software transplantation (autotransplantation) is a form of automated software engineering in which we use search based software engineering to automatically move a functionality of interest from a ‘donor’ program that implements it into a ‘host’ program that lacks it. Autotransplantation is a kind of automated program repair where we repair the ‘host’ program by augmenting it with the missing functionality. Automated software transplantation would open many exciting avenues for software development: suppose we could autotransplant code from one system into another, entirely unrelated, system, potentially written in a different programming language. Being able to do so might greatly enhance software engineering practice while reducing its costs. Automated software transplantation comes in two flavors: monolingual, when the languages of the host and donor programs are the same, and multilingual, when the languages differ.
This thesis introduces a theory of automated software transplantation, and two algorithms implemented in two tools that achieve this: µSCALPEL for monolingual software transplantation and τSCALPEL for multilingual software transplantation. Leveraging lightweight annotation, program analysis identifies an organ (interesting behavior to transplant); testing validates that the organ exhibits the desired behavior during its extraction and after its implantation into a host. We report encouraging results: in 14 of 17 monolingual transplantation experiments involving 6 donors and 4 hosts, popular real-world systems, we successfully autotransplanted 6 new functionalities; and in 10 out of 10 multilingual transplantation experiments involving 10 donors and 10 hosts, popular real-world systems written in 4 different programming languages, we successfully autotransplanted 10 new functionalities. That is, the transplants pass all the test suites that validate the behaviour of the new functionality and confirm that the behaviour of the initial program is preserved. Additionally, we have manually checked the behaviour exercised by the organ. Autotransplantation is also very useful: in just 26 hours of computation time we successfully autotransplanted the H.264 video encoding functionality from the x264 system to the VLC media player, a task that the developers of VLC have performed manually for the past 12 years. We autotransplanted call graph generation and indentation for C programs into Kate (a popular KDE based text editor used as an IDE by many C developers), two features currently missing from Kate but requested by its users. Autotransplantation is also efficient: the total runtime across 15 monolingual transplants is five and a half hours; the total runtime across 10 multilingual transplants is 33 hours