Search CORE

15 research outputs found

Genetic Improvement for Code Obfuscation

Author: Petke J
Publication venue: Genetic and Evolutionary Computation Conference (GECCO)
Publication date: 01/01/2016
Field of study

Genetic improvement (GI) is a relatively new area of software engineering and thus the extent of its applicability is yet to be explored. Although a growing interest in GI in recent years started with the work on automatic bug fixing, the area flourished when results on optimisation of non-functional software properties, such as efficiency and energy consumption, were published. Further success of GI in transplanting functionality from one program to another leads to a question: what other software engineering areas can benefit from the use of genetic improvement techniques? We propose to utilise GI for code obfuscation

UCL Discovery

Genetic Improvement of Software: a Comprehensive Survey

Author: Haraldsson S
Harman M
langdon W
Petke J
White D
Woodward J
Publication venue
Publication date: 01/06/2018
Field of study

Genetic improvement (GI) uses automated search to find improved versions of existing software. We present a comprehensive survey of this nascent field of research with a focus on the core papers in the area published between 1995 and 2015. We identified core publications including empirical studies, 96% of which use evolutionary algorithms (genetic programming in particular). Although we can trace the foundations of GI back to the origins of computer science itself, our analysis reveals a significant upsurge in activity since 2012. GI has resulted in dramatic performance improvements for a diverse set of properties such as execution time, energy and memory consumption, as well as results for fixing and extending existing system functionality. Moreover, we present examples of research work that lies on the boundary between GI and other areas, such as program transformation, approximate computing, and software repair, with the intention of encouraging further exchange of ideas between researchers in these fields

UCL Discovery

Visualising the Search Landscape of the Triangle Program

Author: CM Reidys
DE Goldberg
F Daolio
G Ochoa
JH Holland
M Harman
Y Jia
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017
Field of study

High order mutation analysis of a software engineering benchmark, including schema and local optima networks, suggests program improvements may not be as hard to find as is often assumed. 1) Bit-wise genetic building blocks are not deceptive and can lead to all global optima. 2) There are many neutral networks, plateaux and local optima, nevertheless in most cases near the human written C source code there are hill climbing routes including neutral moves to solutions

Crossref

Stirling Online Research Repository (RIOXX)

UCL Discovery

Stirling Online Research Repository

Automatic control program creation using concurrent Evolutionary Computing

Author: Hart John K.
Publication venue
Publication date
Field of study

Over the past decade, Genetic Programming (GP) has been the subject of a significant amount of research, but this has resulted in the solution of few complex real -world problems. In this work, I propose that, for some relatively simple, non safety -critical embedded control applications, GP can be used as a practical alternative to software developed by humans. Embedded control software has become a branch of software engineering with distinct temporal, interface and resource constraints and requirements. This results in a characteristic software structure, and by examining this, the effective decomposition of an overall problem into a number of smaller, simpler problems is performed. It is this type of problem amelioration that is suggested as a method whereby certain real -world problems may be rendered into a soluble form suitable for GP. In the course of this research, the body of published GP literature was examined and the most important changes to the original GP technique of Koza are noted; particular focus is made upon GP techniques involving an element of concurrency -which is central to this work. This search highlighted few applications of GP for the creation of software for complex, real -world problems -this was especially true in the case of multi thread, multi output solutions. To demonstrate this Idea, a concurrent Linear GP (LGP) system was built that creates a multiple input -multiple output solution using a custom low -level evolutionary language set, combining both continuous and Boolean data types. The system uses a multi -tasking model to evolve and execute the required LGP code for each system output using separate populations: Two example problems -a simple fridge controller and a more complex washing machine controller are described, and the problems encountered and overcome during the successful solution of these problems, are detailed. The operation of the complete, evolved washing machine controller is simulated using a graphical LabVIEWapplication. The aim of this research is to propose a general purpose system for the automatic creation of control software for use in a range of problems from the target problem class -without requiring any system tuning: In order to assess the system search performance sensitivity, experiments were performed using various population and LGP string sizes; the experimental data collected was also used to examine the utility of abandoning stalled searches and restarting. This work is significant because it identifies a realistic application of GP that can ease the burden of finite human software design resources, whilst capitalising on accelerating computing potential

Bournemouth University Research Online

Genetic Improvement of Software: From Program Landscapes to the Automatic Improvement of a Live System

Author: Haraldsson Saemundur Oskar
Publication venue: University of Stirling
Publication date: 31/05/2017
Field of study

In today’s technology driven society, software is becoming increasingly important in more areas of our lives. The domain of software extends beyond the obvious domain of computers, tablets, and mobile phones. Smart devices and the internet-of-things have inspired the integra- tion of digital and computational technology into objects that some of us would never have guessed could be possible or even necessary. Fridges and freezers connected to social media sites, a toaster activated with a mobile phone, physical buttons for shopping, and verbally asking smart speakers to order a meal to be delivered. This is the world we live in and it is an exciting time for software engineers and computer scientists. The sheer volume of code that is currently in use has long since outgrown beyond the point of any hope for proper manual maintenance. The rate of which mobile application stores such as Google’s and Apple’s have expanded is astounding. The research presented here aims to shed a light on an emerging field of research, called Genetic Improvement ( GI ) of software. It is a methodology to change program code to improve existing software. This thesis details a framework for GI that is then applied to explore fitness landscape of bug fixing Python software, reduce execution time in a C ++ program, and integrated into a live system. We show that software is generally not fragile and although fitness landscapes for GI are flat they are not impossible to search in. This conclusion applies equally to bug fixing in small programs as well as execution time improvements. The framework’s application is shown to be transportable between programming languages with minimal effort. Additionally, it can be easily integrated into a system that runs a live web service. The work within this thesis was funded by EPSRC grant EP/J017515/1 through the DAASE project

Stirling Online Research Repository

Machine learning in compilers

Author: Leather Hugh
Publication venue: The University of Edinburgh
Publication date: 01/01/2011
Field of study

Tuning a compiler so that it produces optimised code is a difficult task because modern processors are complicated; they have a large number of components operating in parallel and each is sensitive to the behaviour of the others. Building analytical models on which optimisation heuristics can be based has become harder as processor complexity increased and this trend is bound to continue as the world moves towards further heterogeneous parallelism. Compiler writers need to spend months to get a heuristic right for any particular architecture and these days compilers often support a wide range of disparate devices. Whenever a new processor comes out, even if derived from a previous one, the compiler’s heuristics will need to be retuned for it. This is, typically, too much effort and so, in fact, most compilers are out of date. Machine learning has been shown to help; by running example programs, compiled in different ways, and observing how those ways effect program run-time, automatic machine learning tools can predict good settings with which to compile new, as yet unseen programs. The field is nascent, but has demonstrated significant results already and promises a day when compilers will be tuned for new hardware without the need for months of compiler experts’ time. Many hurdles still remain, however, and while experts no longer have to worry about the details of heuristic parameters, they must spend their time on the details of the machine learning process instead to get the full benefits of the approach. This thesis aims to remove some of the aspects of machine learning based compilers for which human experts are still required, paving the way for a completely automatic, retuning compiler. First, we tackle the most conspicuous area of human involvement; feature generation. In all previous machine learning works for compilers, the features, which describe the important aspects of each example to the machine learning tools, must be constructed by an expert. Should that expert choose features poorly, they will miss crucial information without which the machine learning algorithm can never excel. We show that not only can we automatically derive good features, but that these features out perform those of human experts. We demonstrate our approach on loop unrolling, and find we do better than previous work, obtaining XXX% of the available performance, more than the XXX% of previous state of the art. Next, we demonstrate a new method to efficiently capture the raw data needed for machine learning tasks. The iterative compilation on which machine learning in compilers depends is typically time consuming, often requiring months of compute time. The underlying processes are also noisy, so that most prior works fall into two categories; those which attempt to gather clean data by executing a large number of times and those which ignore the statistical validity of their data to keep experiment times feasible. Our approach, on the other hand guarantees clean data while adapting to the experiment at hand, needing an order of magnitude less work that prior techniques

Edinburgh Research Archive

Genetic Improvement of Software (Dagstuhl Seminar 18052)

Author: Forrest S
Langdon WB
Le Goues C
Petke J
Publication venue: Dagstuhl Publishing
Publication date: 01/01/2018
Field of study

We document the program and the immediate outcomes of Dagstuhl Seminar 18052 “Genetic Improvement of Software”. The seminar brought together researchers in Genetic Improvement (GI) and related areas of software engineering to investigate what is achievable with current technology and the current impediments to progress and how GI can affect the software development process. Several talks covered the state-of-the-art and work in progress. Seven emergent topics have been identified ranging from the nature of the GI search space through benchmarking and practical applications. The seminar has already resulted in multiple research paper publications. Four by participants of the seminar will be presented at the GI workshop co-located with the top conference in software engineering - ICSE. Several researchers started new collaborations, results of which we hope to see in the near future

UCL Discovery

Dagstuhl Research Online Publication Server

Otimização para reduzir o tamanho de código-fonte JavaScript

Author: Farzat Fabio de Almeida
Publication venue: 'Programa de Pos-graduacao em Ciencias Contabeis da UFRJ'
Publication date: 01/12/2018
Field of study

JavaScript is one of the most used programming languages for front-end development of Web application. The increase in complexity of front-end features brings concerns about performance, especially the load and execution time of JavaScript code. To reduce the size of JavaScript programs and, therefore, the time required to load and execute these programs in the front-end of Web applications. To characterize the variants of JavaScript programs and use this information to build a search procedure that scans such variants for smaller implementations that pass all test cases. We applied this procedure to 19 JavaScript programs varying from 92 to 15,602 LOC and observed reductions from 0.2% to 73.8% of the original code, as well as a relationship between the quality of a program’s test suite and the ability to reduce its size.Esta Tese aborda o problema de otimização de tempo de carga de software, especificamente software escrito na linguagem de programação JavaScript, uma linguagem interpretada, baseada em objetos e amplamente utilizada no desenvolvimento de aplicativos e sistemas para a internet. Estudos experimentais foram projetados para avaliar a hipótese de que técnicas heurísticas já aplicadas com sucesso em linguagens orientadas a objeto poderiam ter resultados positivos na redução do tempo de carga de programas escritos em JavaScript. Para tanto, um ferramental que permitisse observar a aplicação de heurísticas selecionadas em programas JavaScript foi construído e executado em um ambiente de computação de alto desempenho. Os resultados dos estudos preliminares foram utilizados para criar um procedimento de busca que varre o código JavaScript criando variantes do programa que sejam menores e passem em todos os casos de teste do programa original. Aplicamos este procedimento a 19 programas JavaScript, variando de 92 a 15.602 linhas de código, e observamos reduções de 0,2% a 73,8% do código original, bem como uma relação entre a qualidade do conjunto de casos de testes e a capacidade de reduzir o tamanho dos programas

Pantheon

Automated Software Transplantation

Author: Marginean A
Publication venue: UCL (University College London)
Publication date: 28/11/2021
Field of study

Automated program repair has excited researchers for more than a decade, yet it has yet to find full scale deployment in industry. We report our experience with SAPFIX: the first deployment of automated end-to-end fault fixing, from test case design through to deployed repairs in production code. We have used SAPFIX at Facebook to repair 6 production systems, each consisting of tens of millions of lines of code, and which are collectively used by hundreds of millions of people worldwide. In its first three months of operation, SAPFIX produced 55 repair candidates for 57 crashes reported to SAPFIX, of which 27 have been deem as correct by developers and 14 have been landed into production automatically by SAPFIX. SAPFIX has thus demonstrated the potential of the search-based repair research agenda by deploying, to hundreds of millions of users worldwide, software systems that have been automatically tested and repaired. Automated software transplantation (autotransplantation) is a form of automated software engineering, where we use search based software engineering to be able to automatically move a functionality of interest from a ‘donor‘ program that implements it into a ‘host‘ program that lacks it. Autotransplantation is a kind of automated program repair where we repair the ‘host‘ program by augmenting it with the missing functionality. Automated software transplantation would open many exciting avenues for software development: suppose we could autotransplant code from one system into another, entirely unrelated, system, potentially written in a different programming language. Being able to do so might greatly enhance the software engineering practice, while reducing the costs. Automated software transplantation manifests in two different flavors: monolingual, when the languages of the host and donor programs is the same, or multilingual when the languages differ. This thesis introduces a theory of automated software transplantation, and two algorithms implemented in two tools that achieve this: µSCALPEL for monolingual software transplantation and τSCALPEL for multilingual software transplantation. Leveraging lightweight annotation, program analysis identifies an organ (interesting behavior to transplant); testing validates that the organ exhibits the desired behavior during its extraction and after its implantation into a host. We report encouraging results: in 14 of 17 monolingual transplantation experiments involving 6 donors and 4 hosts, popular real-world systems, we successfully autotransplanted 6 new functionalities; and in 10 out of 10 multlingual transplantation experiments involving 10 donors and 10 hosts, popular real-world systems written in 4 different programming languages, we successfully autotransplanted 10 new functionalities. That is, we have passed all the test suites that validates the new functionalities behaviour and the fact that the initial program behaviour is preserved. Additionally, we have manually checked the behaviour exercised by the organ. Autotransplantation is also very useful: in just 26 hours computation time we successfully autotransplanted the H.264 video encoding functionality from the x264 system to the VLC media player, a task that is currently done manually by the developers of VLC, since 12 years ago. We autotransplanted call graph generation and indentation for C programs into Kate, (a popular KDE based test editor used as an IDE by a lot of C developers) two features currently missing from Kate, but requested by the users of Kate. Autotransplantation is also efficient: the total runtime across 15 monolingual transplants is 5 hours and a half; the total runtime across 10 multilingual transplants is 33 hours

UCL Discovery

The Blind Software Engineer: Improving the Non-Functional Properties of Software by Means of Genetic Improvement

Author: Bruce Bobby R.
Publication venue: UCL (University College London)
Publication date: 28/07/2018
Field of study

UCL Discovery