14 research outputs found
The evolution of complete software systems
This thesis tackles a series of problems related to the evolution of complete software systems, both in terms of the underlying Genetic Programming system and the application of that system. A new representation is presented that addresses some of the issues with other Genetic Program representations while keeping their advantages. This combines the easy reproduction of the linear representation with the inheritable characteristics of the tree representation by using fixed-length blocks of genes representing single program statements. This means that each block of genes will always map to the same statement in the parent and child unless it is mutated, irrespective of changes to the surrounding blocks. This method is compared to the variable length gene blocks used by other representations with a clear improvement in the similarity between parent and child. Traditionally, fitness functions have either been created as a selection of sample inputs with known outputs or as hand-crafted evaluation functions. A new method of creating fitness evaluation functions is introduced that takes the formal specification of the desired function as its basis. This approach ensures that the fitness function is complete and concise. The fitness functions created from formal specifications are compared to simple input/output pairs and the results show that the functions created from formal specifications perform significantly better. A set of list evaluation and manipulation functions was evolved as an application of the new Genetic Program components. These functions have the common feature that they all need to be 100% correct to be useful. Traditional Genetic Programming problems have mainly been optimization or approximation problems. The list results are good but do highlight the problem of scalability in that more complex functions lead to a dramatic increase in the required evolution time. Finally, the evolution of graphical user interfaces is addressed. The representation for the user interfaces is based on the new representation for programs. In this case each gene block represents a component of the user interface. The fitness of the interface is determined by comparing it to a series of constraints, which specify the layout, style and functionality requirements. A selection of web-based and desktop-based user interfaces were evolved. With these new approaches to Genetic Programming, the evolution of complete software systems is now a realistic goal.
EThOS - Electronic Theses Online Service
An improved representation for evolving programs
A representation has been developed that addresses some of the issues
with other Genetic Program representations while maintaining their advantages.
This combines the easy reproduction of the linear representation with the inheritable
characteristics of the tree representation by using fixed-length blocks of genes
representing single program statements. This means that each block of genes will
always map to the same statement in the parent and child unless it is mutated,
irrespective of changes to the surrounding blocks. This method is compared to the
variable length gene blocks used by other representations with a clear improvement
in the similarity between parent and child. In addition, a set of list evaluation and
manipulation functions was evolved as an application of the new Genetic Program
components. These functions have the common feature that they all need to be 100%
correct to be useful. Traditional Genetic Programming problems have mainly been
optimization or approximation problems. The list results are good but do highlight
the problem of scalability in that more complex functions lead to a dramatic increase
in the required evolution time.
Comparing content-filter techniques for stopping spam
There are many new theoretical techniques
for detecting spam e-mail based
upon the message contents. Although
Bayesian methods are the best known,
there are other approaches for
classifying information. This paper establishes
some criteria for measuring
spam filter effectiveness and compares the
Boosting and Support Vector Machine
approaches with some well-known existing
filter software. It also examines ways
of transforming e-mail messages into a
form which is more readily processable by
such algorithms.
Evolving readable Perl
A program is informally deemed readable, for the purpose
of this experiment, if it is easy for a person to
follow the steps that the program takes to solve the
problem. In this experiment, readability is achieved
by constraining the available syntax for generating solutions.
The Genetic Programming (GP) system created uses
the target language Perl because it is an interpreted,
untyped, robust procedural language which has good
error handling and recovery.
Evolving Perl
A list of requirements for a genetic programming
representation is put forward and a representation
separating the genotype and phenotype
with a linear genome is presented.
The target language for the genetic program
is Perl. The mapping process, between the
genotype and phenotype, converts blocks of
four genes into program statements. This
process is context-free and therefore provides
inheritable characteristics. The representation
is tested by evolving a selection of list
evaluation and manipulation functions which
are all evolved from the same language subset,
with good results.
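The block-based mapping described above can be sketched as follows. This is a minimal illustration only: the abstract states that blocks of four genes map to statements, but the statement templates, the variable count, and the function name `decode` are assumptions made here and are not taken from the paper.

```python
BLOCK = 4      # genes per statement, as stated in the abstract
NUM_VARS = 3   # assumed number of scalar variables ($v0..$v2)

# Hypothetical Perl statement templates; the real language subset is not given.
TEMPLATES = [
    "$v{a} = $v{b} + $v{c};",
    "$v{a} = $v{b} - $v{c};",
    "$v{a} = $v{b};",
    "$v{a}++;",
]

def decode(genome):
    """Map each complete block of four genes to one Perl statement.

    Each block is decoded using only its own genes, so the mapping is
    context-free: changing one block never alters how another decodes.
    """
    statements = []
    usable = len(genome) - len(genome) % BLOCK  # ignore a trailing partial block
    for i in range(0, usable, BLOCK):
        op, a, b, c = genome[i:i + BLOCK]
        template = TEMPLATES[op % len(TEMPLATES)]
        statements.append(
            template.format(a=a % NUM_VARS, b=b % NUM_VARS, c=c % NUM_VARS)
        )
    return statements

parent = [0, 0, 1, 2, 3, 1, 0, 0]
child = [1, 2, 2, 2, 3, 1, 0, 0]   # first block mutated, second untouched
print(decode(parent))
print(decode(child))
```

Because the second block is identical in parent and child, it decodes to the same statement in both, which is the inheritance property the abstract attributes to a context-free mapping.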
Evolving the user interface
A method is presented for evolving
Graphical User Interfaces using Genetic
Algorithms. The fitness evaluation is
based on a series of constraints, which
must be met by the user interface. Examples are used to demonstrate the use
of positional, style and functionality constraints and the final example shows the
evolution of a complete (although simple)
software application.
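A constraint-based fitness evaluation of this kind might look like the sketch below. The `Component` fields, the two constraint helpers, and the scoring rule (fraction of constraints satisfied) are all assumptions for illustration; the abstract does not say how constraints are encoded or weighted.

```python
from dataclasses import dataclass

@dataclass
class Component:
    # Hypothetical component description; the fields are assumed.
    name: str
    x: int
    y: int
    colour: str

def left_of(a, b):
    """Positional constraint: component a must lie to the left of b."""
    return lambda ui: ui[a].x < ui[b].x

def same_colour(a, b):
    """Style constraint: components a and b must share a colour."""
    return lambda ui: ui[a].colour == ui[b].colour

def fitness(components, constraints):
    """Return the fraction of constraints the interface satisfies (assumed scoring)."""
    ui = {c.name: c for c in components}
    satisfied = sum(1 for check in constraints if check(ui))
    return satisfied / len(constraints)

layout = [
    Component("label", x=0, y=0, colour="grey"),
    Component("field", x=10, y=0, colour="white"),
]
rules = [left_of("label", "field"), same_colour("label", "field")]
print(fitness(layout, rules))
```

Here the positional constraint holds and the style constraint does not, so this candidate interface scores 0.5 under the assumed scoring rule.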
Packet transmission optimisation using Genetic Algorithms
A Genetic Algorithm (GA) is used to optimise the parameters for a sequence of packets sent over the Internet. Only the parameters
that a client machine can change are used and the fitness is based on the
delay time returned by the Traceroute program. The GA performance is
compared to a fixed packet size with no priority used to assess the status
of the network. The GA generally performed to the same level as the
control settings but in some cases significant improvements were made.
Honey Plotter and the Web of Terror
Honeypots are a useful tool for discovering the
distribution of malicious traffic on the Internet and how that
traffic evolves over time. In addition, they allow an insight into
new attacks appearing. One major problem is analysing the large
amounts of data generated by such honeypots and correlating
between multiple honeypots. Honey Plotter is a web-based query
and visualisation tool to allow investigation into data gathered by
a distributed honeypot network. It is built on top of a relational
database, which allows great flexibility in the questions that can
be asked and has automatic generation of visualisations based on
the results of queries. The main focus is on aggregate statistics but
individual attacks can also be analysed. Statistical comparison of
distributions is also provided to assist with detecting anomalies
in the data; helping separate out common malicious traffic from
new threats and trends. Two short case studies are presented to
give an example of the types of analysis that can be performed.
Automating rolling stock diagramming and platform allocation
Rolling stock allocation is the process of assigning timetable schedules to physical train units. This is primarily done by connecting schedules together at their terminal locations (known as schedule associations). Platform allocation is the process of assigning those associations to particular platforms. A simple last-in, first-legal-out algorithm is used for rolling stock allocation; it performs comparably to the traditional manual approach but takes only a few seconds, as opposed to days or weeks in many manual cases. A simple stochastic hill-climbing approach is used for assigning associations to platforms to provide a conflict-free platform allocation within a few seconds. These two approaches are tested on real train planning problems with excellent results that would allow an expert to rapidly produce optimal or near-optimal solutions. The time saved by these approaches can be used by the train planner to try out various options or to check the robustness of the solutions more thoroughly.
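One possible reading of last-in, first-legal-out association at a single terminal is sketched below. The abstract does not define what makes an association legal, so the minimum turnaround time used here is an assumption, as are the function name `associate` and the `(schedule_id, minutes)` tuple layout.

```python
def associate(arrivals, departures, min_turnaround=5):
    """Pair arriving schedules with departing ones at one terminal.

    arrivals, departures: lists of (schedule_id, time_in_minutes).
    A pairing is treated as legal when the unit has been at the terminal
    for at least min_turnaround minutes (an assumed rule).
    """
    waiting = []        # units at the terminal, most recent arrival last
    associations = []
    pending = sorted(arrivals, key=lambda a: a[1])
    next_arrival = 0
    for dep_id, dep_time in sorted(departures, key=lambda d: d[1]):
        # Every unit that has arrived by this departure time joins the stack.
        while next_arrival < len(pending) and pending[next_arrival][1] <= dep_time:
            waiting.append(pending[next_arrival])
            next_arrival += 1
        # Scan from the most recent arrival down to the first legal unit.
        for j in range(len(waiting) - 1, -1, -1):
            arr_id, arr_time = waiting[j]
            if dep_time - arr_time >= min_turnaround:
                associations.append((arr_id, dep_id))
                del waiting[j]
                break
    return associations

arrivals = [("A1", 0), ("A2", 10)]
departures = [("D1", 12), ("D2", 30)]
print(associate(arrivals, departures))
```

In this example A2 is the last unit in but has had only two minutes at the terminal when D1 departs, so the scan falls through to A1; A2 is then available for D2.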
Train timetable generation using genetic algorithms
The scheduling of railway trains has been a research problem for many years. Many of the choices required are
not known a priori and require exploration of the problem to determine them. A modular Genetic system was
designed to make the evaluation function and preparation of the timetable tractable. The Genetic system consists
of a Genome, split into Chromosomes so the extra choices that become known throughout the evolution can be
added to the Chromosomes. A weighted fitness function and a multiobjective non-dominated fitness function
were tried, and then partial objective ranking was added. The system has tackled a mixture of problems and has
produced promising results.
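The non-dominated comparison underlying a multiobjective fitness function can be sketched generically as below. This is the standard Pareto dominance test, assuming every objective is minimised; it is not the paper's specific fitness function, and the objective pairs in the example are made up for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and better on at least one.

    Objectives are assumed to be minimised.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated(points):
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (total delay, conflicts) scores for four candidate timetables.
scores = [(1, 5), (2, 2), (3, 3), (5, 1)]
print(non_dominated(scores))
```

A weighted fitness function would collapse each tuple to a single number, whereas the non-dominated approach keeps every candidate that represents a distinct trade-off between the objectives.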