Analysing Symbolic Regression Benchmarks under a Meta-Learning Approach
The definition of a concise and effective testbed for Genetic Programming
(GP) is a recurrent matter in the research community. This paper takes a new
step in this direction, proposing a different approach to measure the quality
of the symbolic regression benchmarks quantitatively. The proposed approach is
based on meta-learning and uses a set of dataset meta-features---such as the
number of examples or output skewness---to describe the datasets. Our idea is
to correlate these meta-features with the errors obtained by a GP method. These
meta-features define a space of benchmarks that should, ideally, have datasets
(points) covering different regions of the space. An initial analysis of 63
datasets showed that current benchmarks are concentrated in a small region of
this benchmark space. We also found that the number of instances and output
skewness are the most relevant meta-features to GP output error. Both
conclusions can help define which datasets should compose an effective testbed
for symbolic regression methods.
Comment: 8 pages, 3 Figures, Proceedings of Genetic and Evolutionary
Computation Conference Companion, Kyoto, Japan
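The two meta-features named in this abstract, and their correlation with a GP method's error, can be illustrated with a minimal sketch. The function names below (`skewness`, `meta_features`, `pearson`) are hypothetical, the skewness is the plain population (Fisher-Pearson) coefficient, and Pearson correlation is only one assumed way of relating a meta-feature to error; the abstract does not state which measures the authors used.

```python
import math


def skewness(values):
    """Population (Fisher-Pearson) skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        return 0.0  # constant output: treat skewness as zero
    return m3 / m2 ** 1.5


def meta_features(outputs):
    """Describe one dataset by the two meta-features named in the abstract."""
    return {
        "n_examples": len(outputs),
        "output_skewness": skewness(outputs),
    }


def pearson(xs, ys):
    """Pearson correlation between a meta-feature and per-dataset errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # assumption: report no correlation for a constant series
    return cov / (norm_x * norm_y)
```

Given one `meta_features` dictionary and one measured error per dataset, calling `pearson` on the list of `output_skewness` values and the list of errors gives a rough indication of how strongly that meta-feature tracks error.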
Inheritance-Based Diversity Measures for Explicit Convergence Control in Evolutionary Algorithms
Diversity is an important factor in evolutionary algorithms to prevent
premature convergence towards a single local optimum. Various means of
maintaining diversity throughout the process of evolution exist in the
literature. We analyze approaches to diversity that (a) have an explicit and
quantifiable influence on fitness at the individual level and (b) require no
(or very little) additional domain knowledge such as domain-specific distance
functions. We also introduce the concept of genealogical diversity in a broader
study. We show that employing these approaches can help evolutionary algorithms
for global optimization in many cases.
Comment: GECCO '18: Genetic and Evolutionary Computation Conference, 2018,
Kyoto, Japan
Combating catastrophic forgetting with developmental compression
Generally intelligent agents exhibit successful behavior across problems in
several settings. Endemic in approaches to realize such intelligence in
machines is catastrophic forgetting: sequential learning corrupts knowledge
obtained earlier in the sequence, or tasks antagonistically compete for system
resources. Methods for obviating catastrophic forgetting have sought to
identify and preserve features of the system necessary to solve one problem
when learning to solve another, or to enforce modularity such that minimally
overlapping sub-functions contain task specific knowledge. While successful,
both approaches scale poorly because they require larger architectures as the
number of training instances grows, causing different parts of the system to
specialize for separate subsets of the data. Here we present a method for
addressing catastrophic forgetting called developmental compression. It
exploits the mild impacts of developmental mutations to lessen adverse changes
to previously-evolved capabilities and `compresses' specialized neural networks
into a generalized one. In the absence of domain knowledge, developmental
compression produces systems that avoid overt specialization, alleviating the
need to engineer a bespoke system for every task permutation and suggesting
better scalability than existing approaches. We validate this method on a robot
control problem and hope to extend this approach to other machine learning
domains in the future.
The Automated Design of Probabilistic Selection Methods for Evolutionary Algorithms
Selection functions enable Evolutionary Algorithms (EAs) to apply selection pressure to a population of individuals by regulating the probability that an individual's genes survive, typically based on fitness. Various conventional fitness-based selection methods exist, each providing a unique relationship between the fitnesses of individuals in a population and their chances of selection. However, the full space of selection algorithms is limited only by the maximum algorithm size, and each possible selection algorithm is optimal for some EA configuration applied to a particular problem class. Therefore, improved performance may be expected by tuning an EA's selection algorithm to the problem at hand, rather than employing a conventional selection method. The objective of this paper is to investigate the extent to which performance can be improved by tuning selection algorithms, employing a hyper-heuristic to explore the space of search algorithms which encode the relationships between the fitnesses of individuals and their probability of selection. We show the improved performance obtained versus conventional selection functions on fixed instances from a benchmark problem class, including separate testing instances to show generalization of the improved performance.
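The idea of a selection function as a mapping from fitness to selection probability can be sketched as follows. This is an illustration only: `selection_probabilities`, `select`, and the `weight_fn` parameter are hypothetical names, and the sketch does not reproduce the representation that the paper's hyper-heuristic searches over. Passing the identity function as `weight_fn` gives conventional fitness-proportional selection; other functions give other fitness-to-probability relationships.

```python
import random


def selection_probabilities(fitnesses, weight_fn):
    """Map each fitness to a selection probability via weight_fn, then normalize."""
    weights = [weight_fn(f) for f in fitnesses]
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return [w / total for w in weights]


def select(population, fitnesses, weight_fn, rng):
    """Draw one individual according to the induced probabilities."""
    probs = selection_probabilities(fitnesses, weight_fn)
    return rng.choices(population, weights=probs, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    population = ["a", "b", "c"]
    fitnesses = [1.0, 2.0, 3.0]
    # Fitness-proportional selection: weight equals fitness.
    print(selection_probabilities(fitnesses, lambda f: f))
    # A sharper relationship: weight equals fitness squared.
    print(selection_probabilities(fitnesses, lambda f: f ** 2))
    print(select(population, fitnesses, lambda f: f, rng))
```

Swapping `weight_fn` changes the selection pressure without touching the rest of the EA, which is the kind of tuning the abstract describes at a much larger scale.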
Temporal Feature Selection with Symbolic Regression
Building and discovering useful features when constructing machine learning models is the central task for the machine learning practitioner. Good features are useful not only in increasing the predictive power of a model but also in illuminating the underlying drivers of a target variable. In this research we propose a novel feature learning technique in which symbolic regression is endowed with a "Range Terminal" that allows it to explore functions of the aggregate of variables over time. We test the Range Terminal on a synthetic data set and a real-world data set in which we predict seasonal greenness using satellite-derived temperature and snow data over a portion of the Arctic. On the synthetic data set we find that symbolic regression with the Range Terminal outperforms standard symbolic regression and Lasso regression. On the Arctic data set we find that it outperforms standard symbolic regression and fails to beat Lasso regression, but finds useful features describing the interaction between Land Surface Temperature, Snow, and seasonal vegetative growth in the Arctic.
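A terminal that aggregates a variable over a time range might look like the sketch below. The abstract does not define the Range Terminal precisely, so the function name `range_terminal`, its half-open `[start, end)` indexing, and the choice of aggregate are all assumptions made for illustration.

```python
def range_terminal(series, start, end, aggregate=sum):
    """Aggregate one variable's time series over the half-open window [start, end).

    Hypothetical sketch: in a symbolic regression tree, start and end would be
    evolved alongside the rest of the expression.
    """
    if not 0 <= start < end <= len(series):
        raise ValueError("window must satisfy 0 <= start < end <= len(series)")
    return aggregate(series[start:end])


def mean(values):
    """Arithmetic mean, usable as an alternative aggregate."""
    return sum(values) / len(values)


if __name__ == "__main__":
    # Made-up daily temperatures, for demonstration only.
    temperature = [-3.0, -1.0, 0.5, 2.0, 4.5, 6.0]
    print(range_terminal(temperature, 2, 5))        # sum over days 2..4
    print(range_terminal(temperature, 2, 5, mean))  # mean over days 2..4
```

A feature such as "mean temperature over a given window" then becomes a single leaf that the regression can combine with other terms.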
Evolution of Network Enumeration Strategies in Emulated Computer Networks
Successful attacks on computer networks today do not often owe their victory to directly overcoming strong security measures set up by the defender. Rather, most attacks succeed because the number of possible vulnerabilities is too large for humans to fully protect without making a mistake. Regardless of the security elsewhere, a skilled attacker can exploit a single vulnerability in a defensive system and negate the benefits of those security measures. This paper presents an evolutionary framework for evolving attacker agents in a real, emulated network environment using genetic programming, as a foundation for coevolutionary systems which can automatically discover and mitigate network security flaws. We examine network enumeration, an initial network reconnaissance step, through our framework and present results demonstrating its success, indicating a broader applicability to further cyber-security tasks.
A Black-Box Discrete Optimization Benchmarking (BB-DOB) Pipeline Survey: Taxonomy, Evaluation, and Ranking
This paper provides a taxonomical identification survey of classes in discrete optimization challenges that can be found in the literature, including a proposed pipeline for benchmarking, inspired by previous computational optimization competitions. Thereby, a Black-Box Discrete Optimization Benchmarking (BB-DOB) perspective is presented for the BB-DOB@GECCO Workshop. The paper motivates why certain classes, together with their properties (such as deception and separability, or a toy-problem label), should be included in the perspective. Moreover, it discusses guidelines on how to select significant instances within these classes, the design of the experimental setup, performance measures, and presentation methods and formats.
A Distributed Epigenetic Shape Formation and Regeneration Algorithm for a Swarm of Robots
Living cells exhibit both growth and regeneration of body tissues. Epigenetic
Tracking (ET) models the growth and regenerative qualities of living cells
and has been used to generate complex 2D and 3D shapes. In this paper, we
present an ET based algorithm that aids a swarm of identically-programmed
robots to form arbitrary shapes and regenerate them when cut. The algorithm
works in a distributed manner using only local interactions and computations
without any central control and aids the robots to form the shape in a
triangular lattice structure. In case of damage or splitting of the shape, it
helps each set of the remaining robots to regenerate and position themselves to
build scaled-down versions of the original shape. The paper presents the shapes
formed and regenerated by the algorithm using the Kilombo simulator.
Comment: 8 pages, 9 figures, GECCO-18 conference
Automated Design of Network Security Metrics
Many abstract security measurements are based on characteristics of a graph that represents the network. These are typically simple and quick to compute but are often of little practical use in making real-world predictions. Practical network security is often measured using simulation or real-world exercises. These approaches better represent realistic outcomes but can be costly and time-consuming. This work aims to combine the strengths of these two approaches, developing efficient heuristics that accurately predict attack success. Hyper-heuristic machine learning techniques, trained on network attack simulation data, are used to produce novel graph-based security metrics. These low-cost metrics serve as an approximation for simulation when measuring network security in real time. The approach is tested and verified using a simulation based on activity from an actual large enterprise network. The results demonstrate the potential of using hyper-heuristic techniques to rapidly evolve and react to emerging cybersecurity threats.