Semantic variation operators for multidimensional genetic programming
Multidimensional genetic programming represents candidate solutions as sets
of programs, and thereby provides an interesting framework for exploiting
building block identification. Towards this goal, we investigate the use of
machine learning as a way to bias which components of programs are promoted,
and propose two semantic operators to choose where useful building blocks are
placed during crossover. A forward stagewise crossover operator we propose
leads to significant improvements on a set of regression problems, and produces
state-of-the-art results in a large benchmark study. We discuss this
architecture and others in terms of their propensity for allowing heuristic
search to utilize information during the evolutionary process. Finally, we look
at the collinearity and complexity of the data representations that result from
these architectures, with a view towards disentangling factors of variation in
application. Comment: 9 pages, 8 figures, GECCO 201
Genetic programming approaches to learning fair classifiers
Society has come to rely on algorithms like classifiers for important
decision making, giving rise to the need for ethical guarantees such as
fairness. Fairness is typically defined by asking that some statistic of a
classifier be approximately equal over protected groups within a population. In
this paper, current approaches to fairness are discussed and used to motivate
algorithmic proposals that incorporate fairness into genetic programming for
classification. We propose two ideas. The first is to incorporate a fairness
objective into multi-objective optimization. The second is to adapt lexicase
selection to define cases dynamically over intersections of protected groups.
We describe why lexicase selection is well suited to pressure models to perform
well across the potentially infinitely many subgroups over which fairness is
desired. We use a recent genetic programming approach to construct models on
four datasets for which fairness constraints are necessary, and empirically
compare performance to prior methods utilizing game-theoretic solutions.
Methods are assessed based on their ability to generate trade-offs of subgroup
fairness and accuracy that are Pareto optimal. The results show that genetic
programming methods in general, and random search in particular, are well
suited to this task. Comment: 9 pages, 7 figures. GECCO 202
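The notion of fairness described in this abstract, that some statistic of a classifier be approximately equal over protected groups, can be illustrated with a minimal sketch. The choice of false positive rate as the statistic, and the function names `false_positive_rates` and `fairness_gap`, are assumptions made for illustration; they are not taken from the paper.

```python
from collections import defaultdict


def false_positive_rates(y_true, y_pred, groups):
    """Return the false positive rate of a binary classifier per group.

    Assumption: labels and predictions are 0/1, and `groups` holds one
    protected-group identifier per sample.
    """
    false_pos = defaultdict(int)
    negatives = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 0:
            negatives[group] += 1
            false_pos[group] += int(pred == 1)
    # Only groups with at least one negative sample have a defined rate.
    return {g: false_pos[g] / negatives[g] for g in negatives}


def fairness_gap(rates):
    """Largest difference in the statistic between any two groups."""
    return max(rates.values()) - min(rates.values())


y_true = [0, 0, 0, 0, 1, 1]
y_pred = [1, 0, 0, 0, 1, 0]
groups = ["a", "a", "b", "b", "a", "b"]
rates = false_positive_rates(y_true, y_pred, groups)
gap = fairness_gap(rates)
```

A gap near zero would indicate that the chosen statistic is approximately equal across groups; a fairness objective in multi-objective optimization could, under these assumptions, minimize such a gap alongside classification error.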
A Black-Box Discrete Optimization Benchmarking (BB-DOB) Pipeline Survey: Taxonomy, Evaluation, and Ranking
This paper surveys and classifies the discrete optimization problem classes found in the literature, and proposes a benchmarking pipeline inspired by previous computational optimization competitions. It thereby presents a Black-Box Discrete Optimization Benchmarking (BB-DOB) perspective for the BB-DOB@GECCO Workshop. The paper motivates why certain classes, together with their properties (such as deception, separability, or a toy-problem label), should be included in this perspective. It also discusses guidelines for selecting significant instances within these classes, the design of the experimental setup, performance measures, and presentation methods and formats.
Lexicase selection in Learning Classifier Systems
The lexicase parent selection method selects parents by considering
performance on individual data points in random order instead of using a
fitness function based on an aggregated data accuracy. While the method has
demonstrated promise in genetic programming and more recently in genetic
algorithms, its applications in other forms of evolutionary machine learning
have not been explored. In this paper, we investigate the use of lexicase
parent selection in Learning Classifier Systems (LCS) and study its effect on
classification problems in a supervised setting. We further introduce a new
variant of lexicase selection, called batch-lexicase selection, which allows
for the tuning of selection pressure. We compare the two lexicase selection
methods with tournament and fitness proportionate selection methods on binary
classification problems. We show that batch-lexicase selection results in the
creation of more generic rules, which is favorable for generalization on future
data. We further show that batch-lexicase selection results in better
generalization in situations of partial or missing data. Comment: Genetic and Evolutionary Computation Conference, 201
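The lexicase parent selection described in this abstract, which considers performance on individual data points in random order rather than an aggregated fitness value, can be sketched as follows. This is a minimal illustration of standard lexicase selection only, not the batch-lexicase variant and not the authors' implementation; the function name `lexicase_select` and the error-matrix layout are assumptions.

```python
import random


def lexicase_select(errors, rng=random):
    """Select one parent index by lexicase selection.

    Assumption: errors[i][j] is the error of individual i on case j,
    with lower values being better.
    """
    candidates = list(range(len(errors)))
    cases = list(range(len(errors[0])))
    rng.shuffle(cases)  # consider the cases in a fresh random order
    for case in cases:
        # Keep only the candidates with the best error on this case.
        best = min(errors[i][case] for i in candidates)
        candidates = [i for i in candidates if errors[i][case] == best]
        if len(candidates) == 1:
            break
    # Any remaining ties are broken at random.
    return rng.choice(candidates)


# Individual 0 is best on case 0, individual 1 on case 1,
# and individual 2 is never best on any case.
errors = [[0, 1], [1, 0], [1, 1]]
rng = random.Random(0)
picks = {lexicase_select(errors, rng) for _ in range(200)}
```

In this example the two specialists are both selectable, depending on which case is drawn first, even though each performs poorly on the other case; errors on different cases are never added together or compared.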
General Program Synthesis from Examples Using Genetic Programming with Parent Selection Based on Random Lexicographic Orderings of Test Cases
Software developers routinely create tests before writing code, to ensure that their programs fulfill their requirements. Instead of having human programmers write the code to meet these tests, automatic program synthesis systems can create programs to meet specifications without human intervention, only requiring examples of desired behavior. In the long term, we envision using genetic programming to synthesize large pieces of software. This dissertation takes steps toward this goal by investigating the ability of genetic programming to solve introductory computer science programming problems.
We present a suite of 29 benchmark problems intended to test general program synthesis systems, which we systematically selected from sources of introductory computer science programming problems. This suite is suitable for experiments with any program synthesis system driven by input/output examples. Unlike existing benchmarks that concentrate on constrained problem domains such as list manipulation, symbolic regression, or boolean functions, this suite contains general programming problems that require a range of programming constructs, such as multiple data types and data structures, control flow statements, and I/O. The problems encompass a range of difficulties and requirements as necessary to thoroughly assess the capabilities of a program synthesis system. Besides describing the specifications for each problem, we make recommendations for experimental protocols and statistical methods to use with the problems.
This dissertation's second contribution is an investigation of behavior-based parent selection in genetic programming, concentrating on a new method called lexicase selection. Most parent selection techniques aggregate errors from test cases to compute a single scalar fitness value; lexicase selection instead treats test cases separately, never comparing error values of different test cases. This property allows it to select parents that specialize on some test cases even if they perform poorly on others. We compare lexicase selection to other parent selection techniques on our benchmark suite, showing better performance for lexicase selection. After observing that lexicase selection increases exploration of the search space while also increasing exploitation of promising programs, we conduct a range of experiments to identify which characteristics of lexicase selection influence its utility
How Fast Can We Play Tetris Greedily With Rectangular Pieces?
Consider a variant of Tetris played on a board of width and infinite
height, where the pieces are axis-aligned rectangles of arbitrary integer
dimensions, the pieces can only be moved before letting them drop, and a row
does not disappear once it is full. Suppose we want to follow a greedy
strategy: let each rectangle fall where it will end up the lowest given the
current state of the board. To do so, we want a data structure which can always
suggest a greedy move. In other words, we want a data structure which maintains
a set of rectangles, supports queries which return where to drop the
rectangle, and updates which insert a rectangle dropped at a certain position
and return the height of the highest point in the updated set of rectangles. We
show via a reduction to the Multiphase problem [Pătraşcu, 2010] that on
a board of width , if the OMv conjecture [Henzinger et al., 2015]
is true, then both operations cannot be supported in time
simultaneously. The reduction also implies polynomial bounds from the 3-SUM
conjecture and the APSP conjecture. On the other hand, we show that there is a
data structure supporting both operations in time on
boards of width , matching the lower bound up to a factor. Comment: Correction of typos and other minor correction
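The greedy strategy and the two operations described in this abstract (a query returning where to drop a rectangle, and an update that inserts it and returns the highest point) can be illustrated with a naive baseline. This sketch scans every position and makes no attempt to reach the running times discussed in the paper; the class name `GreedyBoard`, the column-height representation, and the leftmost tie-breaking rule are all assumptions made for illustration.

```python
class GreedyBoard:
    """Naive column-height model of the board (illustrative baseline only)."""

    def __init__(self, width):
        # heights[c] is the current height of the stack in column c.
        self.heights = [0] * width

    def query(self, rect_width):
        """Return the leftmost column where the rectangle would rest lowest.

        Returns None if the rectangle is wider than the board.
        """
        best_x, best_base = None, None
        for x in range(len(self.heights) - rect_width + 1):
            # The rectangle rests on the tallest column beneath it.
            base = max(self.heights[x:x + rect_width])
            if best_base is None or base < best_base:
                best_x, best_base = x, base
        return best_x

    def update(self, x, rect_width, rect_height):
        """Drop a rectangle at column x and return the new maximum height."""
        top = max(self.heights[x:x + rect_width]) + rect_height
        for column in range(x, x + rect_width):
            self.heights[column] = top
        return max(self.heights)


board = GreedyBoard(4)
first_x = board.query(2)
first_top = board.update(first_x, 2, 3)
second_x = board.query(2)
second_top = board.update(second_x, 2, 1)
```

Because rows never disappear in this variant, tracking only the top of each column is enough to decide where a dropped rectangle comes to rest.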