977 research outputs found
An Overview of Schema Theory
The purpose of this paper is to give an introduction to the field of Schema
Theory written by a mathematician and for mathematicians. In particular, we
endeavor to to highlight areas of the field which might be of interest to a
mathematician, to point out some related open problems, and to suggest some
large-scale projects. Schema theory seeks to give a theoretical justification
for the efficacy of the field of genetic algorithms, so readers who have
studied genetic algorithms stand to gain the most from this paper. However,
nothing beyond basic probability theory is assumed of the reader, and for this
reason we write in a fairly informal style.
Because the mathematics behind the theorems in schema theory is relatively
elementary, we focus more on the motivation and philosophy. Many of these
results have been proven elsewhere, so this paper is designed to serve a
primarily expository role. We attempt to cast known results in a new light,
which makes the suggested future directions natural. This involves devoting a
substantial amount of time to the history of the field.
We hope that this exposition will entice some mathematicians to do research
in this area, that it will serve as a road map for researchers new to the
field, and that it will help explain how schema theory developed. Furthermore,
we hope that the results collected in this document will serve as a useful
reference. Finally, as far as the author knows, the questions raised in the
final section are new.Comment: 27 pages. Originally written in 2009 and hosted on my website, I've
decided to put it on the arXiv as a more permanent home. The paper is
primarily expository, so I don't really know where to submit it, but perhaps
one day I will find an appropriate journa
An improved representation for evolving programs
A representation has been developed that addresses some of the issues
with other Genetic Program representations while maintaining their advantages.
This combines the easy reproduction of the linear representation with the inherita-
ble characteristics of the tree representation by using fixed-length blocks of genes
representing single program statements. This means that each block of genes will
always map to the same statement in the parent and child unless it is mutated,
irrespective of changes to the surrounding blocks. This method is compared to the
variable length gene blocks used by other representations with a clear improvement
in the similarity between parent and child. In addition, a set of list evaluation and
manipulation functions was evolved as an application of the new Genetic Program
components. These functions have the common feature that they all need to be 100%
correct to be useful. Traditional Genetic Programming problems have mainly been
optimization or approximation problems. The list results are good but do highlight
the problem of scalability in that more complex functions lead to a dramatic increase
in the required evolution time
An investigation into the structure of genomes within an evolution that uses embryogenesis
Evolutionary algorithms that use embryogenesis in the creation of individuals have several desirable qualities. Such algorithms are able to create complex, modular designs which can scale well to large problems. However, the inner workings of developmental algorithms have not been investigated as thoroughly as their direct-encoding counterparts. More precisely, it would be beneficial to look at how the rules used during embryogenesis evolve alongside the phenotypes they produced. This paper reports on such an investigation into the evolution of a rule set for the growth of an artificial neural network, and identifies several aspects that are desirable for the genomes of a developmental evolutionary algorithm
Dependency structure matrix, genetic algorithms, and effective recombination
In many different fields, researchers are often confronted by problems arising from complex systems. Simple heuristics or even enumeration works quite well on small and easy problems; however, to efficiently solve large and difficult problems, proper decomposition is the key. In this paper, investigating and analyzing interactions between components of complex systems shed some light on problem decomposition. By recognizing three bare-bones interactions-modularity, hierarchy, and overlap, facet-wise models arc developed to dissect and inspect problem decomposition in the context of genetic algorithms. The proposed genetic algorithm design utilizes a matrix representation of an interaction graph to analyze and explicitly decompose the problem. The results from this paper should benefit research both technically and scientifically. Technically, this paper develops an automated dependency structure matrix clustering technique and utilizes it to design a model-building genetic algorithm that learns and delivers the problem structure. Scientifically, the explicit interaction model describes the problem structure very well and helps researchers gain important insights through the explicitness of the procedure.This work was sponsored by Taiwan National Science Council under grant NSC97-
2218-E-002-020-MY3, U.S. Air Force Office of Scientific Research, Air Force Material
Command, USAF, under grants FA9550-06-1-0370 and FA9550-06-1-0096, U.S. National
Science Foundation under CAREER grant ECS-0547013, ITR grant DMR-03-25939 at
Materials Computation Center, grant ISS-02-09199 at US National Center for Supercomputing Applications, UIUC, and the Portuguese Foundation for Science and Technology
under grants SFRH/BD/16980/2004 and PTDC/EIA/67776/2006
Born to learn: The inspiration, progress, and future of evolved plastic artificial neural networks
Biological plastic neural networks are systems of extraordinary computational
capabilities shaped by evolution, development, and lifetime learning. The
interplay of these elements leads to the emergence of adaptive behavior and
intelligence. Inspired by such intricate natural phenomena, Evolved Plastic
Artificial Neural Networks (EPANNs) use simulated evolution in-silico to breed
plastic neural networks with a large variety of dynamics, architectures, and
plasticity rules: these artificial systems are composed of inputs, outputs, and
plastic components that change in response to experiences in an environment.
These systems may autonomously discover novel adaptive algorithms, and lead to
hypotheses on the emergence of biological adaptation. EPANNs have seen
considerable progress over the last two decades. Current scientific and
technological advances in artificial neural networks are now setting the
conditions for radically new approaches and results. In particular, the
limitations of hand-designed networks could be overcome by more flexible and
innovative solutions. This paper brings together a variety of inspiring ideas
that define the field of EPANNs. The main methods and results are reviewed.
Finally, new opportunities and developments are presented
A Neat Approach To Genetic Programming
The evolution of explicitly represented topologies such as graphs involves devising methods for mutating, comparing and combining structures in meaningful ways and identifying and maintaining the necessary topological diversity. Research has been conducted in the area of the evolution of trees in genetic programming and of neural networks and some of these problems have been addressed independently by the different research communities. In the domain of neural networks, NEAT (Neuroevolution of Augmenting Topologies) has shown to be a successful method for evolving increasingly complex networks. This system\u27s success is based on three interrelated elements: speciation, marking of historical information in topologies, and initializing search in a small structures search space. This provides the dynamics necessary for the exploration of diverse solution spaces at once and a way to discriminate between different structures. Although different representations have emerged in the area of genetic programming, the study of the tree representation has remained of interest in great part because of its mapping to programming languages and also because of the observed phenomenon of unnecessary code growth or bloat which hinders performance. The structural similarity between trees and neural networks poses an interesting question: Is it possible to apply the techniques from NEAT to the evolution of trees and if so, how does it affect performance and the dynamics of code growth? In this work we address these questions and present analogous techniques to those in NEAT for genetic programming
Recommended from our members
Interpreting Deep Learning for cell differentiation. Supervised and Unsupervised models viewed through the lens of information and perturbation theory.
"Predicting the future isn't magic, it's artificial intelligence" Dave Waters.
In the last decades there has been an unprecedented growth in the field of machine learning, and particularly within deep learning models. The combination of big data and computational power has nurtured the evolution of a variety of new methods to predict and interpret future scenarios. These data centric models can achieve exceptional performances on specific tasks, with their prediction boundaries continuously expanding towards new and more complex challenges.
However, the model complexity often translates into a lack of interpretability from a scientific c perspective, it is not trivial to identify the factors involved in final outcomes.
Explainability may not always be a requirement for some machine learning tasks, specially when it comes in detriment of performance power. But for some applications, such as biological discoveries or medical diagnostics, understanding the output and determining factors that influence decisions is essential.
In this thesis we develop both a supervised and unsupervised approach to map from genotype to phenotype. We emphasise the importance of interpretability and feature extraction from the models, by identifying relevant genes for cell differentiation. We then continue to explore the rules and mechanisms behind the models from a theoretical perspective. Using information theory to explain the learning process and applying
perturbation theory to transform the results into a generalisable representation.
We start by building a supervised approach to mapping cell profiles from genotype to phenotype, using single cell RNA-Seq data. We leverage non-linearities among gene expressions to identify cellular levels of differentiation. The ambiguity and even absence of labels in most biological studies instigated the development of novel unsupervised techniques, leading to a new general and biologically interpretable framework based on Variational Autoencoders.
The application and validation of the methods has proven to be successful, but questions regarding the learning process and generative nature of the results remained unanswered. I use information theory to define a new approach to interpret training and the converged solutions of our models.
The variational and generative nature of Autoencoders provides a platform to develop general models. Their results should extrapolate and allow generalisation beyond the boundaries of the observed data. To this extent, we introduce for the first time a new interpretation of the embedded generative functions through Perturbation Theory. The embedding multiplicity is addressed by transforming the distributions into a new set of generalisable functions, while characterising their energy spectrum
under a particular energy landscape.
We outline the combination of theoretical and machine learning based methods, for moving towards interpretable and generalisable models. Developing a theoretical framework to map from genotype to phenotype, we provide both supervised and unsupervised tools to operate over single cell RNA-Seq. data. We have generated a pipeline to identify relevant genes and cell types through Variational Autoencoders (VAEs),
validating reconstructed gene expressions to prove the generative performance of the embeddings. The new interpretation of the information learned and extracted by the models de fines a label independent evaluation, particularly useful for unsupervised
learning. Lastly, we introduce a novel transformation of the generative embeddings based on quantum and perturbation theory.
Our contributions can and have been extended to new datasets, according to the nature of the tasks being explored. For instance, the combination of unsupervised learning and information theory can be applied to a variety of biological or medical data. We have trained several VAE models with additional cancer and metabolic data, proving to extract meaningful representations of the data. The perturbation theory transformation of the embedding can also lead to future research on the generative potential of Variational Autoencoders through a physics perspective, combining statistical and quantum mechanics.
We believe that machine learning will only continue its fast expansion and growth through the development of more generalisable more interpretable models.
"Prediction is very difficult, especially if it's about the future" Niels Boh
- …