Problem Understanding through Landscape Theory
In order to understand the structure of a problem, we need to measure some of its features. Examples of measures suggested in the past are autocorrelation and fitness-distance correlation. Landscape theory, developed in recent years in the field of combinatorial optimization, provides mathematical expressions to efficiently compute statistics on optimization problems. In this paper we discuss how landscape theory from combinatorial optimization can be used in the context of problem understanding and present two software tools that can be used to efficiently compute the mentioned measures. Ministerio de Economía y Competitividad (TIN2011-28194
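The two measures named in this abstract can be illustrated with a small empirical sketch. This is not the closed-form computation that landscape theory provides, and it is not one of the paper's software tools; it only estimates the statistics by sampling. The OneMax fitness function, the one-bit-flip random walk, and all function names are assumptions chosen for illustration.

```python
import random


def onemax(bits):
    """Toy fitness function (an assumption): number of ones in the bit string."""
    return sum(bits)


def random_walk_fitness(n_bits, steps, rng):
    """Fitness values along a random walk that flips one random bit per step."""
    x = [rng.randint(0, 1) for _ in range(n_bits)]
    series = [onemax(x)]
    for _ in range(steps):
        x[rng.randrange(n_bits)] ^= 1
        series.append(onemax(x))
    return series


def autocorrelation(series, lag):
    """Empirical autocorrelation of a fitness series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((f - mean) ** 2 for f in series) / n
    if var == 0:
        return 0.0
    cov = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(n - lag)) / (n - lag)
    return cov / var


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def fitness_distance_correlation(samples, optimum, fitness):
    """Correlation between fitness and Hamming distance to a known optimum."""
    fits = [fitness(s) for s in samples]
    dists = [sum(a != b for a, b in zip(s, optimum)) for s in samples]
    return pearson(fits, dists)


rng = random.Random(0)
walk = random_walk_fitness(n_bits=20, steps=5000, rng=rng)
r1 = autocorrelation(walk, 1)
r5 = autocorrelation(walk, 5)
samples = [[rng.randint(0, 1) for _ in range(20)] for _ in range(200)]
fdc = fitness_distance_correlation(samples, [1] * 20, onemax)
print(r1, r5, fdc)
```

For a maximization problem such as this toy one, a fitness-distance correlation close to -1 means fitness rises steadily as solutions approach the optimum, and a slowly decaying autocorrelation indicates a smooth landscape.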
Fitness landscape of the cellular automata majority problem: View from the Olympus
In this paper we study cellular automata (CAs) that perform the computational
Majority task. This task is a good example of what the phenomenon of emergence
in complex systems is. We take an interest in the reasons that make this
particular fitness landscape a difficult one. The first goal is to study the
landscape as such, and thus it is ideally independent from the actual
heuristics used to search the space. However, a second goal is to understand
the features a good search technique for this particular problem space should
possess. We statistically quantify in various ways the degree of difficulty of
searching this landscape. Due to neutrality, investigations based on sampling
techniques on the whole landscape are difficult to conduct. So, we go exploring
the landscape from the top. Although it has been proved that no CA can perform
the task perfectly, several efficient CAs for this task have been found.
Exploiting similarities between these CAs and symmetries in the landscape, we
define the Olympus landscape, which is regarded as the "heavenly home" of the
best local optima known (blok). Then we measure several properties of this
subspace. Although it is easier to find relevant CAs in this subspace than in
the overall landscape, there are structural reasons that prevent a searcher
from finding overfitted CAs in the Olympus. Finally, we study dynamics and
performance of genetic algorithms on the Olympus in order to confirm our
analysis and to find efficient CAs for the Majority problem with low
computational cost.
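As a rough sketch of how a candidate CA might be scored on the Majority task, the code below estimates the fraction of random initial configurations that a rule drives to the correct uniform state. The lattice size of 149 cells, the radius of 3, the uniform sampling of initial configurations, the step limit, and the function names are all assumptions made for illustration; the abstract does not specify them. The local-majority rule used as the example is a simple baseline, not one of the best known CAs.

```python
import random


def step(config, rule, radius):
    """Apply one synchronous CA update on a ring of cells."""
    n = len(config)
    nxt = []
    for i in range(n):
        idx = 0
        for k in range(-radius, radius + 1):
            idx = (idx << 1) | config[(i + k) % n]  # neighbourhood as an integer
        nxt.append(rule[idx])
    return nxt


def classifies_correctly(config, rule, radius, max_steps):
    """True if the CA ends in the uniform state matching the initial majority."""
    target = 1 if 2 * sum(config) > len(config) else 0
    for _ in range(max_steps):
        nxt = step(config, rule, radius)
        if nxt == config:  # fixed point reached
            break
        config = nxt
    return all(c == target for c in config)


def majority_fitness(rule, radius, n_cells, n_tests, rng):
    """Fraction of random initial configurations classified correctly."""
    correct = 0
    for _ in range(n_tests):
        config = [rng.randint(0, 1) for _ in range(n_cells)]
        correct += classifies_correctly(config, rule, radius, 2 * n_cells)
    return correct / n_tests


radius = 3
# Baseline rule (an assumption): each cell adopts its local neighbourhood majority.
local_majority = [1 if bin(idx).count("1") > radius else 0
                  for idx in range(2 ** (2 * radius + 1))]
rng = random.Random(1)
fitness = majority_fitness(local_majority, radius, n_cells=149, n_tests=20, rng=rng)
print(fitness)
```

An odd number of cells is used so that the initial configuration always has a strict majority. The rule table here has 128 entries, so the search space of rules has 2^128 points, which gives a sense of why sampling the whole landscape is hard.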
A Study of Generalization and Fitness Landscapes for Neuroevolution
Rodrigues, N. M., Silva, S., & Vanneschi, L. (2020). A Study of Generalization and Fitness Landscapes for Neuroevolution. IEEE Access, 8, 108216-108234. [9113453]. https://doi.org/10.1109/ACCESS.2020.3001505
Fitness landscapes are a useful concept for studying the dynamics of meta-heuristics. In the last two decades, they have been successfully used for estimating the optimization capabilities of different flavors of evolutionary algorithms, including genetic algorithms and genetic programming. However, so far they have not been used for studying the performance of machine learning algorithms on unseen data, and they have not been applied to studying neuroevolution landscapes. This paper fills these gaps by applying fitness landscapes to neuroevolution, and using this concept to infer useful information about the learning and generalization ability of the machine learning method. For this task, we use a grammar-based approach to generate convolutional neural networks, and we study the dynamics of three different mutations used to evolve them. To characterize fitness landscapes, we study autocorrelation, entropic measure of ruggedness, and fitness clouds. Also, we propose the use of two additional evaluation measures: density clouds and overfitting measure. The results show that these measures are appropriate for estimating both the learning and the generalization ability of the considered neuroevolution configurations.
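The entropic measure of ruggedness mentioned above can be sketched as follows. The abstract does not give a formula, so this uses one common formulation as an assumption: consecutive fitness differences along a walk are mapped to symbols in {-1, 0, 1} using a sensitivity threshold eps, and the entropy (base 6) is taken over consecutive symbol pairs whose two symbols differ. The function names and the example series are hypothetical.

```python
import math
from collections import Counter


def symbol_string(series, eps):
    """Map consecutive fitness differences to -1, 0 or 1 using threshold eps."""
    symbols = []
    for a, b in zip(series, series[1:]):
        diff = b - a
        symbols.append(1 if diff > eps else -1 if diff < -eps else 0)
    return symbols


def entropic_ruggedness(series, eps):
    """Entropy (base 6) over consecutive symbol pairs (p, q) with p != q."""
    symbols = symbol_string(series, eps)
    pairs = list(zip(symbols, symbols[1:]))
    if not pairs:
        return 0.0
    counts = Counter(pairs)
    h = 0.0
    for (p, q), count in counts.items():
        if p != q:  # only pairs that indicate a change of direction or slope
            prob = count / len(pairs)
            h -= prob * math.log(prob, 6)
    return h


zigzag = [0, 1, 0, 1, 0, 1]   # hypothetical fitness values along a walk
flat = [3, 3, 3, 3, 3]
print(entropic_ruggedness(zigzag, eps=0.0))
print(entropic_ruggedness(flat, eps=0.0))
```

Base 6 is used because there are six possible pairs of unequal symbols, which bounds the measure by 1. Raising eps makes the measure ignore small fitness changes, so it is usually reported for several values of eps.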
Exploring neuroevolution fitness landscapes for optimization and generalization
Master's thesis, Informatics Engineering (Interaction and Knowledge), Universidade de Lisboa, Faculdade de Ciências, 2020
Fitness landscapes are a useful and widely investigated concept for studying the dynamics of meta-heuristics. In the last two decades, they have been successfully used for estimating the optimization capabilities of different flavors of evolutionary algorithms, including genetic algorithms and genetic programming. However, so far they have not been used for studying the performance of Machine Learning (ML) algorithms on unseen data, and they have not been applied to study neuroevolution landscapes. Coincidentally, despite having existed for almost three decades and still being a dynamic and rapidly growing research field, neuroevolution still lacks theoretical and methodological foundations, which could be provided by the application of fitness landscapes.
This thesis aims to fill these gaps by applying fitness landscapes to neuroevolution, using this concept to infer useful information about the learning and generalization ability of the ML method. For this task, we developed and used a grammar-based neuroevolution approach to generate convolutional neural networks, and studied the dynamics of three different mutation operators used to evolve multiple aspects of the networks. To characterize fitness landscapes, we studied autocorrelation, entropic measure of ruggedness, fitness clouds, gradient measures and negative slope coefficient, while also discussing why other measures such as local optima networks and fitness distance correlation, despite not being used, could provide interesting results. Also, we propose the use of two additional evaluation measures: density clouds, a new measure developed in this thesis that can provide visual information regarding the distribution of samples, and overfitting measure, which is derived from a measure used in genetic programming. The results show that the measures used are appropriate and produce accurate results when estimating both the learning capability and the generalization ability of the considered neuroevolution configurations.
Mixed integer programming and adaptive problem solver learned by landscape analysis for clinical laboratory scheduling
This paper attempts to derive a mathematical formulation for real-practice
clinical laboratory scheduling, and to present an adaptive problem solver by
leveraging landscape structures. After formulating scheduling of medical tests
as a distributed scheduling problem in heterogeneous, flexible job shop
environment, we establish a mixed integer programming model to minimize mean
test turnaround time. Preliminary landscape analysis indicates that these
clinics-orientated scheduling instances are difficult to solve. The search
difficulty motivates the design of an adaptive problem solver to reduce
repetitive algorithm-tuning work, but with guaranteed convergence. Yet, under
a given search strategy, the relationship between exploitation competence and
landscape topology is not transparent. Under strategies that impose different-magnitude
perturbations, we investigate changes in landscape structure and find that
disturbance amplitude, local-global optima connectivity, landscape's ruggedness
and plateau size fairly predict strategies' efficacy. Medium-size instances of
100 tasks are easier under smaller-perturbation strategies that lead to
smoother landscapes with smaller plateaus. For large-size instances of 200-500
tasks, extant strategies at hand, having either larger or smaller
perturbations, face more rugged landscapes with larger plateaus that impede
search. Our hypothesis that medium perturbations may generate smoother
landscapes with smaller plateaus drives our design of this new strategy and its
verification by experiments. Composite neighborhoods managed by meta-Lamarckian
learning show above-average performance, implying reliability when prior
knowledge of the landscape is unavailable.
Understanding Phase Transitions with Local Optima Networks: Number Partitioning as a Case Study
Phase transitions play an important role in understanding search difficulty in combinatorial optimisation. However, previous attempts have not revealed a clear link between fitness landscape properties and the phase transition. We explore whether the global landscape structure of the number partitioning problem changes with the phase transition. Using the local optima network model, we analyse a number of instances before, during, and after the phase transition. We compute relevant network and neutrality metrics; and importantly, identify and visualise the funnel structure with an approach (monotonic sequences) inspired by theoretical chemistry. While most metrics remain oblivious to the phase transition, our results reveal that the funnel structure clearly changes. Easy instances feature a single or a small number of dominant funnels leading to global optima; hard instances have a large number of suboptimal funnels attracting the search. Our study brings new insights and tools to the study of phase transitions in combinatorial optimisation.
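The nodes of a local optima network are the local optima of the landscape, each with a basin of attraction. The sketch below enumerates them exhaustively for a tiny number partitioning instance. It is not the paper's method: the one-flip neighbourhood, the best-improvement hill climber, the example numbers, and the function names are assumptions, and the edges and monotonic-sequence funnels described in the abstract are not constructed here.

```python
import itertools
from collections import Counter


def cost(signs, numbers):
    """Absolute difference between the two subset sums (signs are +1 or -1)."""
    return abs(sum(s * a for s, a in zip(signs, numbers)))


def best_improving_neighbour(signs, numbers):
    """Best strictly improving one-flip neighbour, or None at a local optimum."""
    best, best_cost = None, cost(signs, numbers)
    for i in range(len(signs)):
        neighbour = signs[:i] + (-signs[i],) + signs[i + 1:]
        c = cost(neighbour, numbers)
        if c < best_cost:
            best, best_cost = neighbour, c
    return best


def hill_climb(signs, numbers):
    """Follow best-improvement moves until no one-flip move improves the cost."""
    while True:
        neighbour = best_improving_neighbour(signs, numbers)
        if neighbour is None:
            return signs
        signs = neighbour


def basin_sizes(numbers):
    """Map each local optimum to the number of solutions that climb to it."""
    basins = Counter()
    for signs in itertools.product((-1, 1), repeat=len(numbers)):
        basins[hill_climb(signs, numbers)] += 1
    return basins


numbers = [7, 5, 4, 3, 2, 2, 1, 1]  # hypothetical small instance
basins = basin_sizes(numbers)
for optimum, size in sorted(basins.items(), key=lambda kv: -kv[1])[:5]:
    print(cost(optimum, numbers), size)
```

Exhaustive enumeration only works for very small instances, since the number of sign assignments doubles with each added number; larger instances would need sampling.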