6,828 research outputs found
Recommended from our members
Combinatorial optimization and metaheuristics
Today, combinatorial optimization is one of the youngest and most active areas of discrete mathematics. It is a branch of optimization in applied mathematics and computer science, related to operational research, algorithm theory and computational complexity theory. It sits at the intersection of several fields, including artificial intelligence, mathematics and software engineering. Its increasing interest arises for the fact that a large number of scientific and industrial problems can be formulated as abstract combinatorial optimization problems, through graphs and/or (integer) linear programs. Some of these problems have polynomial-time (“efficient”) algorithms, while most of them are NP-hard, i.e. it is not proved that they can be solved in polynomial-time. Mainly, it means that it is not possible to guarantee that an exact solution to the problem can be found and one has to settle for an approximate solution with known performance guarantees. Indeed, the goal of approximate methods is to find “quickly” (reasonable run-times), with “high” probability, provable “good” solutions (low error from the real optimal solution). In the last 20 years, a new kind of algorithm commonly called metaheuristics have emerged in this class, which basically try to combine heuristics in high level frameworks aimed at efficiently and effectively exploring the search space. This report briefly outlines the components, concepts, advantages and disadvantages of different metaheuristic approaches from a conceptual point of view, in order to analyze their similarities and differences. The two very significant forces of intensification and diversification, that mainly determine the behavior of a metaheuristic, will be pointed out. The report concludes by exploring the importance of hybridization and integration methods
Runtime Analysis of the Genetic Algorithm on Random Satisfiable 3-CNF Formulas
The genetic algorithm, first proposed at GECCO 2013,
showed a surprisingly good performance on so me optimization problems. The
theoretical analysis so far was restricted to the OneMax test function, where
this GA profited from the perfect fitness-distance correlation. In this work,
we conduct a rigorous runtime analysis of this GA on random 3-SAT instances in
the planted solution model having at least logarithmic average degree, which
are known to have a weaker fitness distance correlation.
We prove that this GA with fixed not too large population size again obtains
runtimes better than , which is a lower bound for most
evolutionary algorithms on pseudo-Boolean problems with unique optimum.
However, the self-adjusting version of the GA risks reaching population sizes
at which the intermediate selection of the GA, due to the weaker
fitness-distance correlation, is not able to distinguish a profitable offspring
from others. We show that this problem can be overcome by equipping the
self-adjusting GA with an upper limit for the population size. Apart from
sparse instances, this limit can be chosen in a way that the asymptotic
performance does not worsen compared to the idealistic OneMax case. Overall,
this work shows that the GA can provably have a good
performance on combinatorial search and optimization problems also in the
presence of a weaker fitness-distance correlation.Comment: An extended abstract of this report will appear in the proceedings of
the 2017 Genetic and Evolutionary Computation Conference (GECCO 2017
Discrete Algorithms for Analysis of Genotype Data
Accessibility of high-throughput genotyping technology makes possible genome-wide association studies for common complex diseases. When dealing with common diseases, it is necessary to search and analyze multiple independent causes resulted from interactions of multiple genes scattered over the entire genome. The optimization formulations for searching disease-associated risk/resistant factors and predicting disease susceptibility for given case-control study have been introduced. Several discrete methods for disease association search exploiting greedy strategy and topological properties of case-control studies have been developed. New disease susceptibility prediction methods based on the developed search methods have been validated on datasets from case-control studies for several common diseases. Our experiments compare favorably the proposed algorithms with the existing association search and susceptibility prediction methods
Unbiased Black-Box Complexities of Jump Functions
We analyze the unbiased black-box complexity of jump functions with small,
medium, and large sizes of the fitness plateau surrounding the optimal
solution.
Among other results, we show that when the jump size is , that is, only a small constant fraction of the fitness values
is visible, then the unbiased black-box complexities for arities and higher
are of the same order as those for the simple \textsc{OneMax} function. Even
for the extreme jump function, in which all but the two fitness values
and are blanked out, polynomial-time mutation-based (i.e., unary unbiased)
black-box optimization algorithms exist. This is quite surprising given that
for the extreme jump function almost the whole search space (all but a
fraction) is a plateau of constant fitness.
To prove these results, we introduce new tools for the analysis of unbiased
black-box complexities, for example, selecting the new parent individual not by
comparing the fitnesses of the competing search points, but also by taking into
account the (empirical) expected fitnesses of their offspring.Comment: This paper is based on results presented in the conference versions
[GECCO 2011] and [GECCO 2014
Designing labeled graph classifiers by exploiting the R\'enyi entropy of the dissimilarity representation
Representing patterns as labeled graphs is becoming increasingly common in
the broad field of computational intelligence. Accordingly, a wide repertoire
of pattern recognition tools, such as classifiers and knowledge discovery
procedures, are nowadays available and tested for various datasets of labeled
graphs. However, the design of effective learning procedures operating in the
space of labeled graphs is still a challenging problem, especially from the
computational complexity viewpoint. In this paper, we present a major
improvement of a general-purpose classifier for graphs, which is conceived on
an interplay between dissimilarity representation, clustering,
information-theoretic techniques, and evolutionary optimization algorithms. The
improvement focuses on a specific key subroutine devised to compress the input
data. We prove different theorems which are fundamental to the setting of the
parameters controlling such a compression operation. We demonstrate the
effectiveness of the resulting classifier by benchmarking the developed
variants on well-known datasets of labeled graphs, considering as distinct
performance indicators the classification accuracy, computing time, and
parsimony in terms of structural complexity of the synthesized classification
models. The results show state-of-the-art standards in terms of test set
accuracy and a considerable speed-up for what concerns the computing time.Comment: Revised versio
- …