6,828 research outputs found

    Runtime Analysis of the (1+(λ,λ))(1+(\lambda,\lambda)) Genetic Algorithm on Random Satisfiable 3-CNF Formulas

    Full text link
    The (1+(λ,λ))(1+(\lambda,\lambda)) genetic algorithm, first proposed at GECCO 2013, showed a surprisingly good performance on so me optimization problems. The theoretical analysis so far was restricted to the OneMax test function, where this GA profited from the perfect fitness-distance correlation. In this work, we conduct a rigorous runtime analysis of this GA on random 3-SAT instances in the planted solution model having at least logarithmic average degree, which are known to have a weaker fitness distance correlation. We prove that this GA with fixed not too large population size again obtains runtimes better than Θ(nlogn)\Theta(n \log n), which is a lower bound for most evolutionary algorithms on pseudo-Boolean problems with unique optimum. However, the self-adjusting version of the GA risks reaching population sizes at which the intermediate selection of the GA, due to the weaker fitness-distance correlation, is not able to distinguish a profitable offspring from others. We show that this problem can be overcome by equipping the self-adjusting GA with an upper limit for the population size. Apart from sparse instances, this limit can be chosen in a way that the asymptotic performance does not worsen compared to the idealistic OneMax case. Overall, this work shows that the (1+(λ,λ))(1+(\lambda,\lambda)) GA can provably have a good performance on combinatorial search and optimization problems also in the presence of a weaker fitness-distance correlation.Comment: An extended abstract of this report will appear in the proceedings of the 2017 Genetic and Evolutionary Computation Conference (GECCO 2017

    Discrete Algorithms for Analysis of Genotype Data

    Get PDF
    Accessibility of high-throughput genotyping technology makes possible genome-wide association studies for common complex diseases. When dealing with common diseases, it is necessary to search and analyze multiple independent causes resulted from interactions of multiple genes scattered over the entire genome. The optimization formulations for searching disease-associated risk/resistant factors and predicting disease susceptibility for given case-control study have been introduced. Several discrete methods for disease association search exploiting greedy strategy and topological properties of case-control studies have been developed. New disease susceptibility prediction methods based on the developed search methods have been validated on datasets from case-control studies for several common diseases. Our experiments compare favorably the proposed algorithms with the existing association search and susceptibility prediction methods

    Unbiased Black-Box Complexities of Jump Functions

    Full text link
    We analyze the unbiased black-box complexity of jump functions with small, medium, and large sizes of the fitness plateau surrounding the optimal solution. Among other results, we show that when the jump size is (1/2ε)n(1/2 - \varepsilon)n, that is, only a small constant fraction of the fitness values is visible, then the unbiased black-box complexities for arities 33 and higher are of the same order as those for the simple \textsc{OneMax} function. Even for the extreme jump function, in which all but the two fitness values n/2n/2 and nn are blanked out, polynomial-time mutation-based (i.e., unary unbiased) black-box optimization algorithms exist. This is quite surprising given that for the extreme jump function almost the whole search space (all but a Θ(n1/2)\Theta(n^{-1/2}) fraction) is a plateau of constant fitness. To prove these results, we introduce new tools for the analysis of unbiased black-box complexities, for example, selecting the new parent individual not by comparing the fitnesses of the competing search points, but also by taking into account the (empirical) expected fitnesses of their offspring.Comment: This paper is based on results presented in the conference versions [GECCO 2011] and [GECCO 2014

    Designing labeled graph classifiers by exploiting the R\'enyi entropy of the dissimilarity representation

    Full text link
    Representing patterns as labeled graphs is becoming increasingly common in the broad field of computational intelligence. Accordingly, a wide repertoire of pattern recognition tools, such as classifiers and knowledge discovery procedures, are nowadays available and tested for various datasets of labeled graphs. However, the design of effective learning procedures operating in the space of labeled graphs is still a challenging problem, especially from the computational complexity viewpoint. In this paper, we present a major improvement of a general-purpose classifier for graphs, which is conceived on an interplay between dissimilarity representation, clustering, information-theoretic techniques, and evolutionary optimization algorithms. The improvement focuses on a specific key subroutine devised to compress the input data. We prove different theorems which are fundamental to the setting of the parameters controlling such a compression operation. We demonstrate the effectiveness of the resulting classifier by benchmarking the developed variants on well-known datasets of labeled graphs, considering as distinct performance indicators the classification accuracy, computing time, and parsimony in terms of structural complexity of the synthesized classification models. The results show state-of-the-art standards in terms of test set accuracy and a considerable speed-up for what concerns the computing time.Comment: Revised versio
    corecore