
    Feature Grouping-based Feature Selection


    Complexity Theory for Discrete Black-Box Optimization Heuristics

A predominant topic in the theory of evolutionary algorithms and, more generally, the theory of randomized black-box optimization techniques is running time analysis. Running time analysis aims at understanding the performance of a given heuristic on a given problem by bounding the number of function evaluations that the heuristic needs to identify a solution of a desired quality. As in general algorithms theory, this running time perspective is most useful when it is complemented by a meaningful complexity theory that studies the limits of algorithmic solutions. In the context of discrete black-box optimization, several black-box complexity models have been developed to analyze the best possible performance that a black-box optimization algorithm can achieve on a given problem. The models differ in the classes of algorithms to which their lower bounds apply. In this way, black-box complexity contributes to a better understanding of how certain algorithmic choices (such as the amount of memory used by a heuristic, its selective pressure, or properties of the strategies that it uses to create new solution candidates) influence performance. In this chapter we review the different black-box complexity models that have been proposed in the literature, survey the bounds that have been obtained for these models, and discuss how the interplay of running time analysis and black-box complexity can inspire new algorithmic solutions to well-researched problems in evolutionary computation. We also discuss several interesting open questions for future work. Comment: This survey article is to appear (in a slightly modified form) in the book "Theory of Randomized Search Heuristics in Discrete Search Spaces", to be published by Springer in 2018. The book is edited by Benjamin Doerr and Frank Neumann. Missing numbers of pointers to other chapters of this book will be added as soon as possible.
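To make the quantity that running time analysis bounds concrete, here is a minimal sketch (not taken from the chapter) of a (1+1) evolutionary algorithm optimizing the OneMax benchmark while counting black-box function evaluations; the benchmark, the problem size, and the stopping criterion are illustrative assumptions.

```python
# A minimal sketch, assuming the OneMax benchmark and a (1+1) EA:
# the returned value is the number of black-box function evaluations,
# i.e. the quantity that running time analysis bounds.
import random

def one_max(bits):
    """Fitness = number of ones; the black box reveals only this value."""
    return sum(bits)

def one_plus_one_ea(n=100, seed=0):
    rng = random.Random(seed)
    parent = [rng.randint(0, 1) for _ in range(n)]
    evaluations = 1
    best = one_max(parent)
    while best < n:
        # Standard bit mutation: flip each bit independently with probability 1/n.
        child = [b ^ (rng.random() < 1.0 / n) for b in parent]
        value = one_max(child)
        evaluations += 1
        if value >= best:  # elitist selection
            parent, best = child, value
    return evaluations

print(one_plus_one_ea(n=100))
```

For comparison, the expected runtime of the (1+1) EA on OneMax is Theta(n log n) evaluations, whereas the unrestricted black-box complexity of OneMax is only Theta(n / log n), which illustrates the kind of gap these models expose.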

    Annual Report of Undergraduate Research Fellows, August 2011 to May 2012

    Annual Report of Undergraduate Research Fellows from August 2011 to May 2012

    Text Similarity Between Concepts Extracted from Source Code and Documentation

Context: Constant evolution in software systems often results in their documentation losing sync with the content of the source code. The traceability research field has long aimed to recover links between code and documentation when the two fall out of sync. Objective: The aim of this paper is to compare the concepts contained within the source code of a system with those extracted from its documentation, in order to detect how similar these two sets are. If they are vastly different, the gap might indicate considerable ageing of the documentation and a need to update it. Methods: In this paper we reduce the source code of 50 software systems to sets of key terms, each containing the concepts of one of the sampled systems. At the same time, we reduce the documentation of each system to another set of key terms. We then use four different approaches for set comparison to detect how similar the sets are. Results: Using the well-known Jaccard index as the benchmark for the comparisons, we found that the cosine distance has excellent comparative power, depending on the pre-training of the machine learning model. In particular, the SpaCy and FastText embeddings offer up to 80% and 90% similarity scores, respectively. Conclusion: For most of the sampled systems, the source code and the documentation tend to contain very similar concepts. Given the accuracy of one pre-trained model (e.g., FastText), it also becomes evident that a few systems show a measurable drift between the concepts contained in the documentation and in the source code.
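As a minimal sketch of the set comparisons described above, the snippet below computes the Jaccard index over two key-term sets and a cosine similarity. Note that the paper's cosine comparison relies on pre-trained SpaCy/FastText embeddings, whereas the term-frequency vectors and example key terms here are simplifying assumptions to keep the code self-contained.

```python
# A minimal sketch of the two set-comparison measures; the bag-of-words
# cosine is a simplifying assumption (the paper uses SpaCy/FastText
# embeddings), and the example key terms are hypothetical.
from collections import Counter
from math import sqrt

def jaccard(code_terms, doc_terms):
    a, b = set(code_terms), set(doc_terms)
    return len(a & b) / len(a | b) if a | b else 0.0

def cosine(code_terms, doc_terms):
    a, b = Counter(code_terms), Counter(doc_terms)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

code_terms = ["parser", "token", "tree", "node", "visitor"]     # hypothetical
doc_terms = ["parser", "grammar", "tree", "node", "traversal"]  # hypothetical
print(jaccard(code_terms, doc_terms), cosine(code_terms, doc_terms))
```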

    Towards hybrid methods for solving hard combinatorial optimization problems

Doctoral thesis defended at the Escuela Politécnica Superior of the Universidad Autónoma de Madrid on 4 September 200

    Estimation of distribution algorithms in logistics : Analysis, design, and application

This thesis considers the analysis, design, and application of Estimation of Distribution Algorithms (EDAs) in logistics. It approaches continuous nonlinear optimization problems (standard test problems and stochastic transportation problems) as well as location problems, strategic safety stock placement problems, and lot-sizing problems. The thesis adds to the existing literature by proposing theoretical advances for continuous EDAs and practical applications of discrete EDAs. Thus, it should be of interest to researchers from evolutionary computation as well as practitioners in need of efficient algorithms for the problems mentioned above.
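As a minimal sketch of what a discrete EDA does, the snippet below implements a UMDA-style loop: sample from a univariate probability model, select the fittest samples, and re-estimate the model. The toy objective, population size, and probability margins are illustrative assumptions, not the logistics formulations studied in the thesis.

```python
# A minimal sketch of a univariate discrete EDA (UMDA-style), assuming a toy
# binary maximization problem; the logistics problems from the thesis are not
# reproduced here.
import random

def umda(fitness, n, pop_size=50, parents=25, generations=100, seed=0):
    rng = random.Random(seed)
    probs = [0.5] * n                      # univariate marginal model
    best_val = float("-inf")
    for _ in range(generations):
        # Sample a population from the current probability model.
        pop = [[int(rng.random() < p) for p in probs] for _ in range(pop_size)]
        pop.sort(key=fitness, reverse=True)
        best_val = max(best_val, fitness(pop[0]))
        # Re-estimate the marginals from the fittest individuals.
        selected = pop[:parents]
        probs = [sum(ind[i] for ind in selected) / parents for i in range(n)]
        # Keep probabilities away from 0 and 1 to preserve exploration.
        probs = [min(max(p, 1.0 / n), 1.0 - 1.0 / n) for p in probs]
    return best_val

print(umda(sum, n=30))   # OneMax stand-in: the optimum is 30
```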

    An Evolutionary Multi-Objective Optimization Framework for Bi-level Problems

Genetic algorithms (GAs) are stochastic optimization methods inspired by the theory of evolution and natural selection. They are able to achieve good exploration of the solution space and accurate convergence toward the global optimum. GAs are highly modular and easily adaptable to specific real-world problems, which makes them one of the most efficient numerical optimization methods available. This work presents an optimization framework based on the Multi-Objective Genetic Algorithm for Structured Inputs (MOGASI), which combines modules and operators with specialized routines aimed at achieving enhanced performance on specific types of problems. MOGASI has dedicated methods for handling various types of data structures present in an optimization problem, as well as a pre-processing phase aimed at restricting the problem domain and reducing problem complexity. It has been extensively tested against a set of benchmarks well known in the literature and compared to a selection of state-of-the-art GAs. Furthermore, the algorithm framework was extended and adapted to Bi-level Programming Problems (BPPs). These are hierarchical optimization problems where the optimal solution of the bottom level constitutes part of the top-level constraints. One of the most promising methods for handling BPPs with metaheuristics is the so-called "nested" approach, and a framework extension is performed to support it. This strategy and its effectiveness are shown on two real-world BPPs, both falling in the category of pricing problems. The first application is the Network Pricing Problem (NPP), which concerns the setting of road network tolls by an authority that tries to maximize its profit, whereas users traveling on the network try to minimize their costs. A set of instances is generated to compare the optimization results of an exact solver with the MOGASI bi-level nested approach and to identify the problem sizes where the latter performs best. The second application is the Peak-load Pricing (PLP) problem, aimed at investigating possibilities for mitigating European air traffic congestion. The PLP problem is reformulated as a multi-objective BPP and solved with the MOGASI nested approach. The target is to modulate charges imposed on airspace users so as to redistribute air traffic at the European level. A large-scale instance based on real air traffic data covering the entire European airspace is solved. Results show that significant improvements in traffic distribution, in terms of both schedule displacement and airspace sector load, can be achieved through this simple, en-route charge modulation scheme.
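A minimal sketch of the nested evaluation scheme mentioned above: each upper-level candidate is scored by first solving the lower-level problem it induces and then evaluating the leader's objective on the follower's response. The toy single-user pricing model and the random-search driver below are assumptions for illustration only, not the NPP or PLP formulations used in the thesis.

```python
# A minimal sketch of nested bi-level evaluation, assuming a toy pricing
# model: one leader setting a toll, one follower choosing the cheaper of a
# tolled and a toll-free route. All constants are illustrative assumptions.
import random

FREE_ROUTE_COST = 10.0    # cost of the toll-free alternative (assumed)
TOLLED_BASE_COST = 6.0    # travel cost of the tolled route before the toll

def lower_level(toll):
    """Follower: pick the cheaper route given the leader's toll."""
    return "tolled" if TOLLED_BASE_COST + toll <= FREE_ROUTE_COST else "free"

def upper_level_profit(toll):
    """Leader: revenue equals the toll, but only if the route is still used."""
    return toll if lower_level(toll) == "tolled" else 0.0

def nested_random_search(trials=1000, seed=0):
    rng = random.Random(seed)
    best_toll, best_profit = 0.0, 0.0
    for _ in range(trials):
        toll = rng.uniform(0.0, FREE_ROUTE_COST)
        profit = upper_level_profit(toll)  # lower-level solve nested inside
        if profit > best_profit:
            best_toll, best_profit = toll, profit
    return best_toll, best_profit

print(nested_random_search())   # the best toll approaches 4.0 in this toy model
```

In a metaheuristic such as MOGASI, the random sampling of tolls would be replaced by the genetic search over upper-level variables, but the nested structure of the evaluation stays the same.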

    Computational complexity of evolutionary algorithms, hybridizations, and swarm intelligence

Bio-inspired randomized search heuristics such as evolutionary algorithms, hybridizations with local search, and swarm intelligence are very popular among practitioners, as they can be applied when the problem is not well understood or when there is not enough knowledge, time, or expertise to design problem-specific algorithms. Evolutionary algorithms simulate the natural evolution of species by iteratively applying evolutionary operators such as mutation, recombination, and selection to a set of solutions for a given problem. A recent trend is to hybridize evolutionary algorithms with local search, refining newly constructed solutions by hill climbing. Swarm intelligence comprises ant colony optimization as well as particle swarm optimization. These modern search paradigms rely on the collective intelligence of many individual agents to find good solutions for the problem at hand. Many empirical studies demonstrate the usefulness of these heuristics for a large variety of problems, but a thorough understanding is still far away. We regard these algorithms from the perspective of theoretical computer science and analyze the random time these heuristics need to optimize pseudo-Boolean problems. This is done in a mathematically rigorous sense, using tools known from the analysis of randomized algorithms, and it leads to asymptotic bounds on their computational complexity. This approach has been followed successfully for evolutionary algorithms, but the theory of hybrid algorithms and swarm intelligence is still in its infancy. Our results shed light on the asymptotic performance of these heuristics, increase our understanding of their dynamic behavior, and contribute to a rigorous theoretical foundation of randomized search heuristics.
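As a minimal sketch of one of the swarm intelligence paradigms mentioned above, the snippet below implements an ACO-style pheromone model for pseudo-Boolean maximization: solutions are constructed from per-bit pheromone values, the best-so-far solution reinforces the pheromones, and the pheromones stay within fixed bounds. The OneMax objective, the evaporation rate, and the bounds are illustrative assumptions, not the specific models analyzed in the thesis.

```python
# A minimal sketch of an ACO-style pheromone model for pseudo-Boolean
# maximization; objective, evaporation rate rho, and pheromone bounds are
# illustrative assumptions.
import random

def binary_aco(fitness=sum, n=50, iterations=5000, rho=0.1, seed=0):
    rng = random.Random(seed)
    tau = [0.5] * n                       # one pheromone value per bit position
    best, best_val = None, float("-inf")
    for _ in range(iterations):
        ant = [int(rng.random() < t) for t in tau]   # construct a solution
        val = fitness(ant)
        if val > best_val:
            best, best_val = ant, val
        # Evaporate and reinforce toward the best-so-far solution,
        # keeping pheromones inside [1/n, 1 - 1/n].
        tau = [min(max((1 - rho) * t + rho * b, 1.0 / n), 1.0 - 1.0 / n)
               for t, b in zip(tau, best)]
    return best_val

print(binary_aco())   # typically reaches the OneMax optimum of 50
```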

    A genetic programming hyper-heuristic approach to automated packing

This thesis presents a programme of research which investigated a genetic programming hyper-heuristic methodology to automate the heuristic design process for one-, two- and three-dimensional packing problems. Traditionally, heuristic search methodologies operate on a space of potential solutions to a problem. In contrast, a hyper-heuristic is a heuristic which searches a space of heuristics, rather than a solution space directly. The majority of hyper-heuristic research papers, so far, have involved selecting a heuristic, or sequence of heuristics, from a set pre-defined by the practitioner. Less well studied are hyper-heuristics which can create new heuristics from a set of potential components. This thesis presents a genetic programming hyper-heuristic which makes it possible to automatically generate heuristics for a wide variety of packing problems. The genetic programming algorithm creates heuristics by intelligently combining components. The evolved heuristics are shown to be highly competitive with human-created heuristics. The methodology is first applied to one-dimensional bin packing, where the evolved heuristics are analysed to determine their quality, specialisation, robustness, and scalability. Importantly, it is shown that these heuristics can be reused on unseen problems. The methodology is then applied to the two-dimensional packing problem to determine whether automatic heuristic generation is possible for this domain. The three-dimensional bin packing and knapsack problems are then addressed. It is shown that the genetic programming hyper-heuristic methodology can evolve human-competitive heuristics for the one-, two- and three-dimensional cases of both of these problems. No change of parameters or code is required between runs. This represents the first packing algorithm in the literature able to claim human-competitive results in such a wide variety of packing domains.
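A minimal sketch of the component-based idea behind heuristic generation: a packing heuristic is just a scoring function over simple attributes (remaining bin space, item size), and the packer places each item into the highest-scoring feasible bin. The two scoring functions below are hand-written stand-ins for expressions a genetic programming hyper-heuristic might evolve; they are assumptions for illustration, not heuristics from the thesis.

```python
# A minimal sketch of heuristic generation from components: a heuristic is a
# scoring function over (remaining bin space, item size), and the packer puts
# each item into the highest-scoring feasible bin. Both scoring functions are
# hand-written stand-ins for GP-evolved expressions.
def pack(items, capacity, score):
    bins = []                                  # each entry is a bin's free space
    for item in items:
        feasible = [i for i, free in enumerate(bins) if free >= item]
        if feasible:
            target = max(feasible, key=lambda i: score(bins[i], item))
            bins[target] -= item
        else:
            bins.append(capacity - item)       # open a new bin
    return len(bins)                           # number of bins used

best_fit = lambda free, item: -free                          # prefer the fullest feasible bin
evolved_like = lambda free, item: item - (free - item) ** 2  # a GP-style expression, flattened

items = [7, 5, 4, 4, 3, 3, 2, 2, 1, 1]
print(pack(items, capacity=10, score=best_fit),
      pack(items, capacity=10, score=evolved_like))
```

Swapping in a different scoring expression changes the packing behaviour without touching the packing loop, which is what makes the space of heuristics searchable by genetic programming.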