
    Noisy combinatorial optimisation with evolutionary algorithms

    The main focus of this research is to determine efficient evolutionary optimisation approaches for solving noisy combinatorial problems. Initially, we present an empirical study of a range of evolutionary algorithms applied to various noisy combinatorial optimisation problems. There are four sets of experiments. The first looks at several toy problems, such as OneMax and other linear problems. We find that the Univariate Marginal Distribution Algorithm (UMDA) and the Paired-Crossover Evolutionary Algorithm (PCEA) are the only ones able to cope robustly with noise within a reasonable fixed time budget. In the second stage, UMDA and PCEA are tested on more complex noisy problems: SubsetSum, Knapsack and SetCover. Both perform well under increasing levels of noise, with UMDA being the better of the two. In the third stage, we consider two noisy multi-objective problems (CountingOnesCountingZeros and a multi-objective formulation of SetCover). We compare several adaptations of UMDA for multi-objective problems with the Simple Evolutionary Multi-objective Optimiser (SEMO) and NSGA-II. In the last stage of the empirical analysis, a realistic problem is considered: path planning for ground surveillance with Unmanned Aerial Vehicles. We conclude that UMDA and its variants can be highly effective on a variety of noisy combinatorial optimisation problems, outperforming many other evolutionary algorithms.
Next, we study the use of voting mechanisms in populations, and introduce a new Voting algorithm which can solve OneMax and Jump in O(n log n), even for gaps as large as O(n). More significantly, the algorithm solves OneMax with added posterior noise in O(n log n) when the variance of the noise distribution is σ² = O(n), and in O(σ² log n) when the noise variance is greater than this. We assume only that the noise distribution has finite mean and variance and (for the larger noise case) that it is unimodal. Building upon this promising performance, we consider other noise models prevalent in optimisation and learning, and show that the Voting algorithm solves OneMax efficiently in the presence of these noise variants. We also examine its performance on arbitrary linear and monotonic functions. The Voting algorithm fails on LeadingOnes, but we give a variant which can solve the problem in O(n log n). We empirically study the use of voting in population-based algorithms (UMDA, PCEA and cGA) and show that this can be effective for large population sizes.
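
As a minimal illustration of the estimation-of-distribution approach that performs best above, here is a sketch of UMDA on OneMax with additive Gaussian posterior noise. The population sizes, noise level and margin clamping are illustrative assumptions, not the thesis's experimental settings.

```python
import numpy as np

def umda_noisy_onemax(n=100, pop=200, mu=50, sigma=1.0, gens=300, seed=0):
    """Minimal UMDA on OneMax with additive Gaussian posterior noise:
    sample from a product distribution, select the mu best under the
    noisy fitness, and re-estimate the per-bit marginals."""
    rng = np.random.default_rng(seed)
    p = np.full(n, 0.5)                              # per-bit marginals
    for _ in range(gens):
        X = rng.random((pop, n)) < p                 # sample population
        f = X.sum(axis=1) + rng.normal(0.0, sigma, pop)  # noisy OneMax
        p = X[np.argsort(f)[-mu:]].mean(axis=0)      # marginals of survivors
        p = np.clip(p, 1.0 / n, 1.0 - 1.0 / n)       # margins prevent fixation
    return (p > 0.5).astype(int)

x = umda_noisy_onemax()
print("noise-free OneMax value of returned point:", x.sum())
```

The clipping of the marginals away from 0 and 1 is a standard guard against premature fixation; since the noise only perturbs the ranking of individuals, averaging over a large selected set is what lets the marginal estimates remain reliable.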

    Approximation Algorithms for Resource Allocation

    This thesis is devoted to designing new techniques and algorithms for combinatorial optimization problems arising in various applications of resource allocation. Resource allocation refers to a class of problems where scarce resources must be distributed among competing agents while maintaining certain optimization criteria. Examples include scheduling jobs on one or multiple machines while maintaining system performance; assigning advertisements to bidders, or items to people, maximizing profit or social fairness; and allocating servers or channels satisfying networking requirements. Altogether they comprise a wide variety of combinatorial optimization problems. However, a majority of these problems are NP-hard in nature, and therefore the goal herein is to develop approximation algorithms that approximate the optimal solution as closely as possible in polynomial time. The thesis addresses two main directions. First, we develop several new techniques, predominantly a new linear programming rounding methodology and a constructive aspect of a well-known probabilistic method, the Lovász Local Lemma (LLL). Second, we employ these techniques in applications of resource allocation, obtaining substantial improvements over known results. Our research also spurs new directions of study; we introduce new models for achieving energy efficiency in scheduling and a novel framework for assigning advertisements in cellular networks. Both of these lead to a variety of interesting questions. Our linear programming rounding methodology is a significant generalization of two major rounding approaches in the theory of approximation algorithms, namely dependent rounding and the iterative relaxation procedure. Our constructive version of the LLL leads to the first algorithmic results for many combinatorial problems. In addition, it settles a major open question by obtaining a constant-factor approximation algorithm for the Santa Claus problem, an NP-hard resource allocation problem that has received much attention in the last several years. Throughout this thesis, we study a number of applications related to scheduling jobs on unrelated parallel machines, such as provisionally shutting down machines to save energy, selectively dropping outliers to improve system performance, and handling machines with hard capacity bounds on the number of jobs they can process. Hard capacity constraints arise naturally in many other applications and often render a hitherto simple combinatorial optimization problem difficult. In this thesis, we encounter many such instances of hard capacity constraints, namely in budgeted allocation of advertisements for cellular networks, overlay network design, and in classical problems like vertex cover, set cover and k-median.
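
The dependent-rounding machinery developed in the thesis goes well beyond a short example, but the basic LP-rounding template it generalizes can be sketched on set cover: solve the linear relaxation, then include each set with probability proportional to its fractional value. The toy instance and the ln(n) scaling are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog

# Toy set-cover instance (illustrative): universe {0,...,4}, unit costs.
sets = [{0, 1, 2}, {1, 3}, {2, 4}, {3, 4}, {0, 4}]
n_elems, costs = 5, np.ones(len(sets))

# LP relaxation: min c.x  s.t.  sum_{S : e in S} x_S >= 1,  0 <= x <= 1.
A_ub = np.array([[-1.0 if e in S else 0.0 for S in sets] for e in range(n_elems)])
lp = linprog(costs, A_ub=A_ub, b_ub=-np.ones(n_elems), bounds=[(0, 1)] * len(sets))

# Randomized rounding: keep set S with probability min(1, x_S * ln n);
# repeating until feasible yields an O(log n) approximation in expectation.
rng = np.random.default_rng(0)
scale = np.log(n_elems)
while True:
    picked = [i for i, x in enumerate(lp.x) if rng.random() < min(1.0, x * scale)]
    covered = set().union(*(sets[i] for i in picked)) if picked else set()
    if covered >= set(range(n_elems)):
        break
print("cover:", picked, "cost:", costs[picked].sum())
```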

    Co-evolutionary Hybrid Bi-level Optimization

    Multi-level optimization stems from the need to tackle complex problems involving multiple decision makers. Two-level optimization, referred to as "bi-level optimization", occurs when two decision makers each control only part of the decision variables but impact each other (e.g., objective value, feasibility). Bi-level problems are sequential by nature and can be represented as nested optimization problems in which one problem (the "upper level") is constrained by another (the "lower level"). The nested structure is a real obstacle and can be highly time consuming when the lower level is NP-hard. Consequently, classical nested optimization should be avoided. Some surrogate-based approaches have been proposed to approximate the lower-level objective value function (or variables) in order to reduce the number of times the lower level is globally optimized. Unfortunately, such a methodology is not applicable to large-scale and combinatorial bi-level problems. After a deep study of theoretical properties and a survey of existing applications that are bi-level by nature, problems which can benefit from a bi-level reformulation are investigated. A first contribution of this work is a novel bi-level clustering approach. Extending the well-known "uncapacitated k-median problem", it is shown that clustering can easily be modeled as a two-level optimization problem using decomposition techniques. The resulting two-level problem is then turned into a bi-level problem, offering the possibility to combine distance metrics in a hierarchical manner. The novel bi-level clustering problem has a very interesting property that enables us to tackle it with classical nested approaches: its lower-level problem can be solved in polynomial time. In cooperation with the Luxembourg Centre for Systems Biomedicine (LCSB), this new clustering model has been applied to real datasets such as disease maps (e.g., Parkinson's, Alzheimer's). Using a novel hybrid and parallel genetic algorithm as the optimization approach, the results obtained after a campaign of experiments produce new knowledge compared to classical clustering techniques that combine distance metrics in the usual manner. This bi-level clustering model has the advantage that its lower level can be solved in polynomial time, although the global problem is by definition NP-hard. Therefore, further investigations were undertaken to tackle more general bi-level problems in which the lower-level problem does not present any specific advantageous properties. Since the lower-level problem can be very expensive to solve, the focus turned to surrogate-based approaches and hyper-parameter optimization techniques, with the aim of approximating the lower-level problem and reducing the number of global lower-level optimizations. By adapting the well-known Bayesian optimization algorithm to solve general bi-level problems, the number of expensive lower-level optimizations is dramatically reduced while very accurate solutions are still obtained. The resulting solutions and the number of spared lower-level optimizations are compared, after a campaign of experiments on official bi-level benchmarks, to the results of the bi-level evolutionary algorithm based on quadratic approximations (BLEAQ). Although both approaches are very accurate, the bi-level Bayesian version requires fewer lower-level objective function calls.
Surrogate-based approaches are restricted to small-scale and continuous bi-level problems, although many real applications are combinatorial by nature. As in the continuous case, a study was performed to apply machine learning strategies. Instead of approximating the lower-level solution value, new approximation algorithms were designed for the discrete/combinatorial case. Using the principle employed in GP hyper-heuristics, heuristics are trained to tackle efficiently the NP-hard lower level of bi-level problems. This automatic generation of heuristics makes it possible to break the nested structure into two separate phases: training lower-level heuristics and solving the upper-level problem with the new heuristics. Here, a second modeling contribution is introduced: a novel large-scale, mixed-integer bi-level problem dealing with pricing in the cloud, the Bi-level Cloud Pricing Optimization Problem (BCPOP). After a series of experiments that consisted of training heuristics on various lower-level instances of the BCPOP and using them to tackle the bi-level problem itself, the results obtained are compared to those of the "cooperative coevolutionary algorithm for bi-level optimization" (COBRA). Although training heuristics breaks the nested structure, a two-phase optimization is still required. Therefore, the emphasis was put on training heuristics while optimizing the upper-level problem using competitive co-evolution. Instead of adopting the classical decomposition scheme, as done by COBRA, which suffers from the strong epistatic links between lower-level and upper-level variables, co-evolving the solution and the means to get to it can cope with these epistatic issues. The "CARBON" algorithm developed in this thesis is a competitive, hybrid co-evolutionary algorithm designed for this purpose. To validate the potential of CARBON, numerical experiments were designed and the results compared to state-of-the-art algorithms. These results demonstrate that "CARBON" makes it possible to address nested optimization efficiently.
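
To make the cost of the nested structure concrete, here is a minimal sketch of classical nested bi-level optimization on a toy continuous instance; the quadratic objectives and grid resolutions are invented for illustration. Every upper-level candidate triggers a complete lower-level solve, which is precisely what the surrogate-based and hyper-heuristic approaches above try to avoid.

```python
import numpy as np

# Toy bi-level instance (illustrative): the leader picks x, the follower
# responds with its best reply y*(x), and both payoffs depend on (x, y).
def lower_objective(x, y):
    return (y - x) ** 2 + 0.1 * y           # follower minimizes this

def upper_objective(x, y):
    return (x - 1.0) ** 2 + (y - 1.0) ** 2  # leader minimizes this

ys = np.linspace(-2.0, 2.0, 401)

def best_response(x):
    """One full lower-level solve (here a grid search) per upper candidate."""
    return ys[np.argmin(lower_objective(x, ys))]

# Classical nested scheme: each upper-level evaluation pays for a complete
# lower-level optimization -- prohibitive when the lower level is NP-hard.
best = min(((x, best_response(x)) for x in np.linspace(-2.0, 2.0, 401)),
           key=lambda xy: upper_objective(*xy))
print("leader x = %.3f, follower best response y = %.3f" % best)
```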

    Gradient boosting in automatic machine learning: feature selection and hyperparameter optimization

    The goal of automatic machine learning (AutoML) is to automate all aspects of model selection in (supervised) predictive modeling. This thesis deals with gradient boosting techniques in the context of AutoML, with a focus on gradient tree boosting and component-wise gradient boosting. Both techniques share a common methodology, but their goals are quite different: while gradient tree boosting is widely used in machine learning as a powerful prediction algorithm, component-wise gradient boosting's strength lies in feature selection and the modeling of high-dimensional data. Extensions of component-wise gradient boosting to multidimensional prediction functions are considered as well. The challenge of hyperparameter optimization for these algorithms is discussed, with a focus on Bayesian optimization and efficient early stopping strategies. A large-scale random search over the hyperparameters of various machine learning algorithms shows the critical influence of hyperparameter configurations on model quality; this data can serve as a foundation for new AutoML and meta-learning approaches. Furthermore, advanced feature selection strategies are summarized and a new method based on shadow features is introduced.
Finally, an AutoML approach based on the results and best practices for feature selection and hyperparameter optimization is proposed, with the goal of simplifying and stabilizing AutoML while maintaining high prediction accuracy. It is compared to AutoML approaches that use much more complex search spaces and ensembling techniques. Four software packages for the statistical programming language R have been newly developed or extended as part of this thesis: mlrMBO, a general framework for Bayesian optimization; autoxgboost, an AutoML system that heavily utilizes gradient tree boosting; compboost, a modular framework for component-wise boosting written in C++; and gamboostLSS, a framework for component-wise boosting of generalized additive models for location, scale and shape.
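
To fix ideas, the following is a minimal numpy sketch of component-wise L2 boosting: each iteration fits every single-feature base learner to the current residuals and updates only the best one, which is why the procedure performs feature selection as it fits. The simulated data, learning rate and iteration count are illustrative assumptions, not the behaviour of compboost or gamboostLSS.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 20
X = rng.normal(size=(n, p))
y = 3 * X[:, 0] - 2 * X[:, 3] + rng.normal(size=n)  # only features 0 and 3 matter

offset = y.mean()                        # intercept as constant offset
beta, nu = np.zeros(p), 0.1              # coefficients and learning rate
resid = y - offset
for _ in range(500):
    # Fit each univariate least-squares base learner to the residuals.
    coefs = X.T @ resid / (X ** 2).sum(axis=0)
    sse = ((resid[:, None] - X * coefs) ** 2).sum(axis=0)
    j = np.argmin(sse)                   # update the best component only
    beta[j] += nu * coefs[j]             # small step on that coordinate
    resid = y - offset - X @ beta

print("estimated nonzero coefficients:", np.nonzero(np.abs(beta) > 0.05)[0])
```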

    A Study of Methods for Constructing Classifier Ensembles and Their Applications

    Artificial intelligence is devoted to the creation of computer systems with intelligent behaviour. Within this area, machine learning studies the creation of systems that learn by themselves. One type of machine learning is supervised learning, in which the system is given both the inputs and the expected output, and learns from these data. A system of this kind is called a classifier. It sometimes happens that, in the set of examples the system uses to learn, the number of examples of one class is much larger than the number of examples of another; such datasets are called imbalanced. The combination of several classifiers is what is called an "ensemble", and it often yields better results than any of its individual members. One of the keys to the good performance of ensembles is diversity. This thesis focuses on the development of new ensemble construction algorithms, centred on diversity-increasing techniques and on imbalanced problems. Additionally, these techniques are applied to the solution of several industrial problems. Ministerio de Economía y Competitividad, project TIN-2011-2404.
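
As a minimal illustration of the diversity principle, the sketch below builds a bagging-style ensemble whose members differ only because each is trained on a different bootstrap sample, then combines them by majority vote; the synthetic imbalanced dataset and the choice of decision trees are illustrative assumptions, not one of the new construction methods from the thesis.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# Synthetic imbalanced binary problem: 90% of examples in one class.
X, y = make_classification(n_samples=500, weights=[0.9, 0.1], random_state=0)

# Each member sees a different bootstrap sample -- one simple source
# of the diversity that ensembles rely on.
members = []
for _ in range(25):
    idx = rng.integers(0, len(X), len(X))
    members.append(DecisionTreeClassifier().fit(X[idx], y[idx]))

# Combine the members by majority vote.
votes = np.stack([m.predict(X) for m in members])
ensemble_pred = (votes.mean(axis=0) > 0.5).astype(int)
print("training accuracy of the vote:", (ensemble_pred == y).mean())
```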

    A Comprehensive Survey on Particle Swarm Optimization Algorithm and Its Applications

    Particle swarm optimization (PSO) is a heuristic global optimization method, originally proposed by Kennedy and Eberhart in 1995, and now one of the most commonly used optimization techniques. This survey presents a comprehensive investigation of PSO. On the one hand, we review advances in PSO itself, including its modifications (quantum-behaved PSO, bare-bones PSO, chaotic PSO, and fuzzy PSO), population topologies (fully connected, von Neumann, ring, star, random, etc.), hybridizations (with genetic algorithms, simulated annealing, tabu search, artificial immune systems, ant colony algorithms, artificial bee colony, differential evolution, harmony search, and biogeography-based optimization), extensions (to multiobjective, constrained, discrete, and binary optimization), theoretical analysis (parameter selection and tuning, and convergence analysis), and parallel implementations (multicore, multiprocessor, GPU, and cloud computing). On the other hand, we survey applications of PSO in the following fields: electrical and electronic engineering, automation control systems, communication theory, operations research, mechanical engineering, fuel and energy, medicine, chemistry, and biology. It is hoped that this survey will be useful to researchers studying PSO algorithms.
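
For readers new to the method, here is a minimal global-best PSO sketch on the sphere function. The inertia weight and acceleration coefficients are common textbook values; the surveyed variants modify exactly these update rules.

```python
import numpy as np

def pso(f, dim=10, swarm=30, iters=500, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal global-best PSO: velocities blend inertia, attraction to each
    particle's personal best, and attraction to the swarm's global best."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, (swarm, dim))
    v = np.zeros((swarm, dim))
    pbest, pbest_f = x.copy(), np.apply_along_axis(f, 1, x)
    gbest = pbest[np.argmin(pbest_f)]
    for _ in range(iters):
        r1, r2 = rng.random((2, swarm, dim))       # fresh randomness each step
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = x + v
        fx = np.apply_along_axis(f, 1, x)
        better = fx < pbest_f                      # update personal bests
        pbest[better], pbest_f[better] = x[better], fx[better]
        gbest = pbest[np.argmin(pbest_f)]          # update global best
    return gbest, pbest_f.min()

best, val = pso(lambda z: (z ** 2).sum())          # sphere function
print("best value found:", val)
```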

    Classification of the Existing Knowledge Base of OR/MS Research and Practice (1990-2019) using a Proposed Classification Scheme

    Operations Research/Management Science (OR/MS) has traditionally been defined as the discipline that applies advanced analytical methods to help make better and more informed decisions. The purpose of this paper is to present an analysis of the existing knowledge base of OR/MS research and practice using a proposed keywords-based approach. A conceptual structure is necessary in order to place the findings of our keyword analysis in context. Towards this, we first present a classification scheme that relies on keywords that appeared in articles published in important OR/MS journals from 1990 to 2019 (over 82,000 articles). Our classification scheme applies a methodological approach to keyword selection and its systematic classification, wherein the approximately 1,300 most frequently used keywords (these keywords and their derivations account, in cumulative percentage terms, for more than 45% of the approximately 290,000 keyword occurrences used by authors to represent the content of their articles) were selected and organised into a scheme with seven top-level categories and multiple levels of sub-categories. The scheme identifies the most commonly used keywords relating to OR/MS problems, modeling techniques and applications. Next, we use this proposed scheme to present an analysis of the last 30 years, in three distinct time periods, showing the changes in the OR/MS literature. The contribution of the paper is thus twofold: (a) the development of a proposed discipline-based classification of keywords (in the spirit of the ACM Computing Classification System and the AMS Mathematics Subject Classification), and (b) an analysis of OR/MS research and practice using the proposed classification.
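
The cumulative-coverage computation behind the 45% figure is easy to reproduce in principle. The sketch below counts keyword occurrences on a made-up corpus and walks down the frequency-ranked list until a coverage threshold is met; the article data is an illustrative stand-in, not the authors' 290,000-occurrence dataset.

```python
from collections import Counter

# Illustrative stand-in for the author-supplied keyword lists of a corpus.
articles = [
    ["genetic algorithms", "scheduling", "heuristics"],
    ["scheduling", "integer programming"],
    ["simulation", "heuristics", "scheduling"],
    ["genetic algorithms", "simulation"],
]

counts = Counter(kw for kws in articles for kw in kws)
total = sum(counts.values())

# Walk down the frequency-ranked keywords until 45% of all
# occurrences are accounted for.
covered, selected = 0, []
for kw, c in counts.most_common():
    covered += c
    selected.append(kw)
    if covered / total >= 0.45:
        break
print("keywords covering 45% of occurrences:", selected)
```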

    Proceedings of the 2009 Joint Workshop of Fraunhofer IOSB and Institute for Anthropomatics, Vision and Fusion Laboratory

    The joint workshop of the Fraunhofer Institute of Optronics, System Technologies and Image Exploitation IOSB, Karlsruhe, and the Vision and Fusion Laboratory (Institute for Anthropomatics, Karlsruhe Institute of Technology (KIT)) has been organized annually since 2005, with the aim of reporting on the latest research and development findings of the doctoral students of both institutions. This book provides a collection of 16 technical reports on the research results presented at the 2009 workshop.