1,870 research outputs found

    Learning to Race through Coordinate Descent Bayesian Optimisation

    Full text link
    In the automation of many kinds of processes, the observable outcome can often be described as the combined effect of an entire sequence of actions, or controls, applied throughout its execution. In these cases, strategies to optimise control policies for individual stages of the process might not be applicable, and instead the whole policy might have to be optimised at once. On the other hand, the cost to evaluate the policy's performance might also be high, being desirable that a solution can be found with as few interactions as possible with the real system. We consider the problem of optimising control policies to allow a robot to complete a given race track within a minimum amount of time. We assume that the robot has no prior information about the track or its own dynamical model, just an initial valid driving example. Localisation is only applied to monitor the robot and to provide an indication of its position along the track's centre axis. We propose a method for finding a policy that minimises the time per lap while keeping the vehicle on the track using a Bayesian optimisation (BO) approach over a reproducing kernel Hilbert space. We apply an algorithm to search more efficiently over high-dimensional policy-parameter spaces with BO, by iterating over each dimension individually, in a sequential coordinate descent-like scheme. Experiments demonstrate the performance of the algorithm against other methods in a simulated car racing environment.Comment: Accepted as conference paper for the 2018 IEEE International Conference on Robotics and Automation (ICRA

    Quantifying the Impact of Parameter Tuning on Nature-Inspired Algorithms

    Full text link
    The problem of parameterization is often central to the effective deployment of nature-inspired algorithms. However, finding the optimal set of parameter values for a combination of problem instance and solution method is highly challenging, and few concrete guidelines exist on how and when such tuning may be performed. Previous work tends to either focus on a specific algorithm or use benchmark problems, and both of these restrictions limit the applicability of any findings. Here, we examine a number of different algorithms, and study them in a "problem agnostic" fashion (i.e., one that is not tied to specific instances) by considering their performance on fitness landscapes with varying characteristics. Using this approach, we make a number of observations on which algorithms may (or may not) benefit from tuning, and in which specific circumstances.Comment: 8 pages, 7 figures. Accepted at the European Conference on Artificial Life (ECAL) 2013, Taormina, Ital

    A new sequential covering strategy for inducing classification rules with ant colony algorithms

    Get PDF
    Ant colony optimization (ACO) algorithms have been successfully applied to discover a list of classification rules. In general, these algorithms follow a sequential covering strategy, where a single rule is discovered at each iteration of the algorithm in order to build a list of rules. The sequential covering strategy has the drawback of not coping with the problem of rule interaction, i.e., the outcome of a rule affects the rules that can be discovered subsequently since the search space is modified due to the removal of examples covered by previous rules. This paper proposes a new sequential covering strategy for ACO classification algorithms to mitigate the problem of rule interaction, where the order of the rules is implicitly encoded as pheromone values and the search is guided by the quality of a candidate list of rules. Our experiments using 18 publicly available data sets show that the predictive accuracy obtained by a new ACO classification algorithm implementing the proposed sequential covering strategy is statistically significantly higher than the predictive accuracy of state-of-the-art rule induction classification algorithms

    Heuristic generation via parameter tuning for online bin packing

    Get PDF
    Online bin packing requires immediate decisions to be made for placing an incoming item one at a time into bins of fixed capacity without causing any overflow. The goal is to maximise the average bin fullness after placement of a long stream of items. A recent work describes an approach for solving this problem based on a ‘policy matrix’ representation in which each decision option is independently given a value and the highest value option is selected. A policy matrix can also be viewed as a heuristic with many parameters and then the search for a good policy matrix can be treated as a parameter tuning process. In this study, we show that the Irace parameter tuning algorithm produces heuristics which outperform the standard human designed heuristics for various instances of the online bin packing problem

    Hybridization of Evolutionary Operators with Elitist Iterated Racing for the Simulation Optimization of Traffic Lights Programs.

    Get PDF
    In the traffic light scheduling problem, the evaluation of candidate solutions requires the simulation of a process under various (traffic) scenarios. Thus, good solutions should not only achieve good objective function values, but they must be robust (low variance) across all different scenarios. Previous work has shown that combining IRACE with evolutionary operators is effective for this task due to the power of evolutionary operators in numerical optimization. In this article, we further explore the hybridization of evolutionary operators and the elitist iterated racing of IRACE for the simulation–optimization of traffic light programs. We review previous works from the literature to find the evolutionary operators performing the best when facing this problem to propose new hybrid algorithms. We evaluate our approach over a realistic case study derived from the traffic network of Málaga (Spain) with 275 traffic lights that should be scheduled optimally. The experimental analysis reveals that the hybrid algorithm comprising IRACE plus differential evolution offers statistically better results than the other algorithms when the budget of simulations is low. In contrast, IRACE performs better than the hybrids for a high simulations budget, although the optimization time is much longer.This research was partially funded by the University of Malaga, Andaluc ´ ´ıa Tech and the project TAILOR Grant #952215, H2020-ICT-2019-3. C. Cintrano is supported by a FPI grant (BES-2015-074805) from Spanish MINECO. M. Lopez-Ib ´ a´nez is a ˜ “Beatriz Galindo” Senior Distinguished Researcher (BEAGAL 18/00053) funded by the Ministry of Science and Innovation of the Spanish Government. J. Ferrer is supported by a postdoc grant (DOC/00488) funded by the Andalusian Ministry of Economic Transformation, Industry, Knowledge and Universities

    From Parameter Tuning to Dynamic Heuristic Selection

    Get PDF
    The importance of balance between exploration and exploitation plays a crucial role while solving combinatorial optimization problems. This balance is reached by two general techniques: by using an appropriate problem solver and by setting its proper parameters. Both problems were widely studied in the past and the research process continues up until now. The latest studies in the field of automated machine learning propose merging both problems, solving them at design time, and later strengthening the results at runtime. To the best of our knowledge, the generalized approach for solving the parameter setting problem in heuristic solvers has not yet been proposed. Therefore, the concept of merging heuristic selection and parameter control have not been introduced. In this thesis, we propose an approach for generic parameter control in meta-heuristics by means of reinforcement learning (RL). Making a step further, we suggest a technique for merging the heuristic selection and parameter control problems and solving them at runtime using RL-based hyper-heuristic. The evaluation of the proposed parameter control technique on a symmetric traveling salesman problem (TSP) revealed its applicability by reaching the performance of tuned in online and used in isolation underlying meta-heuristic. Our approach provides the results on par with the best underlying heuristics with tuned parameters.:1 Introduction 1 1.1 Motivation 1 1.2 Research objective 2 1.3 Solution overview 2 2 Background and RelatedWork Analysis 3 2.1 Optimization Problems and their Solvers 3 2.2 Heuristic Solvers for Optimization Problems 9 2.3 Setting Algorithm Parameters 19 2.4 Combined Algorithm Selection and Hyper-Parameter Tuning Problem 27 2.5 Conclusion on Background and Related Work Analysis 28 3 Online Selection Hyper-Heuristic with Generic Parameter Control 31 3.1 Combined Parameter Control and Algorithm Selection Problem 31 3.2 Search Space Structure 32 3.3 Parameter Prediction Process 34 3.4 Low-Level Heuristics 35 3.5 Conclusion of Concept 36 4 Implementation Details 37 4.2 Search Space 40 4.3 Prediction Process 43 4.4 Low Level Heuristics 48 4.5 Conclusion 52 5 Evaluation 55 5.1 Optimization Problem 55 5.2 Environment Setup 56 5.3 Meta-heuristics Tuning 56 5.4 Concept Evaluation 60 5.5 Analysis of HH-PC Settings 74 5.6 Conclusion 79 6 Conclusion 81 7 FutureWork 83 7.1 Prediction Process 83 7.2 Search Space 84 7.3 Evaluations and Benchmarks 84 Bibliography 87 A Evaluation Results 99 A.1 Results in Figures 99 A.2 Results in numbers 10
    corecore