9 research outputs found

    An apprenticeship learning hyper-heuristic for vehicle routing in HyFlex

    Apprenticeship learning occurs via observations while an expert is in action. A hyper-heuristic is a search method or a learning mechanism that controls a set of low-level heuristics or combines different heuristic components to generate heuristics for solving a given computationally hard problem. In this study, we investigate a novel apprenticeship learning-based approach that automatically generates a hyper-heuristic for vehicle routing. This approach can itself be considered a hyper-heuristic that operates in a train-and-test fashion. The expert is a state-of-the-art hyper-heuristic that won a previous hyper-heuristic competition. Trained on small vehicle routing instances, the learning approach yields various classifiers, each capturing different actions that the expert hyper-heuristic performs during the search process. Those classifiers are then used to produce a hyper-heuristic which is potentially capable of generalizing the actions of the expert hyper-heuristic while solving unseen instances. The experimental results on vehicle routing using the Hyper-heuristic Flexible (HyFlex) framework show that the apprenticeship learning-based hyper-heuristic delivers outstanding performance compared to the expert and some other previously proposed hyper-heuristics.

    Improving performance of a hyper-heuristic using a multilayer perceptron for vehicle routing

    A hyper-heuristic is a heuristic optimisation method which generates or selects heuristics (move operators) based on a set of components while solving a computationally difficult problem. Apprenticeship learning arises while observing the behavior of an expert in action. In this study, we use a multilayer perceptron (MLP) as an apprenticeship learning algorithm to improve upon the performance of a state-of-the-art selection hyper-heuristic used as an expert, which was the winner of a cross-domain heuristic search challenge (CHeSC 2011). We collect data based on the relevant actions of the expert while solving selected vehicle routing problem instances from CHeSC 2011. An MLP is then trained on this data to build a selection hyper-heuristic consisting of a number of classifiers for heuristic selection, parameter control, and move acceptance. The generated selection hyper-heuristic is tested on unseen vehicle routing problem instances. The empirical results indicate that the MLP-based hyper-heuristic achieves better performance than the expert and some previously proposed algorithms.

    The General Combinatorial Optimization Problem: Towards Automated Algorithm Design

    This paper defines a new combinatorial optimisation problem, namely General Combinatorial Optimisation Problem (GCOP), whose decision variables are a set of parametric algorithmic components, i.e. algorithm design decisions. The solutions of GCOP, i.e. compositions of algorithmic components, thus represent different generic search algorithms. The objective of GCOP is to find the optimal algorithmic compositions for solving the given optimisation problems. Solving the GCOP is thus equivalent to automatically designing the best algorithms for optimisation problems. Despite recent advances, the evolutionary computation and optimisation research communities are yet to embrace formal standards that underpin automated algorithm design. In this position paper, we establish GCOP as a new standard to define different search algorithms within one unified model. We demonstrate the new GCOP model to standardise various search algorithms as well as selection hyper-heuristics. A taxonomy is defined to distinguish several widely used terminologies in automated algorithm design, namely automated algorithm composition, configuration and selection. We would like to encourage a new line of exciting research directions addressing several challenging research issues including algorithm generality, algorithm reusability, and automated algorithm design

    A general deep reinforcement learning hyperheuristic framework for solving combinatorial optimization problems

    Many problem-specific heuristic frameworks have been developed to solve combinatorial optimization problems, but these frameworks do not generalize well to other problem domains. Metaheuristic frameworks aim to be more generalizable than traditional heuristics; however, their performance suffers from poor selection of low-level heuristics (operators) during the search process. An example of heuristic selection in a metaheuristic framework is the adaptive layer of the popular Adaptive Large Neighborhood Search (ALNS) framework. Here, we propose a selection hyperheuristic framework that uses Deep Reinforcement Learning (Deep RL) as an alternative to the adaptive layer of ALNS. Unlike the adaptive layer, which only considers heuristics' past performance for future selection, a Deep RL agent is able to take into account additional information from the search process, e.g., the difference in objective value between iterations, to make better decisions. This is due to the representation power of Deep Learning methods and the decision-making capability of the Deep RL agent, which can learn to adapt to different problems and instance characteristics. In this paper, by integrating the Deep RL agent into the ALNS framework, we introduce Deep Reinforcement Learning Hyperheuristic (DRLH), a general framework for solving a wide variety of combinatorial optimization problems, and show that our framework is better at selecting low-level heuristics at each step of the search process than ALNS and Uniform Random Selection (URS). Our experiments also show that while ALNS cannot properly handle a large pool of heuristics, DRLH is not negatively affected by increasing the number of heuristics.
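
The abstract above contrasts DRLH with an adaptive layer that selects heuristics only from their past performance. As a rough illustration of that kind of layer, here is a minimal sketch assuming roulette-wheel selection over weights that are smoothed toward recent scores. The class name `AdaptiveLayer`, the `reaction` parameter and the scoring scheme are hypothetical choices for illustration and are not taken from the paper.

```python
import random


class AdaptiveLayer:
    """Illustrative performance-based heuristic selector (assumed design)."""

    def __init__(self, n_heuristics, reaction=0.1, seed=0):
        # Every heuristic starts with the same weight.
        self.weights = [1.0] * n_heuristics
        # Assumed smoothing factor: how strongly a new score moves the weight.
        self.reaction = reaction
        self.rng = random.Random(seed)

    def select(self):
        """Roulette-wheel selection: probability proportional to weight."""
        indices = range(len(self.weights))
        return self.rng.choices(indices, weights=self.weights, k=1)[0]

    def update(self, index, score):
        """Blend the old weight with the score the heuristic just earned."""
        old = self.weights[index]
        self.weights[index] = (1 - self.reaction) * old + self.reaction * score


layer = AdaptiveLayer(n_heuristics=3)
chosen = layer.select()
layer.update(chosen, score=5.0)  # e.g. a reward for finding a better solution
```

The only signal this selector sees is the score history of each heuristic, which is the limitation the abstract attributes to the adaptive layer; a learned agent could additionally condition on features of the search state.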

    Simple hyper-heuristics control the neighbourhood size of randomised local search optimally for LeadingOnes

    Selection hyper-heuristics (HHs) are randomised search methodologies which choose and execute heuristics during the optimisation process from a set of low-level heuristics. A machine learning mechanism is generally used to decide which low-level heuristic should be applied in each decision step. In this paper we analyse whether sophisticated learning mechanisms are always necessary for HHs to perform well. To this end we consider the most simple HHs from the literature and rigorously analyse their performance for the LeadingOnes benchmark function. Our analysis shows that the standard Simple Random, Permutation, Greedy and Random Gradient HHs show no signs of learning. While the former HHs do not attempt to learn from the past performance of low-level heuristics, the idea behind the Random Gradient HH is to continue to exploit the currently selected heuristic as long as it is successful. Hence, it is embedded with a reinforcement learning mechanism with the shortest possible memory. However, the probability that a promising heuristic is successful in the next step is relatively low when perturbing a reasonable solution to a combinatorial optimisation problem. We generalise the `simple' Random Gradient HH so success can be measured over a fixed period of time τ, instead of a single iteration. For LeadingOnes we prove that the Generalised Random Gradient (GRG) HH can learn to adapt the neighbourhood size of Randomised Local Search to optimality during the run. As a result, we prove it has the best possible performance achievable with the low-level heuristics (Randomised Local Search with different neighbourhood sizes), up to lower order terms. We also prove that the performance of the HH improves as the number of low-level local search heuristics to choose from increases. In particular, with access to k low-level local search heuristics, it outperforms the best-possible algorithm using any subset of the k heuristics. 
Finally, we show that the advantages of GRG over Randomised Local Search and Evolutionary Algorithms using standard bit mutation increase if the anytime performance is considered (i.e., the performance gap is larger if approximate solutions are sought rather than exact ones). Experimental analyses confirm these results for different problem sizes (up to n = 10^8) and shed some light on the best choices for the parameter τ in various situations.
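
The Generalised Random Gradient idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: a low-level heuristic is drawn uniformly at random and kept for a period of `tau` steps, and the period restarts whenever a strict improvement is found. The function names, the `max_evals` budget and the strict-improvement acceptance rule are assumptions made for this example.

```python
import random


def leading_ones(x):
    """Number of consecutive 1-bits at the start of the bit string."""
    count = 0
    for bit in x:
        if bit != 1:
            break
        count += 1
    return count


def rls_k(x, k, rng):
    """Randomised Local Search step: flip exactly k distinct random bits."""
    y = list(x)
    for i in rng.sample(range(len(y)), k):
        y[i] ^= 1
    return y


def generalised_random_gradient(n, ks, tau, max_evals, seed=0):
    """Run the sketched hyper-heuristic on LeadingOnes with n bits.

    ks lists the neighbourhood sizes available as low-level heuristics.
    """
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    fx = leading_ones(x)
    evals = 0
    while fx < n and evals < max_evals:
        k = rng.choice(ks)  # pick a low-level heuristic uniformly at random
        steps_left = tau
        while steps_left > 0 and fx < n and evals < max_evals:
            y = rls_k(x, k, rng)
            fy = leading_ones(y)
            evals += 1
            steps_left -= 1
            if fy > fx:  # success: keep the heuristic and restart the period
                x, fx = y, fy
                steps_left = tau
    return x, fx, evals


best, fitness, used = generalised_random_gradient(
    n=20, ks=[1, 2], tau=40, max_evals=100_000
)
```

Setting `tau=1` recovers the behaviour the abstract attributes to the simple Random Gradient hyper-heuristic, where success is judged over a single iteration.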

    Deep Learning and Deep Reinforcement Learning for Graph Based Applications

    Deep learning has provided state-of-the-art performance in many applications such as computer vision, text analysis, and biology.
The success of deep learning has also helped with the emergence of deep reinforcement learning for optimal decision-making and has shown great promise, especially in optimization problems. Additionally, graphs as a mathematical representation for structured complex systems have proven to be a powerful tool for analysis and problem-solving that offer a fresh perspective on the formulation of the problem. Introducing graphs as an input modality for machine learning problems enables deep learning models to either utilize the structure of the graph in their representation learning scheme or optimize the graph structure for a downstream evaluation task. Doing so will also lead to model methods and pipelines that leverage the structural information provided by graphs to improve performance compared to traditional machine learning models. In this thesis, we introduce five different use-case applications, in the format of five research papers, that can be modeled as graphs and aim to provide novel models that address problems using deep graph representation learning and deep reinforcement learning models. Our main three application domains are bioinformatics, computer vision, and logistics. First, we aim to address two problems in the domain of bioinformatics. In Paper I, we address the issue of integration of continuous omics datasets with biological networks. We introduce an auto-encoder scheme focused on representation learning of node features in biological networks and showcase the application of the designed framework in a real-world example through the imputation of missing values in an example omics dataset. Paper II looks at utilizing graph representation learning for processing metabolic networks. 
In the proposed approach, we introduce a machine learning pipeline (from feature extraction to model architecture) based on graph neural networks and evaluate the pipeline on the task of gene essentiality prediction, which is a well-known application of metabolic pathway networks. The second application domain is computer vision, specifically the problem of human gesture recognition. In Paper III and the follow-up Paper IV, we introduce a gesture recognition system that is both faster and more accurate than state-of-the-art prediction of human subject gestures from mmWave radar-generated point clouds. We achieve this by modeling the input point cloud as a spatio-temporal graph and processing the created graph using the proposed graph representation learning technique. We further evaluate the system under different experimental conditions in terms of the angle of the subject with respect to the sensor, and propose an ensemble approach for mitigating the effect of changing the sensing angle on the performance of the model. The last application that we address is the use of deep reinforcement learning to optimize the structure of the graphs in combinatorial optimization problems in logistics. Paper V introduces a general problem-independent hyperheuristic that utilizes the decision-making capability of deep reinforcement learning using problem-independent state feature information. The proposed framework is trained on a general reward function to achieve state-of-the-art performance among popular solvers in the field of combinatorial optimization. We evaluate the performance of the proposed approach on three example routing problems as well as a scheduling problem to showcase the effectiveness of our method on different problems. (Doctoral thesis)

    Using learning from demonstration to enable automated flight control comparable with experienced human pilots

    Modern autopilots fall under the domain of Control Theory which utilizes Proportional Integral Derivative (PID) controllers that can provide relatively simple autonomous control of an aircraft such as maintaining a certain trajectory. However, PID controllers cannot cope with uncertainties due to their non-adaptive nature. In addition, modern autopilots of airliners contributed to several air catastrophes due to their robustness issues. Therefore, the aviation industry is seeking solutions that would enhance safety. A potential solution to achieve this is to develop intelligent autopilots that can learn how to pilot aircraft in a manner comparable with experienced human pilots. This work proposes the Intelligent Autopilot System (IAS) which provides a comprehensive level of autonomy and intelligent control to the aviation industry. The IAS learns piloting skills by observing experienced teachers while they provide demonstrations in simulation. A robust Learning from Demonstration approach is proposed which uses human pilots to demonstrate the task to be learned in a flight simulator while training datasets are captured. The datasets are then used by Artificial Neural Networks (ANNs) to generate control models automatically. The control models imitate the skills of the experienced pilots when performing the different piloting tasks while handling flight uncertainties such as severe weather conditions and emergency situations. Experiments show that the IAS performs learned skills and tasks with high accuracy even after being presented with limited examples which are suitable for the proposed approach that relies on many single-hidden-layer ANNs instead of one or few large deep ANNs which produce a black-box that cannot be explained to the aviation regulators. 
The results demonstrate that the IAS is capable of imitating low-level sub-cognitive skills such as rapid and continuous stabilization attempts in stormy weather conditions, and high-level strategic skills such as the sequence of sub-tasks necessary to take off, land, and handle emergencies.

    From metaheuristics to learnheuristics: Applications to logistics, finance, and computing

    A large number of decision-making processes in strategic sectors such as transport and production involve NP-hard problems, which are frequently characterized by high levels of uncertainty and dynamism. Metaheuristics have become the predominant method for solving challenging optimization problems in reasonable computing times. However, they frequently assume that inputs, objective functions and constraints are deterministic and known in advance. These strong assumptions force work on oversimplified problems, and the resulting solutions may perform poorly when implemented. Simheuristics integrate simulation into metaheuristics as a natural way to solve stochastic problems. In a similar fashion, learnheuristics combine statistical learning and metaheuristics to tackle problems in dynamic environments, where inputs may depend on the structure of the solution. The main contributions of this thesis include (i) a design for learnheuristics; (ii) a classification of works that hybridize statistical and machine learning with metaheuristics; and (iii) several applications in the fields of transport, production, finance and computing.