1,600 research outputs found

    Hyperparameter optimization: Foundations, algorithms, best practices, and open challenges

    Most machine learning algorithms are configured by a set of hyperparameters whose values must be carefully chosen and which often considerably impact performance. To avoid a time-consuming and irreproducible manual process of trial-and-error to find well-performing hyperparameter configurations, various automatic hyperparameter optimization (HPO) methods (for example, based on resampling error estimation for supervised machine learning) can be employed. After introducing HPO from a general perspective, this paper reviews important HPO methods, from simple techniques such as grid or random search to more advanced methods like evolution strategies, Bayesian optimization, Hyperband, and racing. This work gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with machine learning pipelines, runtime improvements, and parallelization. This article is categorized under: Algorithmic Development > Statistics; Technologies > Machine Learning; Technologies > Prediction
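
    As a minimal, self-contained illustration of the simplest HPO method the paper covers, the sketch below runs pure random search over a two-hyperparameter space. The parameter names, ranges, and toy loss are assumptions for illustration only, and evaluate() stands in for a real resampling-based error estimate such as cross-validation.

```python
import random

# Hypothetical search space for an SVM-like learner; the names and
# log-uniform ranges are assumptions, not taken from the article.
space = {
    "C":     lambda: 10 ** random.uniform(-3, 3),
    "gamma": lambda: 10 ** random.uniform(-4, 1),
}

def evaluate(config):
    """Stand-in for resampling-based error estimation (e.g. cross-validation)."""
    return (config["C"] - 1.0) ** 2 + (config["gamma"] - 0.1) ** 2  # toy loss

def random_search(n_trials=50):
    best_cfg, best_loss = None, float("inf")
    for _ in range(n_trials):
        cfg = {name: draw() for name, draw in space.items()}
        loss = evaluate(cfg)
        if loss < best_loss:
            best_cfg, best_loss = cfg, loss
    return best_cfg, best_loss

print(random_search())
```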

    Statistical Inference for Partially Observed Markov Processes via the R Package pomp

    Partially observed Markov process (POMP) models, also known as hidden Markov models or state space models, are ubiquitous tools for time series analysis. The R package pomp provides a very flexible framework for Monte Carlo statistical investigations using nonlinear, non-Gaussian POMP models. A range of modern statistical methods for POMP models have been implemented in this framework, including sequential Monte Carlo, iterated filtering, particle Markov chain Monte Carlo, approximate Bayesian computation, maximum synthetic likelihood estimation, nonlinear forecasting, and trajectory matching. In this paper, we demonstrate the application of these methodologies using some simple toy problems. We also illustrate the specification of more complex POMP models, using a nonlinear epidemiological model with a discrete population, seasonality, and extra-demographic stochasticity. We discuss the specification of user-defined models and the development of additional methods within the programming environment provided by pomp. (In press at the Journal of Statistical Software; a version of this paper is provided at the pomp package website: http://kingaa.github.io/pom)
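
    pomp itself is an R package, so the sketch below does not reproduce its API. It is a generic Python illustration of the bootstrap particle filter (the sequential Monte Carlo method mentioned above) on an assumed linear-Gaussian toy model; all model parameters are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy state space model (an assumption for illustration): latent random
# walk x_t = x_{t-1} + eps_t observed with additive Gaussian noise.
T, N = 50, 500                                   # time steps, particles
true_x = np.cumsum(rng.normal(size=T))           # latent trajectory
obs = true_x + rng.normal(scale=0.5, size=T)     # noisy observations

particles = rng.normal(size=N)
for t in range(T):
    particles = particles + rng.normal(size=N)            # propagate through the process model
    logw = -0.5 * ((obs[t] - particles) / 0.5) ** 2       # Gaussian measurement density
    w = np.exp(logw - logw.max())
    w /= w.sum()
    particles = particles[rng.choice(N, size=N, p=w)]     # multinomial resampling
    if t % 10 == 0:
        print(f"t={t:2d}  filter mean={particles.mean():+.2f}  truth={true_x[t]:+.2f}")
```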

    Evolution strategies for robust optimization

    Real-world (black-box) optimization problems often involve various types of uncertainties and noise emerging in different parts of the optimization problem. When this is not accounted for, optimization may fail or may yield solutions that are optimal in the classical strict notion of optimality, but fail in practice. Robust optimization is the practice of optimization that actively accounts for uncertainties and/or noise. Evolutionary Algorithms form a class of optimization algorithms that use the principle of evolution to find good solutions to optimization problems. Because uncertainty and noise are indispensable parts of nature, this class of optimization algorithms seems a logical choice for robust optimization scenarios. This thesis provides a clear definition of the term robust optimization, a comparison of approaches, and practical guidelines on how Evolution Strategies, a subclass of Evolutionary Algorithms for real-parameter optimization problems, should be adapted for such scenarios.
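
    A minimal sketch of one common way to adapt an Evolution Strategy to a robust scenario: replace the raw objective with an effective fitness averaged over sampled input disturbances. The (mu, lambda) scheme, the disturbance model, and all constants below are assumptions for illustration, not the thesis's guidelines.

```python
import numpy as np

rng = np.random.default_rng(1)

def f(x):
    """Toy objective (sphere); the disturbance model below is an assumption."""
    return float(np.sum(x ** 2))

def robust_fitness(x, m=10, delta=0.3):
    """Effective fitness: average f over m sampled input disturbances."""
    return np.mean([f(x + rng.uniform(-delta, delta, size=x.size)) for _ in range(m)])

def evolution_strategy(dim=5, mu=5, lam=20, sigma=0.5, generations=100):
    parents = rng.normal(size=(mu, dim))
    for _ in range(generations):
        # Each offspring mutates a randomly chosen parent.
        offspring = parents[rng.integers(mu, size=lam)] + sigma * rng.normal(size=(lam, dim))
        fitness = np.array([robust_fitness(x) for x in offspring])
        parents = offspring[np.argsort(fitness)[:mu]]       # (mu, lambda) selection
        sigma *= 0.99                                       # crude deterministic step-size decay
    return parents[0]

print(evolution_strategy())
```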

    Evolutionary computing and particle filtering: a hardware-based motion estimation system

    Particle filters are a highly powerful estimation tool, especially when dealing with non-linear, non-Gaussian systems. However, traditional approaches present several limitations, which significantly reduce their performance. Evolutionary algorithms, and more specifically their optimization capabilities, may be used to overcome particle-filtering weaknesses. In this paper, a novel FPGA-based particle filter that takes advantage of evolutionary computation to estimate motion patterns is presented. The evolutionary algorithm, which has been included inside the resampling stage, mitigates the sample impoverishment phenomenon, very common in particle-filtering systems. In addition, a hybrid mutation technique using two different mutation operators, each with a specific purpose, is proposed to enhance estimation results and make the system more robust. Moreover, implementing the proposed Evolutionary Particle Filter as a hardware accelerator has led to faster processing times than several software implementations of the same algorithm.
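
    The sketch below illustrates the general idea of injecting mutation into the resampling stage to counter sample impoverishment. The two operators (a small local jitter for exploitation, a larger one for exploration), their probabilities, and their scales are assumptions; the paper's FPGA design is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(2)

def resample_with_mutation(particles, weights, p_local=0.8,
                           sigma_local=0.05, sigma_global=0.5):
    """Multinomial resampling followed by a hybrid mutation step that
    re-injects diversity lost to sample impoverishment."""
    n = len(particles)
    out = particles[rng.choice(n, size=n, p=weights)].copy()   # classical resampling
    local = rng.random(n) < p_local
    out[local] += rng.normal(scale=sigma_local, size=local.sum())       # exploitation
    out[~local] += rng.normal(scale=sigma_global, size=(~local).sum())  # exploration
    return out

particles = rng.normal(size=100)
weights = np.ones(100) / 100          # uniform weights, just for the demo
print(resample_with_mutation(particles, weights)[:5])
```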

    A multi-objective evolutionary approach to simulation-based optimisation of real-world problems.

    This thesis presents a novel evolutionary optimisation algorithm that can improve the quality of solutions in simulation-based optimisation. Simulation-based optimisation is the process of finding optimal parameter settings without explicitly examining each possible configuration of settings. An optimisation algorithm generates potential configurations and sends these to the simulation, which acts as an evaluation function. The evaluation results are used to refine the optimisation such that it eventually returns a high-quality solution. The algorithm described in this thesis integrates multi-objective optimisation, parallelism, surrogate usage, and noise handling in a unique way to deal with the difficulties that these characteristics incur in simulation-based optimisation problems. In order to handle multiple, conflicting optimisation objectives, the algorithm uses a Pareto approach in which the set of best trade-off solutions is searched for and presented to the user. The algorithm supports a high degree of parallelism by adopting an asynchronous master-slave parallelisation model in combination with an incremental population refinement strategy. A surrogate evaluation function is adopted in the algorithm to quickly identify promising candidate solutions and filter out poor ones. A novel technique based on inheritance is used to compensate for the uncertainties associated with the approximative surrogate evaluations. Furthermore, a novel technique for multi-objective problems that effectively reduces noise by adopting a dynamic procedure for resampling solutions is used to tackle the problem of real-world unpredictability (noise). The proposed algorithm is evaluated on benchmark problems and two complex real-world problems of manufacturing optimisation. The first real-world problem concerns the optimisation of a production cell at Volvo Aero, while the second concerns the optimisation of a camshaft machining line at Volvo Cars Engine. The results from the optimisations show that the algorithm finds better solutions for all the problems considered than existing, similar algorithms. The new techniques for dealing with surrogate imprecision and noise are identified as key reasons for the good performance.
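
    At the core of any Pareto approach is the domination test used to maintain the set of best trade-off solutions. A minimal sketch, with made-up bi-objective values (minimization), is shown below.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Return the non-dominated subset of a list of objective vectors."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Toy bi-objective evaluations, e.g. (cycle time, cost); the values are made up.
evals = [(1.0, 9.0), (2.0, 7.0), (3.0, 8.0), (4.0, 4.0), (6.0, 3.0), (5.0, 5.0)]
print(pareto_front(evals))   # -> [(1.0, 9.0), (2.0, 7.0), (4.0, 4.0), (6.0, 3.0)]
```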

    A Bayesian approach to constrained single- and multi-objective optimization

    This article addresses the problem of derivative-free (single- or multi-objective) optimization subject to multiple inequality constraints. Both the objective and constraint functions are assumed to be smooth, non-linear and expensive to evaluate. As a consequence, the number of evaluations that can be used to carry out the optimization is very limited, as in complex industrial design optimization problems. The method we propose to overcome this difficulty has its roots in both the Bayesian and the multi-objective optimization literatures. More specifically, an extended domination rule is used to handle objectives and constraints in a unified way, and a corresponding expected hyper-volume improvement sampling criterion is proposed. This new criterion is naturally adapted to the search for a feasible point when none is available, and reduces to existing Bayesian sampling criteria (the classical Expected Improvement (EI) criterion and some of its constrained/multi-objective extensions) as soon as at least one feasible point is available. The calculation and optimization of the criterion are performed using Sequential Monte Carlo techniques. In particular, an algorithm similar to the subset simulation method, which is well known in the field of structural reliability, is used to estimate the criterion. The method, which we call BMOO (for Bayesian Multi-Objective Optimization), is compared to state-of-the-art algorithms for single- and multi-objective constrained optimization.
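
    The sketch below gives a simplified reading of an extended domination rule: constraint violations max(c_i, 0) are always compared, while objective values only count once a point is feasible. The exact mapping is an assumption in the spirit of the article, not its precise formulation.

```python
import math

def extended(objectives, constraints):
    """Map (objectives, constraints<=0) into an extended space: objective
    values count only when the point is feasible; violations max(c, 0)
    are always compared. A simplified assumption, not the article's rule."""
    feasible = all(c <= 0 for c in constraints)
    obj_part = list(objectives) if feasible else [math.inf] * len(objectives)
    return obj_part + [max(c, 0.0) for c in constraints]

def dominates(a, b):
    """Ordinary Pareto domination (minimization) in the extended space."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

p = extended([1.0, 2.0], [-0.5])   # feasible point
q = extended([0.5, 1.0], [0.3])    # better objectives, but infeasible
print(dominates(p, q))             # True: feasibility wins under the extended rule
```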

    TSE-IDS: A Two-Stage Classifier Ensemble for Intelligent Anomaly-based Intrusion Detection System

    Intrusion detection systems (IDS) play a pivotal role in computer security by discovering and repelling malicious activities in computer networks. Anomaly-based IDS, in particular, rely on classification models trained on historical data to discover such malicious activities. In this paper, an improved IDS based on hybrid feature selection and two-level classifier ensembles is proposed. A hybrid feature selection technique comprising three methods, i.e. particle swarm optimization, an ant colony algorithm, and a genetic algorithm, is utilized to reduce the feature size of the training datasets (NSL-KDD and UNSW-NB15 are considered in this paper). Features are selected based on the classification performance of a reduced error pruning tree (REPT) classifier. Then, a two-level classifier ensemble based on two meta-learners, i.e., rotation forest and bagging, is proposed. On the NSL-KDD dataset, the proposed classifier shows 85.8% accuracy, 86.8% sensitivity, and 88.0% detection rate, figures that remarkably outperform other classification techniques recently proposed in the literature. Results on the UNSW-NB15 dataset likewise improve on those achieved by several state-of-the-art techniques. Finally, to verify the results, a two-step statistical significance test is conducted. This step is rarely considered in IDS research and therefore adds value to the experimental results achieved by the proposed classifier.
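
    A minimal two-level ensemble sketch in scikit-learn. Rotation forest and REPT have no stock scikit-learn implementations, so RandomForestClassifier and a bagged DecisionTreeClassifier stand in for them, stacking replaces the paper's exact combination scheme, and synthetic data replaces NSL-KDD/UNSW-NB15.

```python
# Synthetic stand-in for NSL-KDD/UNSW-NB15; rotation forest and REPT are
# approximated by stock scikit-learn estimators (an assumption).
from sklearn.datasets import make_classification
from sklearn.ensemble import (BaggingClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=40,
                           n_informative=15, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

level_one = [
    ("bagged_trees", BaggingClassifier(DecisionTreeClassifier(),
                                       n_estimators=50, random_state=0)),
    ("forest", RandomForestClassifier(n_estimators=50, random_state=0)),
]
ensemble = StackingClassifier(estimators=level_one,
                              final_estimator=LogisticRegression())
ensemble.fit(X_tr, y_tr)
print(f"held-out accuracy: {ensemble.score(X_te, y_te):.3f}")
```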

    Evolutionary improvement of programs

    Most applications of genetic programming (GP) involve the creation of an entirely new function, program or expression to solve a specific problem. In this paper, we propose a new approach that applies GP to improve existing software by optimizing its non-functional properties such as execution time, memory usage, or power consumption. In general, satisfying non-functional requirements is a difficult task, often achieved in part by optimizing compilers. However, modern compilers are in general not always able to produce semantically equivalent alternatives that optimize non-functional properties, even if such alternatives are known to exist: this is usually due to the limited, local nature of such optimizations. In this paper, we discuss how best to combine and extend the existing evolutionary methods of GP, multiobjective optimization, and coevolution in order to improve existing software. Given as input the implementation of a function, we attempt to evolve a semantically equivalent version, in this case optimized to reduce execution time subject to a given probability distribution of inputs. We demonstrate on eight example functions that our framework is able to produce non-obvious optimizations that compilers are not yet able to generate. We employ a coevolved population of test cases to encourage the preservation of the function's semantics. We exploit the original program both through seeding of the population in order to focus the search, and as an oracle for testing purposes. As well as discussing the issues that arise when attempting to improve software, we employ a rigorous experimental method to provide interesting and practical insights and to suggest how to address these issues.
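
    A deliberately tiny sketch of the core idea: preserve a program's semantics (checked against test cases) while optimizing a non-functional proxy, here expression length standing in for execution time. The seed function, the variant pool, and the fitness weights are assumptions; real GP would use tree-based mutation and crossover and a co-evolving test population rather than a fixed one.

```python
import random

random.seed(0)

# Seed implementation to improve: an illustrative toy, not from the paper.
def seed(x):
    return x * 2 + x * 2   # semantically equal to x * 4, but does extra work

TESTS = [(x, seed(x)) for x in range(-5, 6)]   # fixed tests stand in for the coevolved population

# Hand-written variant pool standing in for GP's tree-based mutation/crossover.
VARIANTS = ["x", "x + x", "x * 2", "x * 4", "x * 2 + x * 2", "x + x + x + x"]

def fitness(expr):
    """Lower is better: semantic errors dominate, then expression length
    serves as a crude stand-in for execution time."""
    try:
        errors = sum(eval(expr, {"x": x}) != want for x, want in TESTS)
    except Exception:
        return float("inf")
    return errors * 1000 + len(expr)

population = ["x * 2 + x * 2"] * 10            # seeded with the original program
for _ in range(30):
    population += [random.choice(VARIANTS) for _ in range(10)]
    population = sorted(population, key=fitness)[:10]      # truncation selection

print(population[0], "->", fitness(population[0]))
```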