3,828 research outputs found

    Auto-WEKA: Combined Selection and Hyperparameter Optimization of Classification Algorithms

    Many different machine learning algorithms exist; taking into account each algorithm's hyperparameters, there is a staggeringly large number of possible alternatives overall. We consider the problem of simultaneously selecting a learning algorithm and setting its hyperparameters, going beyond previous work that addresses these issues in isolation. We show that this problem can be addressed by a fully automated approach, leveraging recent innovations in Bayesian optimization. Specifically, we consider a wide range of feature selection techniques (combining 3 search and 8 evaluator methods) and all classification approaches implemented in WEKA, spanning 2 ensemble methods, 10 meta-methods, 27 base classifiers, and hyperparameter settings for each classifier. On each of 21 popular datasets from the UCI repository, the KDD Cup 09, variants of the MNIST dataset and CIFAR-10, we show classification performance often much better than using standard selection/hyperparameter optimization methods. We hope that our approach will help non-expert users to more effectively identify machine learning algorithms and hyperparameter settings appropriate to their applications, and hence to achieve improved performance. Comment: 9 pages, 3 figures
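
The combined selection problem described in this abstract can be illustrated with a small sketch. It treats "which algorithm" and "which hyperparameter value" as one joint search space. This is not Auto-WEKA and does not use WEKA or Bayesian optimization: plain random search is swapped in for simplicity, and the two toy classifiers, the synthetic data, and every name below are assumptions made for illustration only.

```python
import random

def make_data(n, rng):
    """Two noisy 1-D classes: class 0 centred at 0.0, class 1 centred at 2.0."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        data.append((rng.gauss(2.0 * label, 1.0), label))
    return data

def knn_predict(train, x, k):
    """Majority vote among the k nearest training points (k is odd, so no ties)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0

def threshold_predict(train, x, t):
    """Predict class 1 when x exceeds the threshold t (train is unused)."""
    return 1 if x > t else 0

# Joint search space (hypothetical): each entry pairs an algorithm with a
# sampler for that algorithm's own hyperparameter.
SEARCH_SPACE = {
    "knn": (knn_predict, lambda rng: rng.choice([1, 3, 5, 7, 9])),
    "threshold": (threshold_predict, lambda rng: rng.uniform(-1.0, 3.0)),
}

def accuracy(predict, hp, train, valid):
    """Fraction of validation points classified correctly."""
    hits = sum(predict(train, x, hp) == y for x, y in valid)
    return hits / len(valid)

def random_search(train, valid, n_trials=30, seed=0):
    """Sample (algorithm, hyperparameter) pairs jointly and keep the best one."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        name = rng.choice(sorted(SEARCH_SPACE))   # pick the algorithm
        predict, sample = SEARCH_SPACE[name]
        hp = sample(rng)                          # pick its hyperparameter
        score = accuracy(predict, hp, train, valid)
        if best is None or score > best[0]:
            best = (score, name, hp)
    return best

data_rng = random.Random(42)
train, valid = make_data(200, data_rng), make_data(100, data_rng)
best_score, best_algo, best_hp = random_search(train, valid)
print(best_algo, best_hp, round(best_score, 3))
```

A model-based optimizer would replace the uniform sampling in `random_search` with proposals informed by earlier trials; the joint structure of the search space stays the same.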

    New strategies for the aerodynamic design optimization of aeronautical configurations through soft-computing techniques

    Extraordinary Doctoral Award of the UAH, 2013. Lozano Rodríguez, Carlos, co-supervisor.
    This thesis deals with the improvement of the optimization process in the aerodynamic design of aeronautical configurations. This topic is of great importance today because it allows the European aeronautical industry to reduce its development and operational costs, decrease the time-to-market for new aircraft, improve the quality of its products, and therefore maintain its competitiveness. Within this thesis, a study of the state of the art of aerodynamic optimization tools has been performed, and several contributions are proposed at different levels:
    - One of the main drawbacks for industrial application of aerodynamic optimization tools is their huge demand for computational resources. In particular, for complex optimization problems, current methodological approaches would need more than a year to obtain an optimized aircraft. For this reason, one contribution of this work focuses on reducing the computational cost through techniques such as surrogate modelling and control theory, as well as more software-related techniques such as code optimization and proper domain parallelization, all with the goal of decreasing the cost of the aerodynamic design process.
    - Another contribution treats the design process as a global optimization problem and, more specifically, uses evolutionary algorithms (EAs) to perform a preliminary broad exploration of the design space, owing to their ability to obtain global optima. To this end, EAs have been hybridized with metamodels (or surrogate models) in order to substitute for expensive CFD simulations. An innovative approach for the global aerodynamic optimization of aeronautical configurations is proposed, consisting of an Evolutionary Programming algorithm hybridized with a Support Vector regression algorithm (SVMr) as a metamodel. Specific issues such as precision, training dataset size, geometry parameterization sensitivity, and techniques for design of experiments are discussed, and the potential of the proposed approach to achieve innovative shapes that would not be achieved with traditional methods is assessed.
    - Then, after a broad exploration of the design space, the optimization process is continued with local gradient-based optimization techniques for a finer improvement of the geometry. Here, an automated optimization framework is presented to address aerodynamic shape design problems. Key aspects of this framework include the use of the adjoint methodology, which makes the computational requirements independent of the number of design variables, and Computer Aided Design (CAD)-based shape parameterization, which uses the flexibility of Non-Uniform Rational B-Splines (NURBS) to handle complex configurations. This approach is applied to the optimization of several test cases, and an assessment of the improvements offered by the proposed strategy and of its ability to achieve efficient shapes completes the study.
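
The idea of hybridizing an evolutionary algorithm with a metamodel can be sketched as follows. This is not the thesis's implementation: a k-nearest-neighbour average stands in for the Support Vector regression metamodel, a quadratic function stands in for a CFD run, and all function names and parameter values are hypothetical. The sketch only shows the general pattern of letting a cheap surrogate pre-screen candidates so that few of them need the expensive evaluation.

```python
import random

def expensive_objective(x):
    """Stand-in for a costly CFD run: a 2-D quadratic bowl with minimum at (1, -2)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2

def surrogate_predict(archive, x, k=3):
    """Cheap estimate: mean objective of the k nearest already-evaluated designs.
    (Assumption: replaces the SVR metamodel named in the abstract.)"""
    def dist2(entry):
        return sum((a - b) ** 2 for a, b in zip(entry[0], x))
    nearest = sorted(archive, key=dist2)[:k]
    return sum(value for _, value in nearest) / len(nearest)

def optimize(generations=40, pop_size=6, offspring_per_parent=5,
             true_evals_per_gen=6, sigma=0.5, seed=1):
    rng = random.Random(seed)
    # Initial design of experiments: random points, all truly evaluated.
    archive = []
    for _ in range(pop_size):
        x = (rng.uniform(-5, 5), rng.uniform(-5, 5))
        archive.append((x, expensive_objective(x)))
    population = list(archive)
    for _ in range(generations):
        # Mutation-only variation, in the spirit of evolutionary programming.
        candidates = [
            tuple(g + rng.gauss(0.0, sigma) for g in parent)
            for parent, _value in population
            for _i in range(offspring_per_parent)
        ]
        # Rank all candidates with the surrogate, then spend true
        # evaluations only on the most promising ones.
        candidates.sort(key=lambda c: surrogate_predict(archive, c))
        evaluated = [(c, expensive_objective(c))
                     for c in candidates[:true_evals_per_gen]]
        archive.extend(evaluated)
        # Elitist survivor selection on true objective values.
        population = sorted(population + evaluated, key=lambda e: e[1])[:pop_size]
    return population[0], len(archive)

(best_x, best_value), n_true_evals = optimize()
print(best_x, best_value, n_true_evals)
```

In each generation 30 candidates are generated but only 6 receive a true evaluation, which is where the saving would come from when each evaluation is a long simulation.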

    Image Outlier filtering (IOF): A Machine learning based DWT optimization Approach

    In this paper, an image outlier filtering technique is introduced: a hybrid model called SVM regression based DWT optimization. Outlier filtering of RGB images uses a DWT model, the Optimal-HAAR wavelet changeover (OHC), which is optimized by the Least Square Support Vector Machine (LS-SVM). The LS-SVM regression predicts hyper coefficients obtained by using the QPSO model. The mathematical models are discussed in brief in this paper: (i) OHC, which results in better performance and reduces the complexity, resulting in (Optimized FHT); (ii) QPSO, in which the least good particle is replaced with the new best obtained particle, resulting in "Optimized Least Significant Particle based QPSO" (OLSP-QPSO). On comparing the proposed cross model of optimizing DWT by LS-SVM to perform outlier filtering with linear and nonlinear noise removal standards

    Non-linear Machine Learning with Active Sampling for MOX Drift Compensation

    Metal oxide (MOX) gas detectors based on SnO2 provide low-cost solutions for real-time sensing of complex gas mixtures for indoor ambient monitoring. Although highly sensitive under ideal conditions, MOX detectors may have poor long-term response accuracy due to environmental factors (humidity and temperature) along with sensor aging, leading to calibration drifts. Finding a simple and efficient solution to correct such calibration drifts has been the subject of numerous studies but remains an open problem. In this work, we present an efficient approach to MOX calibration using active and transfer sampling techniques coupled with non-linear machine learning algorithms, namely neural networks, extreme gradient boosting (XGBoost) and radial kernel support vector machines (SVM). Applied to the UCI’s HT detectors dataset, the study evaluates methods for active sampling, assesses suitable neural network architectures, and compares the performance of neural networks, XGBoost and radial kernel SVM in classifying gas mixtures (banana and wine odours, clean air) in the presence of humidity and temperature changes. The results show high classification accuracy levels (above 90%) and confirm that active sampling can provide a suitable solution. Index Terms: Neural Networks, Extreme Gradient Boosting, XGBoost, Support Vector Machines, Non-Linear Learning Methods, Machine Learning

    Novel MLR-RF-Based Geospatial Techniques: A Comparison with OK

    Geostatistical estimation methods rely on experimental variograms that are mostly erratic, leading to subjective model fitting, and assume a normal distribution during conditional simulations. In contrast, Machine Learning Algorithms (MLA) (1) are free of such limitations and (2) can incorporate information from multiple sources, and are therefore attracting increasing interest for real-time resource estimation and automation. However, MLAs need to be explored for robust learning of phenomena, better accuracy, and computational efficiency. This paper compares MLAs, i.e., Multiple Linear Regression (MLR) and Random Forest (RF), with Ordinary Kriging (OK). The techniques were applied to the publicly available Walker Lake dataset, while the exhaustive Walker Lake dataset was used for validation. The results of MLR were significant (p < 10 × 10⁻⁵), with correlation coefficients of 0.81 (R-square = 0.65) compared to 0.79 (R-square = 0.62) from the RF and OK methods. Additionally, MLR was automated (free from an intermediary step of variogram modelling as in OK), produced unbiased estimates, identified key samples representing different zones, and had higher computational efficiency

    Automatic machine learning: methods, systems, challenges

    Machine Learning with Time Series: A Taxonomy of Learning Tasks, Development of a Unified Framework, and Comparative Benchmarking of Algorithms

    Time series data is ubiquitous in real-world applications. Such data gives rise to distinct but closely related learning tasks (e.g. time series classification, regression or forecasting). In contrast to the more traditional cross-sectional setting, these tasks are often not fully formalized. As a result, different tasks can become conflated under the same name, algorithms are often applied to the wrong task, and performance estimates are potentially unreliable. In practice, software frameworks such as scikit-learn have become essential tools for data science. However, most existing frameworks focus on cross-sectional data. To our knowledge, no comparable frameworks exist for temporal data. Moreover, despite the importance of these frameworks, their design principles have never been fully understood. Instead, discussions often concentrate on usage and features, while almost completely ignoring design. To address these issues, we develop in this thesis (i) a formal taxonomy of learning tasks, (ii) novel design principles for ML toolboxes and (iii) a new unified framework for ML with time series. The framework has been implemented in an open-source Python package called sktime. The design principles are derived from existing state-of-the-art toolboxes and classical software design practices, using a domain-driven approach and a novel scientific type system. We show that these principles can not only explain key aspects of existing frameworks, but also guide the development of new ones like sktime. Finally, we use sktime to reproduce and extend the M4 competition, one of the major comparative benchmarking studies for forecasting. Reproducing the competition allows us to verify the published results and illustrate sktime’s effectiveness. Extending the competition enables us to explore the potential of previously unstudied ML models. We find that, on a subset of the M4 data, simple ML models implemented in sktime can match the state-of-the-art performance of the hand-crafted M4 winner models