Tree ensemble kernels for Bayesian optimization with known constraints over mixed-feature spaces
Tree ensembles can be well-suited for black-box optimization tasks such as
algorithm tuning and neural architecture search, as they achieve good
predictive performance with little or no manual tuning, naturally handle
discrete feature spaces, and are relatively insensitive to outliers in the
training data. Two well-known challenges in using tree ensembles for black-box
optimization are (i) effectively quantifying model uncertainty for exploration
and (ii) optimizing over the piece-wise constant acquisition function. To
address both points simultaneously, we propose using the kernel interpretation
of tree ensembles as a Gaussian Process prior to obtain model variance
estimates, and we develop a compatible optimization formulation for the
acquisition function. The latter further allows us to seamlessly integrate
known constraints to improve sampling efficiency by considering
domain-knowledge in engineering settings and modeling search space symmetries,
e.g., hierarchical relationships in neural architecture search. Our framework
performs as well as state-of-the-art methods for unconstrained black-box
optimization over continuous/discrete features and outperforms competing
methods for problems combining mixed-variable feature spaces and known input
constraints. (Comment: 27 pages, 9 figures, 4 tables)
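The kernel interpretation described above can be illustrated concretely: counting, for each pair of inputs, the fraction of trees that route both into the same leaf yields a symmetric similarity matrix usable as a GP covariance. A minimal sketch with scikit-learn (an illustration of the idea, not the paper's implementation):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Illustrative sketch: the tree-ensemble kernel is the fraction of trees
# in which two points fall into the same leaf.
X, y = make_regression(n_samples=50, n_features=4, random_state=0)
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

leaves = forest.apply(X)  # shape (n_samples, n_trees): leaf index per tree
K = (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)

# K is symmetric with ones on the diagonal, so (up to scaling) it can act
# as a GP prior covariance for model-variance estimates.
print(K.shape)
```

Because every point shares all its leaves with itself, the diagonal is exactly 1, and similarity decays as points are separated by more tree splits.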
Ensemble based on randomised neural networks for online data stream regression in presence of concept drift
The big data paradigm has posed new challenges for machine learning algorithms, such as analysing continuous flows of data in the form of data streams, and dealing with the evolving nature of the data, which causes a phenomenon often referred to in the literature as concept drift. Concept drift arises from inconsistencies between the optimal hypotheses in two subsequent chunks of data, whereby the concept underlying a given process evolves over time; this can happen due to several factors, including changes in consumer preferences, economic dynamics, or environmental conditions.

This thesis explores the problem of data stream regression in the presence of concept drift. The problem requires computationally efficient algorithms that can adapt to the various types of drift that may affect the data. Developing effective algorithms for data streams with concept drift requires several steps, which are discussed in this research. The first concerns the datasets required to assess the algorithms: in general, it is not possible to determine the occurrence of concept drift in real-world datasets, so synthetic datasets in which the various types of concept drift can be simulated are required. The second concerns the choice of algorithm. Ensemble algorithms show many advantages for dealing with concept-drifting data streams, including flexibility, computational efficiency, and high accuracy. For the design of an effective ensemble, this research analyses the use of randomised neural networks as base models, along with their optimisation. Optimising randomised neural networks involves designing and tuning hyperparameters that may substantially affect their performance, and optimising the base models is an important aspect of building highly accurate and computationally efficient ensembles.
To cope with concept drift, existing methods either require setting fixed updating points, which may result in unnecessary computations or slow reaction to drift, or rely on a drift-detection mechanism, which may be ineffective because drift is difficult to detect in real applications. The research contributions of this thesis therefore include: a new approach for synthetic dataset generation; a new hyperparameter optimisation algorithm that reduces the search effort and the need for prior assumptions compared to existing methods; an analysis of the effects of the randomised neural networks' hyperparameters; and a new ensemble algorithm, based on a bagging meta-model, that reduces the computational effort over existing methods and uses an innovative updating mechanism to cope with concept drift. The algorithms have been tested on synthetic datasets and validated on four real-world datasets from various application domains.
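The randomised neural networks discussed above train only their output layer, which is what keeps them cheap enough to serve as streaming ensemble base models. A minimal random-vector-functional-link-style sketch (the hidden size, activation, and ridge regulariser here are illustrative assumptions, not the thesis's exact design):

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_rvfl(X, y, n_hidden=50, reg=1e-3, rng=rng):
    """Randomised NN: fixed random hidden layer, closed-form output weights."""
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)  # random hidden features, never trained
    # Ridge-regularised least squares fits only the output weights.
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ y)
    return W, b, beta

def predict_rvfl(model, X):
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta

# Toy stream chunk: fitting is a single linear solve, so retraining a base
# model on a new chunk of the stream is inexpensive.
X = rng.standard_normal((200, 3))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
model = fit_rvfl(X, y)
print(np.mean((predict_rvfl(model, X) - y) ** 2))
```

Since only the output layer is solved for, refitting on each data chunk costs one small linear system, which is the property that makes such models attractive under drift.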
Evaluating an automated procedure of machine learning parameter tuning for software effort estimation
Software effort estimation requires accurate prediction models. Machine learning algorithms have been used to create more accurate estimation models. However, these algorithms are sensitive to factors such as the choice of hyper-parameters. To reduce this sensitivity, automated approaches for hyper-parameter tuning have been recently investigated. There is a need for further research on the effectiveness of such approaches in the context of software effort estimation. These evaluations could help understand which hyper-parameter settings can be adjusted to improve model accuracy, and in which specific contexts tuning can benefit model performance.
The goal of this work is to develop an automated procedure for machine learning hyper-parameter tuning in the context of software effort estimation. The automated procedure builds and evaluates software effort estimation models to determine the most accurate evaluation schemes. The methodology followed in this work consists of first performing a systematic mapping study to characterize existing hyper-parameter tuning approaches in software effort estimation, developing the procedure to automate the evaluation of hyper-parameter tuning, and conducting controlled quasi experiments to evaluate the automated procedure.
From the systematic literature mapping we discovered that the effort estimation literature has favored grid search. The results we obtained in our quasi-experiments demonstrated that fast, less exhaustive tuners are viable in place of grid search. These results indicate that randomly evaluating 60 hyper-parameter settings can be as good as grid search, and that multiple state-of-the-art tuners were more effective than this random search in only 6% of the evaluated dataset-model combinations. We endorse random search, genetic algorithms, FLASH, differential evolution, and tabu and harmony search as effective tuners.
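The 60-evaluation random search baseline referenced above can be sketched as follows (an illustrative setup with a scikit-learn regressor and hypothetical parameter ranges, not the study's actual effort-estimation pipeline):

```python
from scipy.stats import randint
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

# Sketch of the "evaluate 60 random hyper-parameter settings" baseline.
X, y = make_regression(n_samples=200, n_features=8, random_state=0)

search = RandomizedSearchCV(
    RandomForestRegressor(random_state=0),
    param_distributions={
        "n_estimators": randint(10, 100),
        "max_depth": randint(2, 12),
        "min_samples_leaf": randint(1, 10),
    },
    n_iter=60,  # 60 random configurations, as in the comparison above
    cv=3,
    random_state=0,
).fit(X, y)

print(search.best_params_)
```

Unlike grid search, the cost here is fixed at 60 model evaluations regardless of how many hyper-parameters or candidate values are added, which is why such tuners scale better in practice.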
Interpretable Predictions of Tree-based Ensembles via Actionable Feature Tweaking
Machine-learned models are often described as "black boxes". In many
real-world applications however, models may have to sacrifice predictive power
in favour of human-interpretability. When this is the case, feature engineering
becomes a crucial task, which requires significant and time-consuming human
effort. Whilst some features are inherently static, representing properties
that cannot be influenced (e.g., the age of an individual), others capture
characteristics that could be adjusted (e.g., the daily amount of carbohydrates
taken). Nonetheless, once a model is learned from the data, each prediction it
makes on new instances is irreversible, since every instance is assumed to be a
static point located in the chosen feature space. There are many circumstances, however,
where it is important to understand (i) why a model outputs a certain
prediction on a given instance, (ii) which adjustable features of that instance
should be modified, and finally (iii) how to alter such a prediction when the
mutated instance is input back to the model. In this paper, we present a
technique that exploits the internals of a tree-based ensemble classifier to
offer recommendations for transforming true negative instances into positively
predicted ones. We demonstrate the validity of our approach using an online
advertising application. First, we design a Random Forest classifier that
effectively separates between two types of ads: low (negative) and high
(positive) quality ads (instances). Then, we introduce an algorithm that
provides recommendations that aim to transform a low quality ad (negative
instance) into a high quality one (positive instance). Finally, we evaluate our
approach on a subset of the active inventory of a large ad network, Yahoo
Gemini. (Comment: 10 pages, KDD 2017)
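A much-simplified stand-in conveys the idea of actionable feature tweaking: greedily search, per adjustable feature, for the smallest single-feature change that flips a negatively predicted instance to positive (the paper's actual algorithm instead exploits the ensemble's internal decision paths; this brute-force version is only a sketch):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=5, random_state=0)
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

def tweak(instance, model, adjustable, deltas=np.linspace(-3, 3, 61)):
    """Smallest single-feature change (over `adjustable`) that flips the
    model's prediction to the positive class, or None if no grid point works."""
    best = None
    for f in adjustable:
        for d in deltas:
            cand = instance.copy()
            cand[f] += d
            if model.predict(cand.reshape(1, -1))[0] == 1:
                if best is None or abs(d) < best[0]:
                    best = (abs(d), f, d)
    return best  # (cost, feature index, change) or None

# Pick a negatively predicted instance and look for a minimal tweak,
# restricting ourselves to the features deemed adjustable.
neg = X[clf.predict(X) == 0][0]
print(tweak(neg, clf, adjustable=[0, 1, 2]))
```

The "static" features from the abstract simply stay out of the `adjustable` list, so recommendations never ask to change a property that cannot be influenced.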
“Dust in the wind...”, deep learning application to wind energy time series forecasting
To balance electricity production and demand, different prediction techniques must be used extensively. Renewable energy, due to its intermittency, increases the complexity and uncertainty of forecasting, and the resulting accuracy affects all the players acting in electricity systems around the world, such as generators, distributors, retailers, and consumers. Wind forecasting can be approached in two major ways: using numerical weather prediction models, or working from pure time series input. Deep learning is emerging as a new method for wind energy prediction. This work develops several deep learning architectures and shows their performance when applied to wind time series. The models have been tested with the most extensive wind dataset available, the National Renewable Energy Laboratory (NREL) Wind Toolkit, a dataset with 126,692 wind points in North America. The architectures designed are based on different approaches: Multi-Layer Perceptron networks (MLP), Convolutional Neural Networks (CNN), and Recurrent Neural Networks (RNN). These architectures have been tested on predictions over a 12-hour-ahead horizon, with accuracy measured by the coefficient of determination (R²). Applying the models to wind sites evenly distributed across North America allows us to infer several conclusions about the relationships between methods, terrain, and forecasting complexity. The results show differences between the models and confirm the advantages of deep learning techniques for wind speed forecasting from wind time series data.
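The sliding-window formulation such models consume, with R² as the accuracy metric, can be sketched on a synthetic series (a hypothetical stand-in for the NREL data, using a small scikit-learn MLP rather than the paper's deep architectures):

```python
import numpy as np
from sklearn.metrics import r2_score
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for a wind-speed series (the study uses NREL data).
t = np.arange(3000)
series = np.sin(2 * np.pi * t / 144) + 0.2 * rng.standard_normal(t.size)

# Sliding windows: predict the value `horizon` steps ahead from `lag` past values.
lag, horizon = 24, 12
X = np.stack([series[i : i + lag] for i in range(len(series) - lag - horizon)])
y = series[lag + horizon :]

split = int(0.8 * len(X))  # chronological split: train on the past only
model = MLPRegressor(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
model.fit(X[:split], y[:split])
print(r2_score(y[split:], model.predict(X[split:])))
```

The chronological (rather than shuffled) train/test split matters for time series: evaluating on windows that follow the training data avoids leaking future values into the model.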
Dynamic Advisor-Based Ensemble (dynABE): Case study in stock trend prediction of critical metal companies
Stock trend prediction is a challenging task due to the market's noise, and
machine learning techniques have recently been successful in coping with this
challenge. In this research, we create a novel framework for stock prediction,
Dynamic Advisor-Based Ensemble (dynABE). dynABE explores domain-specific areas
based on the companies of interest, diversifies the feature set by creating
different "advisors" that each handles a different area, follows an effective
model ensemble procedure for each advisor, and combines the advisors together
in a second-level ensemble through an online update strategy we developed.
dynABE is able to adapt to price pattern changes of the market during the
active trading period robustly, without needing to retrain the entire model. We
test dynABE on three cobalt-related companies, and it achieves the best-case
misclassification error of 31.12% and an annualized absolute return of 359.55%
with zero maximum drawdown. dynABE also consistently outperforms the baseline
models of support vector machine, neural network, and random forest in all case
studies. (Comment: This is the latest version, published in PLOS ONE)
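A second-level online update of the kind described above can be illustrated with a multiplicative-weights combiner that shifts weight toward the advisors with the lowest recent loss (an exponentiated-gradient-style sketch; dynABE's actual update rule may differ):

```python
import numpy as np

rng = np.random.default_rng(0)

def online_combine(advisor_preds, truth, eta=1.0):
    """Combine advisor predictions online; weights shrink multiplicatively
    with each advisor's squared error, then renormalise."""
    n_steps, n_advisors = advisor_preds.shape
    w = np.full(n_advisors, 1.0 / n_advisors)
    combined = np.empty(n_steps)
    for t in range(n_steps):
        combined[t] = w @ advisor_preds[t]      # predict before seeing truth
        losses = (advisor_preds[t] - truth[t]) ** 2
        w *= np.exp(-eta * losses)              # penalise inaccurate advisors
        w /= w.sum()
    return combined, w

# One accurate advisor and one uninformative one: the combiner should
# concentrate weight on the accurate advisor without any retraining.
truth = rng.standard_normal(200)
good = truth + 0.1 * rng.standard_normal(200)
bad = rng.standard_normal(200)
combined, w = online_combine(np.column_stack([good, bad]), truth)
print(w)
```

Because the update touches only the combination weights, the ensemble can track changing price patterns during trading without retraining the underlying models, which mirrors the adaptation property claimed in the abstract.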