
    Meta-level learning for the effective reduction of model search space.

    The exponential growth in the volume, variety and velocity of data is raising the need for intelligent ways to extract useful patterns from it. Finding the mapping of learning methods that leads to optimized performance on a given task requires deep expert knowledge and extensive computational resources. Moreover, the numerous configurations of these learning algorithms add another level of complexity. This triggers the need for an intelligent recommendation engine that can advise the best learning algorithm and its configuration for a given task. The techniques commonly used by experts include trial and error and reliance on prior experience in the specific domain. These techniques sometimes work for less complex tasks that require thousands of parameters to be learned. However, state-of-the-art models, e.g. deep learning models, require well-tuned hyper-parameters to learn millions of parameters, which demands specialized skills and numerous computationally expensive and time-consuming trials. In that scenario, Meta-level learning is a potential solution that can recommend the most appropriate options efficiently and effectively regardless of the complexity of the data. However, Meta-learning poses several challenges, the most critical being model selection and hyper-parameter optimization. The goal of this research is to investigate the model selection and hyper-parameter optimization approaches of automatic machine learning in general and the challenges associated with them. In a machine learning pipeline there are several phases where Meta-learning can be used to facilitate the best recommendations, including 1) pre-processing steps, 2) learning algorithms or their combination, 3) adaptivity mechanism parameters, 4) recurring concept extraction, and 5) concept drift detection.
The scope of this research is limited to feature engineering for problem representation and to the learning strategy for recommending an algorithm and its hyper-parameters at the Meta-level. Three studies are conducted around two different approaches to automatic machine learning: model selection using Meta-learning, and hyper-parameter optimization. The first study evaluates the situations in which additional data from a different domain can improve the performance of a meta-learning system for time-series forecasting, with a focus on cross-domain Meta-knowledge transfer. Although the experiments revealed limited room for improvement over the overall best base-learner, the meta-learning approach turned out to be a safe choice, minimizing the risk of selecting the least appropriate base-learner: in only 2% of cases did meta-learning recommend the worst-performing base-learner. The second study proposes another efficient and accurate domain adaptation approach, using a different meta-learning approach. This study empirically confirms the intuition that there is a relationship between the similarity of two tasks and the depth of network that must be fine-tuned to achieve accuracy comparable with that of a model trained from scratch. However, the approach is limited to a single hyper-parameter, the network depth to fine-tune, chosen on the basis of task similarity. The final study expands the set of hyper-parameters while implicitly considering task similarity through the intrinsic dynamics of the training process. It presents a framework that automatically finds a good set of hyper-parameters resulting in reasonably good accuracy, by framing hyper-parameter selection and tuning within the reinforcement learning regime. The effectiveness of a recommended tuple can be tested very quickly rather than waiting for the network to converge.
This approach produces accuracy close to the state-of-the-art approach and is found to be 20% less computationally expensive than previous approaches. The methods proposed in these studies, which belong to different areas of automatic machine learning, have been thoroughly evaluated on a number of benchmark datasets, which confirmed their great potential.

    Cross-domain Meta-learning for Time-series Forecasting.

    There are many algorithms that can be used for the time-series forecasting problem, ranging from simple ones (e.g. Moving Average) to sophisticated Machine Learning approaches (e.g. Neural Networks). Most of these algorithms require a number of user-defined parameters to be specified, leading to an exponential explosion of the space of potential solutions. Since the trial-and-error approach to finding a good algorithm for a given problem is typically intractable, researchers and practitioners need to resort to a more intelligent search strategy, with one option being to constrain the search space using past experience, an approach known as Meta-learning. Although potentially attractive, Meta-learning comes with its own challenges. Gathering a sufficient number of Meta-examples, which in turn requires collecting and processing multiple datasets from each problem domain under consideration, is perhaps the most prominent issue. In this paper, we investigate the situations in which the use of additional data can improve the performance of a Meta-learning system, with a focus on cross-domain transfer of Meta-knowledge. A similarity-based cluster analysis of Meta-features has also been performed in an attempt to discover homogeneous groups of time series with respect to Meta-learning performance. Although the experiments revealed limited room for improvement over the overall best base-learner, the Meta-learning approach turned out to be a safe choice, minimizing the risk of selecting the least appropriate base-learner.

    Automatic design of artificial neural networks to forecast time series

    Proceedings of: III Simposio de Inteligencia Computacional, SICO 2010, Valencia, 8-10 September 2010. This work presents an approach to designing Artificial Neural Networks (ANN) to forecast time series. The approach is an automatic method in which an Evolutionary Algorithm, acting as a search algorithm, designs the ANN. A key issue for this kind of approach is what information is included in the chromosome that represents an ANN. There are two principal ideas about this question: first, the chromosome contains information about parameters of the topology, architecture, learning parameters, etc. of the ANN. The results of using a parameter encoding scheme to design ANN for a time series competition are shown.
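
A parameter encoding of the kind mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' scheme: the three genes (number of input lags, number of hidden neurons, learning rate in thousandths), their ranges, and the names `AnnDesign`, `GENE_RANGES`, `decode`, `mutate` and `crossover` are all hypothetical. Fitness evaluation (training each decoded network and scoring its forecasts) is left out.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnDesign:
    """Network settings recovered from one chromosome (hypothetical layout)."""
    n_inputs: int         # number of past values fed to the network
    n_hidden: int         # neurons in the single hidden layer
    learning_rate: float  # step size used when training the network


# Assumed inclusive gene ranges: inputs, hidden neurons, learning rate * 1000.
GENE_RANGES = [(1, 24), (1, 32), (1, 100)]


def decode(chromosome):
    """Map a list of integer genes to the design it represents."""
    n_inputs, n_hidden, lr_gene = chromosome
    return AnnDesign(n_inputs, n_hidden, lr_gene / 1000)


def random_chromosome(rng):
    """Draw every gene uniformly from its allowed range."""
    return [rng.randint(lo, hi) for lo, hi in GENE_RANGES]


def mutate(chromosome, rng, rate=0.2):
    """Resample each gene with probability `rate`, keeping it in range."""
    return [
        rng.randint(lo, hi) if rng.random() < rate else gene
        for gene, (lo, hi) in zip(chromosome, GENE_RANGES)
    ]


def crossover(parent_a, parent_b, rng):
    """One-point crossover: head of one parent, tail of the other."""
    point = rng.randint(1, len(parent_a) - 1)
    return parent_a[:point] + parent_b[point:]


rng = random.Random(0)
parent_a = random_chromosome(rng)
parent_b = random_chromosome(rng)
child = mutate(crossover(parent_a, parent_b, rng), rng)
design = decode(child)
```

In a full evolutionary search, each `design` would be trained on the series and its forecast error used as the fitness that drives selection.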

    A Review of Meta-level Learning in the Context of Multi-component, Multi-level Evolving Prediction Systems.

    The exponential growth in the volume, variety and velocity of data is raising the need for investigation of automated or semi-automated ways to extract useful patterns from the data. Finding the most appropriate mapping of learning methods for a given problem requires deep expert knowledge and extensive computational resources. This becomes a challenge in the presence of numerous configurations of learning algorithms on massive amounts of data. There is therefore a need for an intelligent recommendation engine that can advise which learning algorithm is best for a dataset. The techniques commonly used by experts are based on a trial-and-error approach, evaluating and comparing a number of possible solutions against each other, on their prior experience in a specific domain, etc. The trial-and-error approach combined with the expert’s prior knowledge, though expensive in computation and time, has often been shown to work for stationary problems where the processing is usually performed off-line. However, this approach would not normally be feasible for non-stationary problems where streams of data are continuously arriving. Furthermore, in a non-stationary environment, manually analysing the data and testing various methods every time the underlying data distribution changes would be very difficult or simply infeasible. In that scenario, and within an on-line predictive system, there are several tasks where Meta-learning can be used to effectively facilitate the best recommendations, including: 1) pre-processing steps, 2) learning algorithms or their combination, 3) adaptivity mechanisms and their parameters, 4) recurring concept extraction, and 5) concept drift detection. However, while conceptually very attractive and promising, Meta-learning leads to several challenges, with the appropriate representation of the problem at a meta-level being one of the key ones.
The goal of this review and our research is, therefore, to investigate Meta-learning in general and the associated challenges in the context of automating the building, deployment and adaptation of multi-level and multi-component predictive systems that evolve over time.

    Meta-learning for Forecasting Model Selection

    Model selection for time series forecasting is a challenging task for practitioners and academia. There are multiple approaches to address this, ranging from time series analysis using a series of statistical tests, to information criteria or empirical approaches that rely on cross-validated errors. In recent forecasting competitions, meta-learning obtained promising results, establishing its place as a model selection alternative. Meta-learning constructs meta-features for each time series and trains a classifier on these to choose the most appropriate forecasting method. In the first part, this thesis studies the main components of meta-learning and analyses the effect of alternative meta-features, meta-learners, and base forecasters on the final model selection results. We investigate different meta-learners, the use of simple or complex base forecasts, and a large and diverse set of meta-features. Our findings show that stationarity tests, which identify the presence of a unit root in a time series, and proxies of autoregressive information, which show the strength of serial correlation in a series, have the highest importance for the performance of meta-learning. In contrast, features related to time series quantiles and other descriptive statistics, such as the mean and the variance, exhibit the lowest importance. Furthermore, we observe that using simple base forecasters is more sensitive to the number of groups of features employed as meta-features and performed worse overall. In terms of the choice of learners, classifiers with evidence of good performance in the literature resulted in the most accurate meta-learners. The success of meta-learning largely depends on its building components. The selection and generation of appropriate meta-features remains a major challenge in meta-learning. In the second part, we propose using Convolutional Neural Networks (CNN) to overcome this.
CNNs have demonstrated breakthrough accuracy in pattern recognition tasks and can generate features as needed internally, within their layers, without intervention from the modeller. Using CNNs, we provide empirical evidence of the efficacy of the approach against widely accepted forecast selection methods, and discuss the advantages and limitations of the proposed approach. Finally, we provide additional evidence that using meta-learning for automated model selection outperformed all of the individual benchmark forecasts.
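
The selection step described in this abstract, meta-features computed per series and a classifier trained on them to pick a forecaster, can be sketched roughly as below. This is a minimal illustration and not the thesis's implementation: the two meta-features (lag-1 autocorrelation of the series and of its first differences), the two base forecasters, the 1-nearest-neighbour meta-learner and the synthetic training series are all assumptions chosen to keep the example short.

```python
import random
import statistics


def lag1_autocorr(series):
    """Lag-1 autocorrelation, a simple proxy for serial correlation."""
    mean = statistics.fmean(series)
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        return 0.0  # constant series: no measurable correlation
    num = sum((series[i] - mean) * (series[i - 1] - mean)
              for i in range(1, len(series)))
    return num / denom


def meta_features(series):
    """Assumed two-number description of a series."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    return (lag1_autocorr(series), lag1_autocorr(diffs))


def naive_forecast(history):
    return history[-1]  # repeat the last observation


def mean_forecast(history):
    return statistics.fmean(history)  # average of everything seen so far


BASE_FORECASTERS = {"naive": naive_forecast, "mean": mean_forecast}


def best_forecaster(series, holdout=10):
    """Label a series with the forecaster that has the lowest one-step
    mean absolute error over the last `holdout` points."""
    scores = {}
    for name, forecast in BASE_FORECASTERS.items():
        errors = [abs(forecast(series[:t]) - series[t])
                  for t in range(len(series) - holdout, len(series))]
        scores[name] = statistics.fmean(errors)
    return min(scores, key=scores.get)


def recommend(meta_examples, series):
    """1-nearest-neighbour meta-learner over (features, label) pairs."""
    target = meta_features(series)

    def squared_distance(example):
        features, _label = example
        return sum((a - b) ** 2 for a, b in zip(features, target))

    return min(meta_examples, key=squared_distance)[1]


def random_walk(rng, n=100):
    level, out = 0.0, []
    for _ in range(n):
        level += rng.gauss(0, 1)
        out.append(level)
    return out


def white_noise(rng, n=100):
    return [rng.gauss(0, 1) for _ in range(n)]


rng = random.Random(0)
training = ([random_walk(rng) for _ in range(20)]
            + [white_noise(rng) for _ in range(20)])
meta_examples = [(meta_features(s), best_forecaster(s)) for s in training]
recommendation = recommend(meta_examples, random_walk(rng))
```

A real system would use a much richer feature set (for example the stationarity tests and autoregressive proxies the abstract reports as most important) and a stronger classifier; the structure of the loop stays the same.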

    Learning Interpretable Features of Graphs and Time Series Data

    Graphs and time series are two of the most ubiquitous representations of data in modern times. Representation learning of real-world graphs and time-series data is a key component of downstream supervised and unsupervised machine learning tasks such as classification, clustering, and visualization. Because of the inherent high dimensionality, representation learning, i.e., low-dimensional vector-based embedding of graphs and time-series data, is very challenging. Learning interpretable features makes the roles of the features transparent and facilitates downstream analytics tasks, in addition to maximizing the performance of the downstream machine learning models. In this thesis, we leveraged tensor (multidimensional array) decomposition to generate an interpretable, low-dimensional feature space for graphs and time-series data from three domains: social networks, neuroscience, and heliophysics. We present the theoretical models and empirical results on node embedding of social networks, biomarker embedding on fMRI-based brain networks, and prediction and visualization of multivariate time-series-based flaring and non-flaring solar events.

    Ensemble based on randomised neural networks for online data stream regression in presence of concept drift

    The big data paradigm has posed new challenges for Machine Learning algorithms, such as analysing continuous flows of data in the form of data streams, and dealing with the evolving nature of the data, which causes a phenomenon often referred to in the literature as concept drift. Concept drift is caused by inconsistencies between the optimal hypotheses in two subsequent chunks of data, whereby the concept underlying a given process evolves over time, which can happen due to several factors including changes in consumer preference, economic dynamics, or environmental conditions. This thesis explores the problem of data stream regression in the presence of concept drift. This problem requires computationally efficient algorithms that are able to adapt to the various types of drift that may affect the data. The development of effective algorithms for data streams with concept drift requires several steps that are discussed in this research. The first is related to the datasets required to assess the algorithms. In general, it is not possible to determine the occurrence of concept drift in real-world datasets; therefore, synthetic datasets in which the various types of concept drift can be simulated are required. The second issue is related to the choice of the algorithm. Ensemble algorithms show many advantages in dealing with concept-drifting data streams, including flexibility, computational efficiency and high accuracy. For the design of an effective ensemble, this research analyses the use of randomised Neural Networks as base models, along with their optimisation. The optimisation of the randomised Neural Networks involves designing and tuning hyperparameters, which may substantially affect their performance. The optimisation of the base models is an important aspect of building highly accurate and computationally efficient ensembles.
To cope with concept drift, existing methods either require setting fixed updating points, which may result in unnecessary computations or a slow reaction to concept drift, or rely on drift detection mechanisms, which may be ineffective due to the difficulty of detecting drift in real applications. Therefore, the research contributions of this thesis include the development of a new approach for synthetic dataset generation; the development of a new hyperparameter optimisation algorithm that reduces the search effort and the need for prior assumptions compared to existing methods; the analysis of the effects of randomised Neural Network hyperparameters; and the development of a new ensemble algorithm based on a bagging meta-model that reduces the computational effort compared with existing methods and uses an innovative updating mechanism to cope with concept drift. The algorithms have been tested on synthetic datasets and validated on four real-world datasets from various application domains.
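
To illustrate why a stream learner needs some updating mechanism under concept drift, the sketch below compares a predictor that never forgets with one that only remembers a recent window, on a synthetic stream with one abrupt drift. It is not the ensemble proposed in the thesis: the stream, the two toy predictors, the window size and the names `CumulativeMean`, `WindowMean` and `prequential_error` are assumptions made for the example.

```python
from collections import deque


class CumulativeMean:
    """Predicts the mean of every value seen so far (no forgetting)."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def predict(self):
        return self.total / self.count if self.count else 0.0

    def update(self, y):
        self.total += y
        self.count += 1


class WindowMean:
    """Predicts the mean of the last `size` values (forgets old data)."""

    def __init__(self, size=10):
        self.window = deque(maxlen=size)

    def predict(self):
        return sum(self.window) / len(self.window) if self.window else 0.0

    def update(self, y):
        self.window.append(y)  # the oldest value is dropped automatically


def prequential_error(model, stream):
    """Test-then-train evaluation: predict each value before learning it,
    and return the total absolute error over the stream."""
    total = 0.0
    for y in stream:
        total += abs(model.predict() - y)
        model.update(y)
    return total


# Synthetic stream with one abrupt drift: the target jumps from 0 to 10.
stream = [0.0] * 50 + [10.0] * 50

static_error = prequential_error(CumulativeMean(), stream)
adaptive_error = prequential_error(WindowMean(size=10), stream)
```

On this stream the windowed predictor recovers once its window holds only post-drift values, while the cumulative mean keeps being pulled back by the old concept, so `adaptive_error` ends up smaller than `static_error`.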

    The Technological Emergence of AutoML: A Survey of Performant Software and Applications in the Context of Industry

    In most technical fields, there is a delay between fundamental academic research and practical industrial uptake. Whilst some sciences have robust and well-established processes for commercialisation, such as the pharmaceutical practice of regimented drug trials, other fields face transitory periods in which fundamental academic advancements diffuse gradually into the space of commerce and industry. For the still relatively young field of Automated/Autonomous Machine Learning (AutoML/AutonoML), that transitory period is under way, spurred on by a burgeoning interest from broader society. Yet, to date, little research has been undertaken to assess the current state of this dissemination and its uptake. Thus, this review makes two primary contributions to knowledge around this topic. Firstly, it provides the most up-to-date and comprehensive survey of existing AutoML tools, both open-source and commercial. Secondly, it motivates and outlines a framework for assessing whether an AutoML solution designed for real-world application is 'performant'; this framework extends beyond the limitations of typical academic criteria, considering a variety of stakeholder needs and the human-computer interactions required to service them. Thus, additionally supported by an extensive assessment and comparison of academic and commercial case studies, this review evaluates mainstream engagement with AutoML in the early 2020s, identifying obstacles and opportunities for accelerating future uptake.

    Emotion and Stress Recognition Related Sensors and Machine Learning Technologies

    This book includes impactful chapters which present scientific concepts, frameworks, architectures and ideas on sensing technologies and machine learning techniques. These are relevant in tackling the following challenges: (i) the field readiness and use of intrusive sensor systems and devices for capturing biosignals, including EEG sensor systems, ECG sensor systems and electrodermal activity sensor systems; (ii) the quality assessment and management of sensor data; (iii) data preprocessing, noise filtering and calibration concepts for biosignals; (iv) the field readiness and use of nonintrusive sensor technologies, including visual sensors, acoustic sensors, vibration sensors and piezoelectric sensors; (v) emotion recognition using mobile phones and smartwatches; (vi) body area sensor networks for emotion and stress studies; (vii) the use of experimental datasets in emotion recognition, including dataset generation principles and concepts, quality assurance and emotion elicitation material and concepts; (viii) machine learning techniques for robust emotion recognition, including graphical models, neural network methods, deep learning methods, statistical learning and multivariate empirical mode decomposition; (ix) subject-independent emotion and stress recognition concepts and systems, including facial expression-based systems, speech-based systems, EEG-based systems, ECG-based systems, electrodermal activity-based systems, multimodal recognition systems and sensor fusion concepts; and (x) emotion and stress estimation and forecasting from a nonlinear dynamical system perspective.