13 research outputs found

    Bayesian learning of inverted Dirichlet mixtures for SVM kernels generation

    We describe approaches for positive data modeling and classification using both finite inverted Dirichlet mixture models and support vector machines (SVMs). Inverted Dirichlet mixture models are used to tackle an outstanding challenge in SVMs, namely the generation of accurate kernels. The kernel generation approaches that we consider, grounded in ideas from information theory, allow the incorporation of the data structure and its structural constraints. Inverted Dirichlet mixture models are learned within a principled Bayesian framework using both Gibbs sampling and Metropolis-Hastings for parameter estimation and Bayes factors for model selection (i.e., determining the number of mixture components). Our Bayesian learning approach places priors over the model parameters, which we derive by showing that the inverted Dirichlet distribution belongs to the exponential family of distributions, and then combines these priors with information from the data to build posterior distributions. We illustrate the merits and effectiveness of the proposed method with two challenging real-world applications, namely object detection and visual scene analysis and classification.
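    For reference, the exponential-family claim in this abstract can be checked against the standard form of the inverted Dirichlet density. The block below is a sketch in our own notation (D, the parameter vector, and S are illustrative symbols, not necessarily those used in the paper), written for a D-dimensional positive vector with D+1 positive parameters.

```latex
% Illustrative notation (assumed, not taken from the paper):
%   x = (x_1, ..., x_D) with x_d > 0,  alpha = (alpha_1, ..., alpha_{D+1}) with alpha_d > 0,
%   |alpha| = sum_{d=1}^{D+1} alpha_d,  S = sum_{d=1}^{D} x_d.
\begin{align}
p(\mathbf{x} \mid \boldsymbol{\alpha})
  &= \frac{\Gamma(|\boldsymbol{\alpha}|)}{\prod_{d=1}^{D+1} \Gamma(\alpha_d)}
     \prod_{d=1}^{D} x_d^{\alpha_d - 1}\,(1 + S)^{-|\boldsymbol{\alpha}|} \\
\log p(\mathbf{x} \mid \boldsymbol{\alpha})
  &= \sum_{d=1}^{D} \alpha_d \log\frac{x_d}{1 + S}
     - \alpha_{D+1} \log(1 + S)
     - \sum_{d=1}^{D} \log x_d
     + \log\Gamma(|\boldsymbol{\alpha}|)
     - \sum_{d=1}^{D+1} \log\Gamma(\alpha_d)
\end{align}
```

    The log-density is linear in the parameters with sufficient statistics log(x_d / (1 + S)) for d = 1, ..., D and -log(1 + S), which is the structure that makes conjugate-style priors available.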

    On the predictive power of meta-features in OpenML

    The demand for performing data analysis is steadily rising. As a consequence, people of different profiles (i.e., non-experienced users) have started to analyze their data. However, this is challenging for them. A key step that poses difficulties and determines the success of the analysis is data mining (the model/algorithm selection problem). Meta-learning is a technique used for assisting non-expert users in this step. The effectiveness of meta-learning is, however, largely dependent on the description/characterization of datasets (i.e., the meta-features used for meta-learning). There is a need to improve the effectiveness of meta-learning by identifying and designing more predictive meta-features. In this work, we use a method from exploratory factor analysis to study the predictive power of different meta-features collected in OpenML, a collaborative machine learning platform designed to store and organize meta-data about datasets, data mining algorithms, models and their evaluations. We first use the method to extract latent features, which are abstract concepts that group together meta-features with common characteristics. Then, we study and visualize the relationship of the latent features with three different performance measures of four classification algorithms on hundreds of datasets available in OpenML, and we select the latent features with the highest predictive power. Finally, we use the selected latent features to perform meta-learning and we show that our method improves the meta-learning process. Furthermore, we design an easy-to-use application for retrieving different meta-data from OpenML, the biggest source of data in this domain. Peer Reviewed. Postprint (published version).
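    To make the notion of a meta-feature concrete, here is a minimal sketch that computes a few simple dataset descriptors of the kind commonly used in meta-learning. The function name `simple_meta_features` and the chosen descriptors are assumptions for illustration only; this is not the OpenML API and not the feature set studied in the paper.

```python
import math
from collections import Counter


def simple_meta_features(rows, labels):
    """Compute a few simple meta-features for a classification dataset.

    `rows` is a list of equal-length feature lists and `labels` is the
    matching list of class labels. Hypothetical helper for illustration.
    """
    if not rows or len(rows) != len(labels):
        raise ValueError("rows and labels must be non-empty and the same length")

    n_instances = len(rows)
    n_features = len(rows[0])
    counts = Counter(labels)

    # Shannon entropy (in bits) of the empirical class distribution.
    class_entropy = -sum(
        (c / n_instances) * math.log2(c / n_instances) for c in counts.values()
    )

    return {
        "n_instances": n_instances,
        "n_features": n_features,
        "n_classes": len(counts),
        "class_entropy": class_entropy,
        "majority_class_fraction": max(counts.values()) / n_instances,
    }


if __name__ == "__main__":
    toy_rows = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
    toy_labels = ["a", "a", "b", "b"]
    print(simple_meta_features(toy_rows, toy_labels))
```

    In a meta-learning setting, one such dictionary per dataset would form a row of the meta-dataset, with algorithm performance as the target; a factor analysis like the one the abstract describes would then group correlated descriptors into latent features.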

    A Review of Meta-level Learning in the Context of Multi-component, Multi-level Evolving Prediction Systems.

    The exponential growth in the volume, variety and velocity of data is raising the need for automated or semi-automated ways to extract useful patterns from the data. Finding the most appropriate mapping of learning methods to a given problem requires deep expert knowledge and extensive computational resources, and it becomes a challenge in the presence of numerous configurations of learning algorithms on massive amounts of data. There is therefore a need for an intelligent recommendation engine that can advise which learning algorithm is best for a dataset. The techniques commonly used by experts are based on a trial and error approach: evaluating and comparing a number of possible solutions against each other, using their prior experience in a specific domain, etc. The trial and error approach combined with the expert’s prior knowledge, though expensive in computation and time, has often been shown to work for stationary problems where the processing is usually performed off-line. However, this approach would not normally be feasible for non-stationary problems where streams of data are continuously arriving. Furthermore, in a non-stationary environment, manually analysing the data and testing various methods every time the underlying data distribution changes would be very difficult or simply infeasible. In that scenario, and within an on-line predictive system, there are several tasks where Meta-learning can be used to effectively facilitate the best recommendations, including: 1) preprocessing steps, 2) learning algorithms or their combination, 3) adaptivity mechanisms and their parameters, 4) recurring concept extraction, and 5) concept drift detection. However, while conceptually very attractive and promising, Meta-learning leads to several challenges, with the appropriate representation of the problem at a meta-level being one of the key ones. The goal of this review and our research is, therefore, to investigate Meta-learning in general and the associated challenges in the context of automating the building, deployment and adaptation of multi-level and multi-component predictive systems that evolve over time.

    A Research on Automatic Hyperparameter Recommendation via Meta-Learning

    The performance of classification algorithms is mainly governed by the hyperparameter configurations deployed. Traditional search-based algorithms tend to require extensive hyperparameter evaluations to select desirable configurations, and they are often very inefficient on large-scale tasks. In this dissertation, we address the problem of hyperparameter selection via meta-learning, which provides a mechanism that automatically recommends promising configurations without any inefficient evaluations. In this approach, a meta-learner is constructed on the metadata extracted from historical classification problems, and the quality of this meta-learner directly determines the success of the recommendations. Designing good meta-learners that recommend effective hyperparameter configurations efficiently is of practical importance. This dissertation is divided into six chapters: the first chapter presents the research background and related work, the second to the fifth chapters detail our main work and contributions, and the sixth chapter concludes the dissertation and outlines possible future work. In the second and third chapters, we propose two (kernel) multivariate sparse-group Lasso (SGLasso) approaches for automatic meta-feature selection. Previously, meta-features were usually picked manually by researchers based on their preferences and experience, or by wrapper methods, which is either less effective or time-consuming. SGLasso, as an embedded feature selection model, can select the most effective meta-features during meta-learner training and thus guarantee the optimality of both the meta-features and the meta-learner, which are essential for successful recommendations. In the fourth chapter, we formulate the problem of hyperparameter recommendation as a problem of low-rank tensor completion. The hyperparameter search space was often flattened into a one-dimensional vector, which removes the spatial structure of the search space and ignores the correlations between adjacent hyperparameters, even though these characteristics are crucial in meta-learning. Our contributions are to instantiate the search space of hyperparameters as a multi-dimensional tensor and to develop a novel kernel tensor completion algorithm that is applied to estimate the performance of hyperparameter configurations. In the fifth chapter, we propose to learn the latent features of the performance space via denoising autoencoders. Although the search space is usually high-dimensional, the performances of hyperparameter configurations are usually correlated with each other to a certain degree, and their main structure lies in a much lower-dimensional manifold that describes the performance distribution of the search space. Denoising autoencoders are applied to extract the latent features, on which two effective recommendation strategies are built. Extensive experiments are conducted to verify the effectiveness of our proposed approaches, and the empirical outcomes show that our approaches can recommend promising hyperparameters for real problems and significantly outperform state-of-the-art meta-learning-based methods as well as search algorithms such as random search, Bayesian optimization, and Hyperband.

    Towards principled learner selection

    Long learner evaluation times are no longer exceptional and often there is insufficient time to exhaustively test all candidate options. When deciding which learners to use, practitioners must rely on ad hoc testing and luck to identify the most accurate one. Given the importance of classification in decision making, this is unsatisfactory. Progress towards a principled approach requires accurate predictions of learner accuracy and evaluation time, and this study examines the potential of traditional meta-learning approaches, with their emphasis on indirect explanatory variables, to deliver the required solutions. Here, 57 different indirect dataset characteristics, including those related to geometrical complexity, are used as explanatory variables, alongside sample-estimates, in building regression models of accuracy and time. The evidence presented firmly suggests that these indirect variables lack both the predictive power and the time efficiency required for the development of practically useful models, and points instead towards basing the prediction of learner accuracy solely on sample-based models. The attempt at modelling learner evaluation time reveals some of the difficulties that this tough challenge presents.
