13 research outputs found

    Improving generalisation of AutoML systems with dynamic fitness evaluations

    Full text link
    A common problem machine learning developers face is overfitting: fitting a pipeline so closely to the training data that its performance degrades on unseen data. Automated machine learning aims to free (or at least ease) the developer from the burden of pipeline creation, but this overfitting problem can persist. In fact, it can become more of a problem as we iteratively optimise performance measured by an internal cross-validation (most often k-fold). While this internal cross-validation is intended to reduce overfitting, we show that we can still overfit to the particular folds used. In this work, we aim to remedy this problem by introducing dynamic fitness evaluations, which approximate repeated k-fold cross-validation at little extra cost over a single k-fold, and at far lower cost than typical repeated k-fold. The results show that, when time-equated, the proposed fitness function yields a significant improvement over the current state-of-the-art baseline method, which uses an internal single k-fold. Furthermore, the proposed extension is very simple to implement on top of existing evolutionary computation methods, and can provide an essentially free boost in generalisation/testing performance. Comment: 19 pages, 4 figures
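
    The paper's exact operator is not spelled out in this abstract; the following is a minimal sketch of the core idea, assuming an evolutionary loop in which re-drawing the fold split each generation makes the fitness approximate repeated k-fold over the run (sklearn-based, names illustrative):

        # Sketch only: re-draw the k-fold split every generation, so that across
        # generations the search effectively sees repeated k-fold CV while each
        # single evaluation costs no more than one ordinary k-fold.
        from sklearn.model_selection import StratifiedKFold, cross_val_score

        def dynamic_fitness(pipeline, X, y, generation, k=5):
            # A fresh random_state per generation changes the folds each time.
            cv = StratifiedKFold(n_splits=k, shuffle=True, random_state=generation)
            return cross_val_score(pipeline, X, y, cv=cv).mean()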

    Recognizing Affiliation: Using Behavioural Traces to Predict the Quality of Social Interactions in Online Games

    Full text link
    Online social interactions in multiplayer games can be supportive and positive or toxic and harmful; however, few methods can easily assess interpersonal interaction quality in games. We use behavioural traces to predict affiliation between dyadic strangers, facilitated through their social interactions in an online gaming setting. We collected audio, video, in-game, and self-report data from 23 dyads, extracted 75 features, trained Random Forest and Support Vector Machine models, and evaluated their performance predicting binary (high/low) as well as continuous affiliation toward a partner. The models can predict both binary and continuous affiliation with up to 79.1% accuracy (F1) and 20.1% explained variance (R2) on unseen data, with features based on verbal communication demonstrating the highest potential. Our findings can inform the design of multiplayer games and game communities, and guide the development of systems for matchmaking and mitigating toxic behaviour in online games. Comment: CHI '2
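
    As a rough illustration of the modelling setup (the data and features below are random placeholders, not the study's actual 75 behavioural features), the two prediction tasks might be set up in sklearn as:

        # Illustrative sketch: random data stands in for the behavioural traces.
        import numpy as np
        from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(0)
        X = rng.random((46, 75))             # 23 dyads -> 46 participants, 75 features
        y_binary = rng.integers(0, 2, 46)    # high/low affiliation toward partner
        y_cont = rng.random(46)              # continuous affiliation rating

        clf = RandomForestClassifier(n_estimators=500, random_state=0)
        reg = RandomForestRegressor(n_estimators=500, random_state=0)
        print(cross_val_score(clf, X, y_binary, cv=5, scoring="f1").mean())
        print(cross_val_score(reg, X, y_cont, cv=5, scoring="r2").mean())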

    Towards Declarative Statistical Inference

    No full text
    Wide-ranging digitalization has made it possible to capture increasingly large amounts of data. In order to transform this raw data into meaningful insights, data analytics and statistical inference techniques are essential. However, while a researcher is expected to be an expert in their own field, it is not self-evident that they are also proficient in statistics. In fact, statistical inference is known to be a labor-intensive and error-prone task. This dissertation aims to understand current statistical inference practices for the experimental evaluation of machine learning algorithms, and to propose improvements where possible. It takes a small step towards the goal of automating the data analysis component of empirical research, making the process more robust in terms of correct execution and interpretation of the results. Our first contribution is a synthesis of existing knowledge about error estimation of supervised learning algorithms with cross-validation. We highlight the distinction between model and learner error, and investigate the effect of repeating cross-validation on the quality of the error estimate. Next, we focus on the evaluation of multi-instance learning algorithms. Here, instances are not labeled individually, but are instead grouped together in bags, and only the bag label is known. Our second contribution is an investigation of the extent to which conclusions about bag-level performance can be generalized to the instance level. Our third contribution is a meta-learning experiment in which we predict the most suitable multi-instance learner for a given problem. The intricate nature of statistical inference raises the question of whether this aspect of research can be automated. One requirement for this is the availability of a model representing all relevant characteristics of the population under study. Bayesian networks are a candidate, as they concisely describe the joint probability distribution of a set of random variables and come with a plethora of efficient inference methods. Our last contribution is a theoretical proposal of a greedy hill-climbing structure learning algorithm for Bayesian networks.

    A declarative query language for statistical inference

    Get PDF
    We present a preliminary design of an experimentation system that consists of a declarative language and an inference engine. The language allows the user to formulate a hypothesis about a data population, after which the inference engine automatically provides an answer based on a limited sample.
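
    The abstract does not give the language's syntax; purely as an illustration of the idea, a toy engine might accept a declarative hypothesis string and choose the statistical test itself (the query form and the infer() helper below are hypothetical, not the paper's language):

        # Toy sketch of a declarative hypothesis interface; not the paper's design.
        import numpy as np
        from scipy import stats

        def infer(hypothesis, sample_a, sample_b, alpha=0.05):
            # The engine, not the user, chooses and runs the statistical test.
            if hypothesis == "mean(A) > mean(B)":
                t, p = stats.ttest_ind(sample_a, sample_b, equal_var=False,
                                       alternative="greater")  # Welch's t-test
                return {"hypothesis": hypothesis, "p_value": p, "accepted": p < alpha}
            raise NotImplementedError(hypothesis)

        rng = np.random.default_rng(0)
        print(infer("mean(A) > mean(B)", rng.normal(1.0, 1, 50), rng.normal(0.5, 1, 50)))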

    A meta-learning system for multi-instance classification

    No full text
    Meta-learning refers to the use of machine learning methods to analyze the behavior of machine learning methods on different types of datasets. Until now, meta-learning has mostly focused on the standard classification setting. In this paper on ongoing work, we apply it to multi-instance classification, an alternative classification setting in which bags of instances, rather than individual instances, are labeled. We define a number of dataset properties that are specific to the multi-instance setting, and extend the concept of landmarkers to it. Experimental results show that multi-instance classifiers are very sensitive to the context in which they are used, and that the meta-learning approach can indeed yield useful insights in this respect.
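
    The abstract does not list the landmarkers themselves; one plausible multi-instance landmarker, sketched below under that assumption, scores a deliberately cheap baseline (collapse each bag to its mean instance, fit a decision stump) and uses that score as a meta-feature:

        # Sketch: a cheap "mean-bag + stump" landmarker used as a meta-feature.
        # `bags` is a list of (n_instances, n_features) arrays, one per bag.
        import numpy as np
        from sklearn.model_selection import cross_val_score
        from sklearn.tree import DecisionTreeClassifier

        def mean_bag_landmarker(bags, bag_labels):
            X = np.array([b.mean(axis=0) for b in bags])   # collapse each bag
            stump = DecisionTreeClassifier(max_depth=1)    # fast landmark learner
            return cross_val_score(stump, X, bag_labels, cv=5).mean()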

    Look before you leap: Some insights into learner evaluation with cross-validation

    No full text
    Machine learning is largely an experimental science, and the evaluation of predictive models is an important aspect of it. These days, cross-validation is the most widely used method for this task. There are, however, a number of important points that should be taken into account when using this methodology. First, one should clearly state what one is trying to estimate. Namely, a distinction should be made between the evaluation of a model learned on a single dataset, and that of a learner trained on a random sample from a given data population. Each of these two questions requires a different statistical approach, and the two should not be confused with each other. While this has been noted before, the literature on this topic is generally not very accessible. This paper tries to give an understandable overview of the statistical aspects of these two evaluation tasks. We also argue that, because of the often limited availability of data and the difficulty of selecting an appropriate statistical test, it is in some cases perhaps better to abstain from statistical testing and instead focus on an interpretation of the immediate results.
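
    A minimal sketch of the distinction, assuming a generic sklearn setup: cross-validation averages over models retrained on different folds, so it estimates the learner's error; the error of one particular fitted model needs its own held-out data:

        # Sketch: two different estimands from the same data.
        from sklearn.datasets import make_classification
        from sklearn.model_selection import cross_val_score, train_test_split
        from sklearn.tree import DecisionTreeClassifier

        X, y = make_classification(n_samples=300, random_state=0)

        # Learner error: average performance of the algorithm when retrained
        # on samples like this one; CV retrains a fresh model per fold.
        learner_est = cross_val_score(DecisionTreeClassifier(random_state=0),
                                      X, y, cv=10).mean()

        # Model error: performance of this one fitted model, on data it never saw.
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
        model = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
        model_est = model.score(X_te, y_te)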

    Look before you leap: Some insights into learner evaluation with cross-validation (Poster)

    No full text
    Machine learning is largely an experimental science, and the evaluation of predictive models is an important aspect of it. These days, cross-validation is the most widely used method for this task. There are, however, a number of important points that should be taken into account when using this methodology. First, one should clearly state what one is trying to estimate. Namely, a distinction should be made between the evaluation of a model learned on a single dataset, and that of a learner trained on a random sample from a given data population. Each of these two questions requires a different statistical approach, and the two should not be confused with each other. While this has been noted before, the literature on this topic is generally not very accessible. This paper tries to give an understandable overview of the statistical aspects of these two evaluation tasks.

    On estimating model accuracy with repeated cross-validation

    No full text
    Evaluation of predictive models is a ubiquitous task in machine learning and data mining. Cross-validation is often used as a means for evaluating models. There appears to be some confusion among researchers, however, about best practices for cross-validation, and about the interpretation of cross-validation results. In particular, repeated cross-validation is often advocated, and so is the reporting of standard deviations, confidence intervals, or an indication of "significance". In this paper, we argue that, under many practical circumstances, when the goal of the experiments is to see how well the model returned by a learner will perform in practice in a particular domain, repeated cross-validation is not useful, and the reporting of confidence intervals or significance is misleading. Our arguments are supported by experimental results.
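
    As a rough illustration of the argument (generic sklearn usage on toy data, not the paper's experiments): repeating cross-validation shrinks the spread of the averaged score across fold splits, but every repetition still scores freshly retrained models, so the tighter interval says nothing extra about the one model that will actually be deployed:

        # Sketch: repeated CV stabilises the estimate of the learner's average
        # score; it does not estimate the error of a single deployed model.
        from sklearn.datasets import make_classification
        from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
        from sklearn.tree import DecisionTreeClassifier

        X, y = make_classification(n_samples=300, random_state=0)
        cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=10, random_state=0)
        scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=cv)
        print(scores.mean(), scores.std())  # small spread != certainty about one model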

    Predicting the popularity of online articles with random forests

    No full text
    In this paper, we describe our submission to the predictive web analytics Discovery Challenge at ECML/PKDD 2014. The main goal of the challenge was to predict the number of visitors of a web page 48 hours in the future after observing this web page for an hour. An additional goal was to predict the number of times the URL appeared in a tweet on Twitter and the number of times a Facebook message containing the URL was liked. We present an analysis of the time series data generated by the Chartbeat web analytics engine, which was made available for this competition, and the approach we used to predict page visits. Our model is based on random forest regression and learned on a set of features derived from the given time series data to capture the expected amount of visits, rate of change and temporal effect. Our approach won second place for predicting the number of visitors and the number of Facebook likes, and first place for predicting the number of tweets.
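
    The exact competition features are not given here; a sketch of the general recipe, with illustrative stand-ins for the level, trend and recency features, might look like:

        # Sketch: derive simple level/trend/recency features from the first hour
        # of per-minute visit counts, then fit a random forest. Toy data only.
        import numpy as np
        from sklearn.ensemble import RandomForestRegressor

        def features(first_hour):                      # shape: (n_pages, 60)
            level = first_hour.sum(axis=1)             # expected amount of visits
            trend = first_hour[:, 30:].sum(axis=1) - first_hour[:, :30].sum(axis=1)
            recent = first_hour[:, -10:].mean(axis=1)  # activity near the cutoff
            return np.column_stack([level, trend, recent])

        rng = np.random.default_rng(0)
        first_hour = rng.poisson(5.0, size=(200, 60)).astype(float)
        visits_48h = first_hour.sum(axis=1) * rng.uniform(20, 40, 200)  # toy target
        model = RandomForestRegressor(n_estimators=300, random_state=0)
        model.fit(features(first_hour), visits_48h)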

    Meta-learning from an experiment database

    No full text
    In this short paper, we present a student project run as part of the Machine Learning and Inductive Inference course at KU Leuven during the 2010-2011 academic year. The project asked students to analyze a machine learning experiment database using standard SQL queries and data mining tools, with three goals: (1) giving the students some practice in applying machine learning techniques to a real problem, (2) teaching them something about the properties of machine learning algorithms, and (3) training the students' research skills by having them study the literature on meta-learning in search of interesting background information and suggestions on how to approach the project and obtain meaningful results.