Improving generalisation of AutoML systems with dynamic fitness evaluations
A common problem machine learning developers face is overfitting, that is,
fitting a pipeline so closely to the training data that its performance
degrades on unseen data. Automated machine learning aims to free
(or at least ease) the developer from the burden of pipeline creation, but this
overfitting problem can persist. In fact, it can become more of a problem as
we iteratively optimise performance on an internal cross-validation
(most often \textit{k}-fold). While this internal cross-validation is intended
to reduce overfitting, we show that we can still risk overfitting to the
particular folds used. In this work, we aim to remedy this problem by
introducing dynamic fitness evaluations which approximate repeated
\textit{k}-fold cross-validation, at little extra cost over single
\textit{k}-fold, and far lower cost than typical repeated \textit{k}-fold. The
results show that, when time-equated, the proposed fitness function yields a
significant improvement over the current state-of-the-art baseline method,
which uses an internal single \textit{k}-fold. Furthermore, the proposed extension is
very simple to implement on top of existing evolutionary computation methods,
and can provide an essentially free boost in generalisation/testing performance.
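To make the idea concrete, here is a minimal sketch of how such a dynamic fitness evaluation could look, assuming scikit-learn and an evolutionary loop with a generation counter; the function, loop, and parameter names are illustrative, not the paper's implementation. Re-seeding the fold split with the generation index means each fitness call still costs only a single \textit{k}-fold, while the splits vary across generations in the spirit of repeated \textit{k}-fold.

    from sklearn.model_selection import StratifiedKFold, cross_val_score

    def fitness(pipeline, X, y, generation, k=5):
        # A different fold assignment every generation, at the cost of a
        # single k-fold evaluation per candidate pipeline.
        cv = StratifiedKFold(n_splits=k, shuffle=True, random_state=generation)
        return cross_val_score(pipeline, X, y, cv=cv, scoring="accuracy").mean()

    # Assumed usage inside an evolutionary AutoML loop (placeholder names):
    # for gen in range(n_generations):
    #     scores = [fitness(p, X_train, y_train, generation=gen) for p in population]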
Recognizing Affiliation: Using Behavioural Traces to Predict the Quality of Social Interactions in Online Games
Online social interactions in multiplayer games can be supportive and
positive or toxic and harmful; however, few methods can easily assess
interpersonal interaction quality in games. We use behavioural traces to
predict affiliation between dyadic strangers, facilitated through their social
interactions in an online gaming setting. We collected audio, video, in-game,
and self-report data from 23 dyads, extracted 75 features, trained Random
Forest and Support Vector Machine models, and evaluated their performance
predicting binary (high/low) as well as continuous affiliation toward a
partner. The models can predict both binary and continuous affiliation with up
to 79.1% accuracy (F1) and 20.1% explained variance (R2) on unseen data, with
features based on verbal communication demonstrating the highest potential. Our
findings can inform the design of multiplayer games and game communities, and
guide the development of systems for matchmaking and mitigating toxic behaviour
in online games.
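As a rough illustration of the modelling set-up described above, the sketch below trains a random forest classifier for the binary (high/low) target and a random forest regressor for the continuous target, reporting F1 and R2. The data are synthetic placeholders; the study's 75 behavioural features, dyad structure, and exact evaluation protocol are not reproduced here.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import f1_score, r2_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(46, 75))                      # placeholder: one row per participant, 75 features
    y_cont = rng.uniform(1, 7, size=46)                # continuous affiliation rating (made-up scale)
    y_bin = (y_cont > np.median(y_cont)).astype(int)   # binarised into high/low affiliation

    X_tr, X_te, yb_tr, yb_te, yc_tr, yc_te = train_test_split(
        X, y_bin, y_cont, test_size=0.3, random_state=0)

    clf = RandomForestClassifier(random_state=0).fit(X_tr, yb_tr)
    reg = RandomForestRegressor(random_state=0).fit(X_tr, yc_tr)

    print("F1 (binary affiliation):", f1_score(yb_te, clf.predict(X_te)))
    print("R2 (continuous affiliation):", r2_score(yc_te, reg.predict(X_te)))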
Towards Declarative Statistical Inference
Wide-ranging digitalization has made it possible to capture increasingly larger amounts of data. In order to transform this raw data into meaningful insights, data analytics and statistical inference techniques are essential. However, while it is expected that a researcher is an expert in their own field, it is not self-evident that they are also proficient in statistics. In fact, it is known that statistical inference is a labor-intensive and error-prone task. This dissertation aims to understand current statistical inference practices for the experimental evaluation of machine learning algorithms, and proposes improvements where possible. It takes a small step forward towards the goal of automating the data analysis component of empirical research, making the process more robust in terms of correct execution and interpretation of the results.
Our first contribution is a synthesis of existing knowledge about error estimation of supervised learning algorithms with cross-validation. We highlight the distinction between model and learner error, and investigate the effect of repeating cross-validation on the quality of the error estimate.
Next, we focus on the evaluation of multi-instance learning algorithms. Here, instances are not labeled individually, but instead are grouped together in bags and only the bag label is known. Our second contribution is an investigation of the extent to which conclusions about bag-level performance can be generalized to the instance-level. Our third contribution is a meta-learning experiment in which we predict the most suitable multi-instance learner for a given problem.
The intricate nature of statistical inference raises the question of whether this aspect of research can be automated. One requirement for this is the availability of a model representing all relevant characteristics of the population under study. Bayesian networks are a candidate for this, as they concisely describe the joint probability distribution of a set of random variables and come with a plethora of efficient inference methods. Our last contribution is a theoretical proposal of a greedy hill-climbing structure learning algorithm for Bayesian networks.
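For the last contribution, the sketch below shows the generic score-and-search form that greedy hill climbing over Bayesian network structures usually takes: repeatedly apply the first improving single-edge addition, removal, or reversal that keeps the graph acyclic. The scoring function is left as a toy stub; the thesis's actual algorithm and score may differ.

    import itertools

    def is_acyclic(edges, nodes):
        # Kahn's algorithm: the graph is a DAG iff every node can be removed.
        indeg = {n: 0 for n in nodes}
        for _, child in edges:
            indeg[child] += 1
        queue = [n for n in nodes if indeg[n] == 0]
        seen = 0
        while queue:
            n = queue.pop()
            seen += 1
            for parent, child in edges:
                if parent == n:
                    indeg[child] -= 1
                    if indeg[child] == 0:
                        queue.append(child)
        return seen == len(nodes)

    def neighbours(edges, nodes):
        # All graphs reachable by adding, removing, or reversing one edge.
        for a, b in itertools.permutations(nodes, 2):
            if (a, b) in edges:
                yield edges - {(a, b)}                # remove a -> b
                yield (edges - {(a, b)}) | {(b, a)}   # reverse a -> b
            elif (b, a) not in edges:
                yield edges | {(a, b)}                # add a -> b

    def hill_climb(nodes, score):
        current = set()                               # start from the empty graph
        current_score = score(current)
        improved = True
        while improved:
            improved = False
            for cand in neighbours(current, nodes):
                if not is_acyclic(cand, nodes):
                    continue
                if score(cand) > current_score:
                    current, current_score, improved = cand, score(cand), True
                    break                             # greedy: accept the first improving move
        return current

    # Toy score (in practice a decomposable score such as BIC fitted on data):
    toy_score = lambda g: len(g & {("A", "B"), ("B", "C")}) - 0.1 * len(g)
    print(hill_climb(["A", "B", "C"], toy_score))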
A declarative query language for statistical inference
We present a preliminary design of an experimentation system that consists of a declarative language and an inference engine. The language allows the user to formulate a hypothesis about a data population, after which the inference engine automatically provides an answer based on a limited sample.
A meta-learning system for multi-instance classification
Meta-learning refers to the use of machine learning methods to analyze the behavior of machine learning methods on different types of datasets. Until now, meta-learning has mostly focused on the standard classification setting. In this paper on ongoing work, we apply it to multi-instance classification, an alternative classification setting in which bags of instances, rather than individual instances, are labeled. We define a number of data set properties that are specific to the multi-instance setting, and extend the concept of landmarkers to this setting. Experimental results show that multi-instance classifiers are very sensitive to the context in which they are used, and that the meta-learning approach can indeed yield useful insights in this respect.
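As a rough sketch of what multi-instance meta-features and a landmarker might look like for a single dataset, assuming bags are given as arrays of instances; the concrete properties and landmarkers used in the paper are not reproduced here.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    def mi_meta_features(bags, labels):
        """bags: list of (n_i, d) arrays; labels: one binary label per bag."""
        sizes = np.array([len(b) for b in bags])
        feats = {
            "n_bags": len(bags),
            "mean_bag_size": sizes.mean(),
            "bag_size_std": sizes.std(),
            "n_features": bags[0].shape[1],
            "positive_bag_ratio": np.mean(labels),
        }
        # Landmarker: score of a cheap baseline that collapses each bag to the
        # mean of its instances and applies a standard classifier.
        X_mean = np.vstack([b.mean(axis=0) for b in bags])
        feats["mean_collapse_landmarker"] = cross_val_score(
            LogisticRegression(max_iter=1000), X_mean, labels, cv=3).mean()
        return feats

    # Toy usage with random bags (placeholder data)
    rng = np.random.default_rng(0)
    bags = [rng.normal(size=(rng.integers(3, 10), 5)) for _ in range(30)]
    labels = rng.integers(0, 2, size=30)
    print(mi_meta_features(bags, labels))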
Look before you leap: Some insights into learner evaluation with cross-validation
Machine learning is largely an experimental science, of which the evaluation of predictive
models is an important aspect. These days, cross-validation is the most widely used method
for this task. There are, however, a number of important points that should be taken into
account when using this methodology. First, one should clearly state what one is trying to
estimate. Namely, a distinction should be made between the evaluation of a model learned
on a single dataset, and that of a learner trained on a random sample from a given data
population. Each of these two questions requires a different statistical approach and should
not be confused with each other. While this has been noted before, the literature on this
topic is generally not very accessible. This paper tries to give an understandable overview
of the statistical aspects of these two evaluation tasks. We also argue that, because of the
often limited availability of data and the difficulty of selecting an appropriate statistical
test, it may in some cases be better to abstain from statistical testing and instead focus
on interpreting the immediate results.
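The model-versus-learner distinction made above can be illustrated with a short sketch, using scikit-learn on synthetic data; the dataset and learner are placeholders, and the paper itself concerns the statistics rather than any particular code.

    from sklearn.datasets import make_classification
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.model_selection import train_test_split, cross_val_score
    from sklearn.metrics import accuracy_score

    X, y = make_classification(n_samples=500, random_state=0)

    # (a) Model evaluation: estimate the accuracy of this one model,
    #     trained once, using data held out from its training set.
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    model = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
    print("model accuracy:", accuracy_score(y_te, model.predict(X_te)))

    # (b) Learner evaluation: estimate how well the learning procedure does
    #     on average over training samples of this size; cross-validation
    #     averages over several train/test splits for this purpose.
    learner_scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=10)
    print("learner accuracy (10-fold CV):", learner_scores.mean())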
Look before you leap: Some insights into learner evaluation with cross-validation (Poster)
Machine learning is largely an experimental science, of which the evaluation of predictive
models is an important aspect. These days, cross-validation is the most widely used method
for this task. There are, however, a number of important points that should be taken into
account when using this methodology. First, one should clearly state what one is trying to
estimate. Namely, a distinction should be made between the evaluation of a model learned
on a single dataset, and that of a learner trained on a random sample from a given data
population. Each of these two questions requires a different statistical approach and should
not be confused with each other. While this has been noted before, the literature on this
topic is generally not very accessible. This paper tries to give an understandable overview
of the statistical aspects of these two evaluation tasks.
On estimating model accuracy with repeated cross-validation
Evaluation of predictive models is a ubiquitous task in machine learning and data mining. Cross-validation is often used as a means for evaluating models. There appears to be some confusion among researchers, however, about best practices for cross-validation, and about the interpretation of cross-validation results. In particular, repeated cross-validation is often advocated, and so is the reporting of standard deviations, confidence intervals, or an indication of "significance". In this paper, we argue that, under many practical circumstances, when the goal of the experiments is to see how well the model returned by a learner will perform in practice in a particular domain, repeated cross-validation is not useful, and the reporting of confidence intervals or significance is misleading. Our arguments are supported by experimental results.
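For reference, the procedure under discussion typically looks like the sketch below: k-fold cross-validation repeated with different random fold assignments, with a mean and standard deviation reported over the repetitions. The learner and data are placeholders; the point argued above concerns how such numbers should (not) be interpreted, not the code itself.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import KFold, cross_val_score
    from sklearn.naive_bayes import GaussianNB

    X, y = make_classification(n_samples=300, random_state=0)

    # 10 repetitions of 10-fold cross-validation, each with a different
    # random assignment of instances to folds.
    scores = [
        cross_val_score(GaussianNB(), X, y,
                        cv=KFold(n_splits=10, shuffle=True, random_state=rep)).mean()
        for rep in range(10)
    ]

    # The spread over repetitions reflects the variability of the fold
    # assignment, not a confidence interval for the model's true accuracy.
    print("mean over repetitions:", np.mean(scores), "std:", np.std(scores))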
Predicting the popularity of online articles with random forests
In this paper, we describe our submission to the predictive web analytics
Discovery Challenge at ECML/PKDD 2014. The main goal of the challenge was to
predict the number of visitors of a web page 48 hours in the future after
observing this web page for an hour. An additional goal was to predict the
number of times the URL appeared in a tweet on Twitter and the number of times
a Facebook message containing the URL was liked. We present an analysis of the
time series data generated by the Chartbeat web analytics engine, which was
made available for this competition, and the approach we used to predict page
visits. Our model is based on random forest regression and learned on a set of
features derived from the given time series data to capture the expected
amount of visits, rate of change and temporal effect. Our approach won second
place for predicting the number of visitors and the number of Facebook likes,
and first place for predicting the number of tweets.
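A minimal sketch of the general shape of this approach, with synthetic data and made-up features standing in for the Chartbeat time series and the feature set actually used:

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    def hour_features(series):
        """series: per-minute visit counts for the first observed hour."""
        s = np.asarray(series, dtype=float)
        return [
            s.sum(),                          # expected amount of visits so far
            s[-10:].mean() - s[:10].mean(),   # rough rate of change over the hour
            s[-10:].mean(),                   # recent intensity (temporal effect)
        ]

    # Synthetic stand-in data: 200 pages, 60 minutes of observed visits each,
    # and a made-up "visits after 48 hours" target.
    rng = np.random.default_rng(0)
    series = rng.poisson(20, size=(200, 60))
    target = series.sum(axis=1) * rng.uniform(30, 50, size=200)

    X = np.array([hour_features(s) for s in series])
    model = RandomForestRegressor(random_state=0).fit(X, target)
    print(model.predict(X[:3]))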
Meta-learning from an experiment database
In this short paper, we present a student project run as part of the Machine Learning and Inductive Inference course at KU Leuven during the 2010-2011 academic year. The project had the students analyze a Machine Learning Experiment database, using standard SQL queries and data mining tools,
with the goals of (1) giving the students some practice with applying machine learning techniques to a real problem, (2) teaching them something about the properties of machine learning algorithms, and (3) training the students’ research skills by having them study the literature on meta-learning in search of interesting background information and suggestions on how to approach the project and obtain meaningful results.
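As a flavour of the kind of query such a project involves, here is a hypothetical example against an in-memory SQLite database; the table layout, algorithm names, and numbers are invented for illustration and do not reflect the actual experiment database used in the course.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE runs (algorithm TEXT, dataset TEXT, accuracy REAL);
        INSERT INTO runs VALUES
            ('J48', 'iris', 0.94), ('NaiveBayes', 'iris', 0.95),
            ('J48', 'credit-g', 0.71), ('NaiveBayes', 'credit-g', 0.75);
    """)

    # Mean accuracy per algorithm across all datasets in the database.
    for algorithm, mean_acc in conn.execute("""
            SELECT algorithm, AVG(accuracy)
            FROM runs GROUP BY algorithm ORDER BY AVG(accuracy) DESC"""):
        print(algorithm, round(mean_acc, 3))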