Automated Machine Learning for Multi-Label Classification
Automated machine learning (AutoML) aims to select and configure machine
learning algorithms and combine them into machine learning pipelines tailored
to a dataset at hand. For supervised learning tasks, most notably binary and
multinomial classification, aka single-label classification (SLC), such AutoML
approaches have shown promising results. However, the task of multi-label
classification (MLC), where data points are associated with a set of class
labels instead of a single class label, has received much less attention so
far. In the context of multi-label classification, the data-specific selection
and configuration of multi-label classifiers are challenging even for experts
in the field, as it is a high-dimensional optimization problem with multi-level
hierarchical dependencies. While the space of machine learning pipelines is
already huge for SLC, the MLC search space is larger by several orders of
magnitude.
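To make the difference between SLC and MLC targets concrete, the following is a minimal sketch of how label sets are commonly encoded as a binary indicator matrix. The function name `to_indicator` and the example labels are illustrative assumptions, not taken from the thesis.

```python
def to_indicator(label_sets, classes):
    """Encode each data point's label set as a 0/1 row, one column per class.

    In SLC every row would contain exactly one 1; in MLC a row may contain
    any number of 1s, including none.
    """
    return [[1 if c in labels else 0 for c in classes] for labels in label_sets]


# Hypothetical example: three data points, three possible class labels.
classes = ["sports", "politics", "tech"]
label_sets = [{"sports"}, {"politics", "tech"}, set()]
indicator = to_indicator(label_sets, classes)
```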
In the first part of this thesis, we devise a novel AutoML approach for
single-label classification tasks optimizing pipelines of machine learning
algorithms, consisting of two algorithms at most. This approach is then
extended first to optimize pipelines of unlimited length and eventually
configure the complex hierarchical structures of multi-label classification
methods. Furthermore, we investigate how well AutoML approaches that form the
state of the art for single-label classification tasks scale with the increased
problem complexity of AutoML for multi-label classification.
In the second part, we explore how methods for SLC and MLC could be
configured more flexibly to achieve better generalization performance and how
to increase the efficiency of execution-based AutoML systems.
Meta-Learning for Automated Selection of Anomaly Detectors for Semi-Supervised Datasets
In anomaly detection, a prominent task is to induce a model that identifies
anomalies after being trained solely on normal data. Generally, one is interested in
finding an anomaly detector that correctly identifies anomalies, i.e., data
points that do not belong to the normal class, without raising too many false
alarms. Which anomaly detector is best suited depends on the dataset at hand
and thus needs to be tailored. The quality of an anomaly detector may be
assessed via confusion-based metrics such as the Matthews correlation
coefficient (MCC). However, since during training only normal data is available
in a semi-supervised setting, such metrics are not accessible. To facilitate
automated machine learning for anomaly detectors, we propose to employ
meta-learning to predict MCC scores based on metrics that can be computed with
normal data only. Promising first results can be obtained by considering the
hypervolume and the false positive rate as meta-features.
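The point that confusion-based metrics are not accessible with normal data alone can be illustrated with a small sketch. It assumes anomalies are treated as the positive class; the function names `mcc` and `false_positive_rate` are illustrative, and returning 0.0 for a zero denominator is one common convention rather than something the abstract specifies.

```python
import math


def mcc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Degenerate case: a whole row or column of the confusion matrix is
        # empty, so the coefficient is undefined. 0.0 is a common fallback.
        return 0.0
    return (tp * tn - fp * fn) / denom


def false_positive_rate(fp: int, tn: int) -> float:
    """Fraction of normal points that are flagged as anomalies."""
    total = fp + tn
    return fp / total if total else 0.0


# With normal data only there are no true anomalies, so tp = fn = 0.
# The MCC denominator vanishes, while the false positive rate is still defined.
mcc_normal_only = mcc(tp=0, fp=5, tn=95, fn=0)
fpr_normal_only = false_positive_rate(fp=5, tn=95)
```

In this setting MCC carries no information because its denominator is zero, whereas the false positive rate remains computable, which is consistent with its use as a meta-feature.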
Hyperparameter optimization in deep multi-target prediction
As a result of the ever-increasing complexity of configuring and fine-tuning
machine learning models, the field of automated machine learning (AutoML) has
emerged over the past decade. However, software implementations like Auto-WEKA
and Auto-sklearn typically focus on classical machine learning (ML) tasks such
as classification and regression. Our work can be seen as the first attempt at
offering a single AutoML framework for most problem settings that fall under
the umbrella of multi-target prediction, which includes popular ML settings
such as multi-label classification, multivariate regression, multi-task
learning, dyadic prediction, matrix completion, and zero-shot learning.
Automated problem selection and model configuration are achieved by extending
DeepMTP, a general deep learning framework for MTP problem settings, with
popular hyperparameter optimization (HPO) methods. Our extensive benchmarking
across different datasets and MTP problem settings identifies cases where
specific HPO methods outperform others.
Comment: 17 pages, 4 figures, 1 table