Calibrating Ensembles for Scalable Uncertainty Quantification in Deep Learning-based Medical Segmentation
Uncertainty quantification in automated image analysis is highly desired in
many applications. Typically, machine learning models for classification or
segmentation are developed only to provide binary answers; however, quantifying
the uncertainty of the models can play a critical role, for example in active
learning or human-machine interaction. Uncertainty quantification is especially
difficult when using deep learning-based models, which are the state of the art
in many imaging applications. Current uncertainty quantification approaches
do not scale well to high-dimensional real-world problems. Scalable solutions
often rely on classical techniques, such as dropout during inference or
training ensembles of identical models with different random seeds, to obtain a
posterior distribution. In this paper, we show that these approaches fail to
approximate the classification probability. Instead, we propose a
scalable and intuitive framework to calibrate ensembles of deep learning models
to produce uncertainty quantification measurements that approximate the
classification probability. On unseen test data, we demonstrate improved
calibration, sensitivity (in two out of three cases) and precision
compared with the standard approaches. We further motivate the use of our
method in active learning, in creating pseudo-labels to learn from unlabeled
images, and in human-machine collaboration.
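As a rough illustration of what it means for ensemble outputs to "approximate the classification probability", the sketch below averages per-member probabilities and measures how far the result is from observed label frequencies using a binned calibration error. It is not the framework proposed in the paper; the function names, the ten-bin choice, and the toy data are all assumptions made for this example.

```python
from typing import List, Sequence


def ensemble_mean(member_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average the positive-class probability across ensemble members.

    member_probs[m][i] is member m's probability for sample i.
    """
    n_members = len(member_probs)
    n_samples = len(member_probs[0])
    return [
        sum(member[i] for member in member_probs) / n_members
        for i in range(n_samples)
    ]


def expected_calibration_error(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> float:
    """Weighted mean gap between predicted probability and observed frequency."""
    bins: List[list] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        # Clamp p == 1.0 into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))
    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(p for p, _ in members) / len(members)
        pos_rate = sum(y for _, y in members) / len(members)
        ece += (len(members) / total) * abs(mean_conf - pos_rate)
    return ece


# Hypothetical toy data: three ensemble members, four pixels.
members = [
    [0.9, 0.2, 0.7, 0.1],
    [0.8, 0.3, 0.6, 0.2],
    [1.0, 0.1, 0.8, 0.0],
]
labels = [1, 0, 1, 0]
mean_probs = ensemble_mean(members)
ece = expected_calibration_error(mean_probs, labels)
```

A lower value of `ece` means the averaged probabilities track the observed frequencies more closely. With only four samples the number is purely illustrative.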
Evolving Ensembles with TPOT
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science. Machine learning has become popular in recent years as a solution to various problems such as fraud detection, weather prediction, and improving diagnostic accuracy. One of its goals is to find the model that best explains the problem. Among the several ways to accomplish that, significant attention has been paid to improving accuracy with stacking ensembles, whose objective is to produce a more accurate prediction by combining the predictions of various estimators. Such ensembles have often exhibited superior performance compared with their single-model counterparts. Because choosing the best model for a given problem can be time-consuming, a need to automate the machine learning process has emerged. Different tools allow this, including TPOT, a Python library that uses genetic programming to optimize the machine learning process, evolving randomly created pipelines until the best one is found or a previously fixed maximum number of generations is reached. Genetic programming is a field of machine learning that uses evolutionary algorithms to generate new computer programs, and it has been shown to be successful in quite a few applications. TPOT uses several machine learning algorithms from the scikit-learn Python library. It also features some ensembles, such as Random Forest and AdaBoost. Stacking ensembles are not yet implemented in TPOT, and, considering its current accuracy rates, the objective of this thesis is to implement them. After successfully implementing stacking ensembles in TPOT, we performed experiments with different datasets and observed that, for almost all of them, TPOT has comparable performance to TPOT with stacking ensembles.
We also observed that, when using the light dictionary version of TPOT, the results of the stacking configuration improved for two datasets, since it used weaker learners.
Comparison of standard resampling methods for performance estimation of artificial neural network ensembles
Estimation of the generalization performance for classification within the medical applications domain is always an important task. In this study we focus on artificial neural network ensembles as the machine learning technique. We present a numerical comparison of five common resampling techniques, namely k-fold cross-validation (CV), holdout with three cutoffs, and bootstrap, using five different data sets. The results show that CV together with holdout and are the best resampling strategies for estimating the true performance of ANN ensembles. The bootstrap, using the .632+ rule, is too optimistic, while the holdout underestimates the true performance.
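The three families of resampling schemes named in this abstract differ only in how they divide sample indices into training and test sets. The following is a minimal sketch of that division, assuming a dataset addressed by integer indices. The function names, the `test_fraction` parameter, and the seeds are hypothetical, and the .632+ correction mentioned in the abstract is not implemented here.

```python
import random
from typing import Iterator, List, Tuple

Split = Tuple[List[int], List[int]]  # (train indices, test indices)


def k_fold_splits(n: int, k: int, seed: int = 0) -> Iterator[Split]:
    """Yield k train/test splits; every sample is tested exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    for fold in range(k):
        test = idx[fold::k]  # every k-th shuffled index, offset by fold
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        yield train, test


def holdout_split(n: int, test_fraction: float, seed: int = 0) -> Split:
    """Single random split that reserves test_fraction of the samples."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = int(round(n * test_fraction))
    return idx[n_test:], idx[:n_test]


def bootstrap_split(n: int, seed: int = 0) -> Split:
    """Draw n training indices with replacement; test on the undrawn rest."""
    rng = random.Random(seed)
    train = [rng.randrange(n) for _ in range(n)]
    drawn = set(train)
    test = [i for i in range(n) if i not in drawn]
    return train, test
```

A performance estimate would then train a model on each `train` list and score it on the matching `test` list, averaging over folds or bootstrap repetitions.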
The Future of Human-AI Collaboration: A Taxonomy of Design Knowledge for Hybrid Intelligence Systems
Recent technological advances, especially in the field of machine learning, provide astonishing progress on the road towards artificial general intelligence. However, tasks in current real-world business applications cannot yet be solved by machines alone. We, therefore, identify the need for developing socio-technological ensembles of humans and machines. Such systems possess the ability to accomplish complex goals by combining human and artificial intelligence to collectively achieve superior results and continuously improve by learning from each other. Thus, the need for structured design knowledge for those systems arises. Following a taxonomy development method, this article provides three main contributions: First, we present a structured overview of interdisciplinary research on the role of humans in the machine learning pipeline. Second, we envision hybrid intelligence systems and conceptualize the relevant dimensions for system design for the first time. Finally, we offer useful guidance for system developers during the implementation of such applications.
Simple Regularisation for Uncertainty-Aware Knowledge Distillation
Considering uncertainty estimation of modern neural networks (NNs) is one of
the most important steps towards deploying machine learning systems to
meaningful real-world applications such as in medicine, finance or autonomous
systems. At the moment, ensembles of different NNs constitute the
state-of-the-art in both accuracy and uncertainty estimation in different
tasks. However, ensembles of NNs are impractical under real-world constraints,
since their computation and memory consumption scale linearly with the size of
the ensemble, which increases their latency and deployment cost. In this work,
we examine a simple regularisation approach for distribution-free knowledge
distillation of an ensemble of machine learning models into a single NN. The aim
of the regularisation is to preserve the diversity, accuracy and uncertainty
estimation characteristics of the original ensemble without any intricacies,
such as fine-tuning. We demonstrate the generality of the approach on
combinations of toy data, SVHN/CIFAR-10, simple to complex NN architectures and
different tasks.
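For readers unfamiliar with distilling an ensemble into a single network, the sketch below shows the generic starting point: the student is trained to match the averaged softmax output of the ensemble members, with the mismatch measured by a KL divergence. This is a plain textbook distillation target, not the regulariser examined in the paper; the function names, the `temperature` parameter, and the toy logits are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_target(
    member_logits: Sequence[Sequence[float]], temperature: float = 1.0
) -> List[float]:
    """Mean of the members' softmax outputs for one input."""
    probs = [softmax(logits, temperature) for logits in member_logits]
    n_members = len(probs)
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / n_members for c in range(n_classes)]


def kl_divergence(
    target: Sequence[float], student: Sequence[float], eps: float = 1e-12
) -> float:
    """KL(target || student); eps guards against log(0)."""
    return sum(
        t * math.log((t + eps) / (s + eps))
        for t, s in zip(target, student)
        if t > 0.0
    )


# Hypothetical logits from three ensemble members for one input, three classes.
member_logits = [[2.0, 0.5, -1.0], [1.5, 1.0, -0.5], [2.5, 0.0, -1.5]]
target = ensemble_target(member_logits)
student_probs = softmax([0.0, 0.0, 0.0])  # an untrained, uniform student
loss = kl_divergence(target, student_probs)
```

Matching only the averaged output discards the disagreement between members, which is the kind of diversity the abstract says its regularisation aims to preserve.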