15 research outputs found
Generalized Wasserstein Dice Score, Distributionally Robust Deep Learning, and Ranger for brain tumor segmentation: BraTS 2020 challenge
Training a deep neural network is an optimization problem with four main
ingredients: the design of the deep neural network, the per-sample loss
function, the population loss function, and the optimizer. However, methods
developed to compete in recent BraTS challenges tend to focus only on the
design of deep neural network architectures, while paying less attention to the
three other aspects. In this paper, we experimented with adopting the opposite
approach. We stuck to a generic and state-of-the-art 3D U-Net architecture and
experimented with a non-standard per-sample loss function, the generalized
Wasserstein Dice loss, a non-standard population loss function, corresponding
to distributionally robust optimization, and a non-standard optimizer, Ranger.
Those variations were selected specifically for the problem of multi-class
brain tumor segmentation. The generalized Wasserstein Dice loss is a per-sample
loss function that allows taking advantage of the hierarchical structure of the
tumor regions labeled in BraTS. Distributionally robust optimization is a
generalization of empirical risk minimization that accounts for the presence of
underrepresented subdomains in the training dataset. Ranger is a generalization
of the widely used Adam optimizer that is more stable with small batch size and
noisy labels. We found that each of those variations of the optimization of
deep neural networks for brain tumor segmentation leads to improvements in
terms of Dice scores and Hausdorff distances. With an ensemble of three deep
neural networks trained with various optimization procedures, we achieved
promising results on the validation dataset of the BraTS 2020 challenge. Our
ensemble ranked fourth out of the 693 registered teams for the segmentation
task of the BraTS 2020 challenge.Comment: MICCAI 2020 BrainLes Workshop. Our method ranked fourth out of the
693 registered teams for the segmentation task of the BraTS 2020 challenge.
v2: Added some clarifications following reviewers' feedback (camera-ready
version
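To make the per-sample loss concrete: the generalized Wasserstein Dice loss replaces the all-or-nothing disagreement of the standard Dice loss with a Wasserstein distance between the predicted class probabilities and the ground truth, driven by a distance matrix over the labels. The sketch below is illustrative rather than the paper's exact formulation: the distance matrix is hypothetical, and the Dice-style aggregation is simplified relative to the region-wise aggregation used for BraTS.

```python
import torch

# Hypothetical distance matrix over BraTS-style labels
# (background, edema, non-enhancing core, enhancing core): core labels
# are closer to each other than to edema or background. The matrix
# actually used in the paper may differ.
M = torch.tensor([
    [0.0, 1.0, 1.0, 1.0],
    [1.0, 0.0, 0.6, 0.6],
    [1.0, 0.6, 0.0, 0.5],
    [1.0, 0.6, 0.5, 0.0],
])

def wasserstein_error(probs, labels):
    # probs: (N, C) softmax outputs; labels: (N,) integer ground truth.
    # With one-hot targets, the Wasserstein distance reduces to a dot
    # product between the predicted probabilities and the row of M
    # indexed by the true label.
    return (M[labels] * probs).sum(dim=1)

def generalized_wasserstein_dice_loss(probs, labels, smooth=1e-5):
    # Simplified Dice-style aggregation of the per-voxel errors; the
    # paper aggregates over the hierarchical tumor regions instead.
    err = wasserstein_error(probs, labels)
    tp = (1.0 - err).sum()
    return 1.0 - (2.0 * tp + smooth) / (2.0 * tp + err.sum() + smooth)
```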
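The distributionally robust population loss also has a compact standard form: for a KL-penalized worst case over reweightings of the training samples, the dual objective is a log-mean-exp of the per-sample losses, and letting beta go to zero recovers plain empirical risk minimization. Note that the paper implements DRO via hardness-weighted sampling rather than by differentiating this surrogate directly, so the snippet below sketches the objective, not the training procedure.

```python
import math
import torch

def dro_population_loss(per_sample_losses, beta=1.0):
    # KL-penalized distributionally robust surrogate:
    #   (1 / beta) * log( mean_i exp(beta * L_i) )
    # beta -> 0 recovers the mean (ERM); larger beta up-weights the
    # hardest samples, i.e. underrepresented or difficult subdomains.
    n = per_sample_losses.numel()
    lse = torch.logsumexp(beta * per_sample_losses, dim=0)
    return (lse - math.log(n)) / beta
```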
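Ranger itself combines RAdam with Lookahead (and, in some variants, gradient centralization). Below is a minimal sketch of the Lookahead half only, wrapped around any PyTorch inner optimizer; `model`, `batches`, and `loss_fn` are assumed placeholder objects, not names from the paper.

```python
import torch

def train_with_lookahead(model, inner_opt, batches, loss_fn, k=5, alpha=0.5):
    # Keep a copy of the "slow" weights; every k fast steps, interpolate
    # the slow weights toward the fast ones and restart from there.
    slow = [p.detach().clone() for p in model.parameters()]
    for step, (x, y) in enumerate(batches, start=1):
        inner_opt.zero_grad()
        loss_fn(model(x), y).backward()
        inner_opt.step()
        if step % k == 0:
            with torch.no_grad():
                for s, p in zip(slow, model.parameters()):
                    s += alpha * (p - s)  # slow weights interpolate
                    p.copy_(s)            # fast weights restart from slow
    return model

# e.g. with RAdam as the inner optimizer, roughly Ranger's recipe:
# train_with_lookahead(model, torch.optim.RAdam(model.parameters()),
#                      batches, torch.nn.functional.cross_entropy)
```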
When Do Curricula Work in Federated Learning?
An oft-cited open problem of federated learning is the existence of data
heterogeneity at the clients. One pathway to understanding the drastic accuracy
drop in federated learning is by scrutinizing the behavior of the clients' deep
models on data with different levels of "difficulty", which has been left
unaddressed. In this paper, we investigate a different and rarely studied
dimension of FL: ordered learning. Specifically, we aim to investigate how
ordered learning principles can contribute to alleviating the heterogeneity
effects in FL. We present theoretical analysis and conduct extensive empirical
studies on the efficacy of orderings spanning three kinds of learning:
curriculum, anti-curriculum, and random curriculum. We find that curriculum
learning largely alleviates non-IIDness. Interestingly, the more disparate the
data distributions across clients, the more they benefit from ordered learning.
We provide analysis explaining this phenomenon, specifically indicating how
curriculum training appears to make the objective landscape progressively less
convex, suggesting fast-converging iterations at the beginning of the training
procedure. We derive quantitative results of convergence for both convex and
nonconvex objectives by modeling the curriculum training on federated devices
as local SGD with locally biased stochastic gradients. Also, inspired by
ordered learning, we propose a novel client selection technique that benefits
from the real-world disparity in the clients. Our proposed approach to client
selection has a synergistic effect when applied together with ordered learning in FL.
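As a concrete picture of ordered learning on a single client, one local round might score every sample by its current loss and then visit minibatches from easy to hard (curriculum) or hard to easy (anti-curriculum). This is a minimal sketch under assumed names (`xs`, `ys`, and a `loss_fn` with per-sample reduction), not the paper's exact scoring or pacing scheme.

```python
import torch

def curriculum_local_update(model, optimizer, xs, ys, loss_fn,
                            anti=False, batch_size=32):
    # Score each local sample by its current loss; loss_fn is assumed to
    # return per-sample losses, e.g. CrossEntropyLoss(reduction="none").
    with torch.no_grad():
        difficulty = loss_fn(model(xs), ys)
    order = torch.argsort(difficulty, descending=anti)  # easy -> hard
    for i in range(0, len(order), batch_size):
        idx = order[i:i + batch_size]
        optimizer.zero_grad()
        loss_fn(model(xs[idx]), ys[idx]).mean().backward()
        optimizer.step()
    return model
```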
IST Austria Thesis
Because of the increasing popularity of machine learning methods, it is becoming important to understand the impact of learned components on automated decision-making systems and to guarantee that their consequences are beneficial to society. In other words, it is necessary to ensure that machine learning is sufficiently trustworthy to be used in real-world applications. This thesis studies two properties of machine learning models that are highly desirable for the sake of reliability: robustness and fairness.
In the first part of the thesis we study the robustness of learning algorithms to training data corruption. Previous work has shown that machine learning models are vulnerable to a range of training set issues, varying from label noise through systematic biases to worst-case data manipulations. This problem is especially relevant today, since modern machine learning methods are particularly data-hungry and practitioners therefore often have to rely on data collected from various external sources, e.g. from the Internet, from app users, or via crowdsourcing. Naturally, such sources vary greatly in the quality and reliability of the data they provide. With these considerations in mind, we study the problem of designing machine learning algorithms that are robust to corruptions in data coming from multiple sources. We show that, in contrast to the case of a single dataset with outliers, successful learning within this model is possible both theoretically and practically, even under worst-case data corruptions.
The second part of this thesis deals with fairness-aware machine learning. There are multiple areas where machine learning models have shown promising results, but where careful consideration is required in order to avoid discriminatory decisions being made by such learned components. Ensuring fairness can be particularly challenging, because real-world training datasets are expected to contain various forms of historical bias that may affect the learning process. In this thesis we show that data corruption can indeed render the problem of achieving fairness impossible, by tightly characterizing the theoretical limits of fair learning under worst-case data manipulations. However, assuming access to clean data, we also show how fairness-aware learning can be made practical in contexts beyond binary classification, in particular in the challenging learning-to-rank setting.
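As a flavor of why multiple sources make robust learning tractable, consider aggregating per-source gradient estimates with a coordinate-wise median: a standard robustness primitive (not necessarily the algorithm developed in the thesis) under which a minority of corrupted sources cannot move the update arbitrarily far from the gradient computed on clean data.

```python
import torch

def median_of_sources_gradient(model, sources, loss_fn):
    # One gradient estimate per source, aggregated coordinate-wise by
    # the median instead of the mean; `sources` is an iterable of
    # (features, labels) pairs, one pair per data source.
    per_source = []
    for xs, ys in sources:
        model.zero_grad()
        loss_fn(model(xs), ys).backward()
        per_source.append(torch.cat([p.grad.flatten()
                                     for p in model.parameters()]))
    return torch.stack(per_source).median(dim=0).values
```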