CNN Feature Map Augmentation for Single-Source Domain Generalization
In search of robust and generalizable machine learning models, Domain
Generalization (DG) has gained significant traction during the past few years.
The goal in DG is to produce models which continue to perform well when
presented with data distributions different from the ones available during
training. While deep convolutional neural networks (CNN) have been able to
achieve outstanding performance on downstream computer vision tasks, they still
often fail to generalize on previously unseen data domains. Therefore, in this
work we focus on producing a model which is able to remain robust under data
distribution shift and propose an alternative regularization technique for
convolutional neural network architectures in the single-source DG image
classification setting. To mitigate the problem caused by domain shift between
source and target data, we propose augmenting intermediate feature maps of
CNNs. Specifically, we pass them through a novel Augmentation Layer to prevent
models from overfitting on the training set and improve their cross-domain
generalization. To the best of our knowledge, this is the first paper proposing
such a setup for the DG image classification setting. Experiments on the DG
benchmark datasets of PACS, VLCS, Office-Home and TerraIncognita validate the
effectiveness of our method, in which our model surpasses state-of-the-art
algorithms in most cases.
Comment: In proceedings of IEEE BigDataService 2023 (https://ieeebigdataservice.com/)
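The abstract leaves the exact transformations of the Augmentation Layer unspecified; as a minimal sketch of the general idea, a training-time layer can perturb intermediate feature maps with random per-channel scaling and additive noise (both transformations here are illustrative assumptions, not the paper's actual design):

```python
import numpy as np

def augment_feature_map(fmap, noise_std=0.1, scale_range=(0.9, 1.1), rng=None):
    """Perturb an intermediate CNN feature map of shape (C, H, W) at train time.

    Applies a random per-channel scale and additive Gaussian noise; the exact
    transformations of the paper's Augmentation Layer are assumptions here.
    """
    rng = np.random.default_rng(rng)
    c = fmap.shape[0]
    scales = rng.uniform(*scale_range, size=(c, 1, 1))   # per-channel scaling
    noise = rng.normal(0.0, noise_std, size=fmap.shape)  # additive noise
    return fmap * scales + noise
```

At inference time such a layer acts as the identity, so its train/test behavior differs in the same way as dropout.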
Multi-Scale and Multi-Layer Contrastive Learning for Domain Generalization
During the past decade, deep neural networks have led to fast-paced progress
and significant achievements in computer vision problems, for both academia and
industry. Yet despite their success, state-of-the-art image classification
approaches fail to generalize well in previously unseen visual contexts, as
required by many real-world applications. In this paper, we focus on this
domain generalization (DG) problem and argue that the generalization ability of
deep convolutional neural networks can be improved by taking advantage of
multi-layer and multi-scaled representations of the network. We introduce a
framework that aims at improving domain generalization of image classifiers by
combining both low-level and high-level features at multiple scales, enabling
the network to implicitly disentangle representations in its latent space and
learn domain-invariant attributes of the depicted objects. Additionally, to
further facilitate robust representation learning, we propose a novel objective
function, inspired by contrastive learning, which aims at constraining the
extracted representations to remain invariant under distribution shifts. We
demonstrate the effectiveness of our method by evaluating on the domain
generalization datasets of PACS, VLCS, Office-Home and NICO. Through extensive
experimentation, we show that our model is able to surpass the performance of
previous DG methods and consistently produce competitive and state-of-the-art
results in all datasets.
Comment: Manuscript accepted in IEEE Transactions on Artificial Intelligence (March 2024).
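The contrastive objective described above can be sketched generically: tap features from several layers of the network for two views of the same batch, and sum an InfoNCE-style loss over the tapped layers. The paper's exact objective is not reproduced here; this is a hedged multi-layer InfoNCE sketch:

```python
import numpy as np

def _normalize(x, eps=1e-9):
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def multilayer_contrastive_loss(feats_a, feats_b, temperature=0.5):
    """InfoNCE-style loss averaged over several tapped layers.

    feats_a / feats_b: lists of (batch, dim_l) arrays, one per layer, for two
    views of the same batch; matching batch indices are the positive pairs.
    """
    total = 0.0
    for za, zb in zip(feats_a, feats_b):
        za, zb = _normalize(za), _normalize(zb)
        logits = za @ zb.T / temperature              # pairwise similarities
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
        total += -np.mean(np.diag(log_probs))         # positives on diagonal
    return total / len(feats_a)
```

Combining losses from low- and high-level layers is what lets the objective constrain representations at multiple scales rather than only at the final embedding.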
Towards Domain Generalization for ECG and EEG Classification: Algorithms and Benchmarks
Despite their immense success in numerous fields, machine and deep learning
systems have not yet been able to firmly establish themselves in
mission-critical applications in healthcare. One of the main reasons lies in
the fact that when models are presented with previously unseen,
Out-of-Distribution samples, their performance deteriorates significantly. This
is known as the Domain Generalization (DG) problem. Our objective in this work
is to propose a benchmark for evaluating DG algorithms, in addition to
introducing a novel architecture for tackling DG in biosignal classification.
In this paper, we describe the Domain Generalization problem for biosignals,
focusing on electrocardiograms (ECG) and electroencephalograms (EEG) and
propose and implement an open-source biosignal DG evaluation benchmark.
Furthermore, we adapt state-of-the-art DG algorithms from computer vision to
the problem of 1D biosignal classification and evaluate their effectiveness.
Finally, we also introduce a novel neural network architecture that leverages
multi-layer representations for improved model generalizability. By
implementing the above DG setup we are able to experimentally demonstrate the
presence of the DG problem in ECG and EEG datasets. In addition, our proposed
model demonstrates improved effectiveness compared to the baseline algorithms,
exceeding the state-of-the-art in both datasets. Recognizing the significance
of the distribution shift present in biosignal datasets, the presented
benchmark aims at urging further research into the field of biomedical DG by
simplifying the evaluation process of proposed algorithms. To our knowledge,
this is the first attempt at developing an open-source framework for evaluating
ECG and EEG DG algorithms.
Comment: Accepted in IEEE Transactions on Emerging Topics in Computational Intelligence.
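DG benchmarks of this kind are typically built on a leave-one-domain-out protocol: train on all but one domain (e.g. recording site or dataset), test on the held-out one. Whether the released benchmark uses exactly this split is an assumption; the protocol itself can be sketched as:

```python
def leave_one_domain_out(domains):
    """Yield (train_domains, (test_name, test_data)) splits for DG evaluation.

    `domains` maps a domain name (e.g. an ECG recording site) to its samples.
    Each domain is held out exactly once while the rest form the training set.
    """
    names = sorted(domains)
    for held_out in names:
        train = {n: domains[n] for n in names if n != held_out}
        yield train, (held_out, domains[held_out])
```

Averaging a model's score across all held-out domains gives the single number usually reported for a DG algorithm.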
DALE: Differential Accumulated Local Effects for efficient and accurate global explanations
Accumulated Local Effect (ALE) is a method for accurately estimating feature
effects, overcoming fundamental failure modes of previously existing methods,
such as Partial Dependence Plots. However, ALE's approximation, i.e. the method
for estimating ALE from the limited samples of the training set, faces two
weaknesses. First, it does not scale well in cases where the input has high
dimensionality, and, second, it is vulnerable to out-of-distribution (OOD)
sampling when the training set is relatively small. In this paper, we propose a
novel ALE approximation, called Differential Accumulated Local Effects (DALE),
which can be used in cases where the ML model is differentiable and an
automatic differentiation framework is available. Our proposal has significant
computational advantages, making feature effect estimation applicable to
high-dimensional Machine Learning scenarios with near-zero computational
overhead. Furthermore, DALE does not create artificial points for calculating
the feature effect, resolving misleading estimations due to OOD sampling.
Finally, we formally prove that, under some hypotheses, DALE is an unbiased
estimator of ALE and we present a method for quantifying the standard error of
the explanation. Experiments using both synthetic and real datasets demonstrate
the value of the proposed approach.
Comment: 16 pages, to be published in the Asian Conference on Machine Learning (ACML) 2022.
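The key move in DALE, as the abstract describes it, is to replace finite differences at artificial points with model gradients evaluated at the real training samples: bin the feature, average the gradient per bin, multiply by the bin width, and accumulate. A minimal one-feature sketch under these assumptions (bin layout and centering details are illustrative):

```python
import numpy as np

def dale_1d(grads, x_s, bins=10):
    """Differential ALE for one feature, from gradients at training points.

    grads: df/dx_s evaluated at the real training samples, shape (n,);
    x_s:   the corresponding feature values, shape (n,).
    No artificial (possibly OOD) points are created: per-bin effects are the
    mean gradient times the bin width, accumulated left to right.
    """
    edges = np.linspace(x_s.min(), x_s.max(), bins + 1)
    idx = np.clip(np.digitize(x_s, edges) - 1, 0, bins - 1)
    widths = np.diff(edges)
    effects = np.zeros(bins)
    for b in range(bins):
        in_bin = idx == b
        if in_bin.any():
            effects[b] = grads[in_bin].mean() * widths[b]
    ale = np.concatenate([[0.0], np.cumsum(effects)])  # effect at each edge
    return edges, ale - ale.mean()                     # centered, as in ALE
```

Because the gradients are computed once for all features in a single backward pass, this is where the near-zero per-feature overhead in high dimensions comes from.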
C-XGBoost: A tree boosting model for causal effect estimation
Causal effect estimation aims at estimating the Average Treatment Effect as
well as the Conditional Average Treatment Effect of a treatment on an outcome
from the available data. This knowledge is important in many safety-critical
domains, where it often needs to be extracted from observational data. In this
work, we propose a new causal inference model, named C-XGBoost, for the
prediction of potential outcomes. The motivation of our approach is to exploit
the superiority of tree-based models for handling tabular data together with
the notable property of causal inference neural network-based models to learn
representations that are useful for estimating the outcome for both the
treatment and non-treatment cases. The proposed model also inherits the
considerable advantages of XGBoost, such as efficiently handling features with
missing values with minimal preprocessing effort, and its built-in
regularization techniques for avoiding overfitting and bias. Furthermore,
we propose a new loss function for efficiently training the proposed causal
inference model. The experimental analysis, which is based on the performance
profiles of Dolan and Moré as well as on post-hoc and non-parametric
statistical tests, provides strong evidence of the effectiveness of the
proposed approach.
Comment: This paper has been accepted for presentation at the IFIP
International Conference on Artificial Intelligence Applications and
Innovations.
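The potential-outcome prediction described above can be sketched in the simplest tree-based form: append the treatment indicator to the covariates, fit one boosted model, and read off both potential outcomes by toggling the indicator. Scikit-learn's gradient boosting stands in for XGBoost here, and the paper's custom loss and two-headed design are not reproduced; this is only the shared-representation idea:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def fit_causal_booster(X, t, y, **params):
    """Fit a single boosted model on covariates plus a treatment indicator.

    Stand-in for C-XGBoost: scikit-learn's GradientBoostingRegressor replaces
    XGBoost, and the paper's custom loss is not used.
    """
    model = GradientBoostingRegressor(**params)
    model.fit(np.column_stack([X, t]), y)
    return model

def predict_cate(model, X):
    """CATE = predicted outcome under treatment minus under control."""
    ones, zeros = np.ones(len(X)), np.zeros(len(X))
    y1 = model.predict(np.column_stack([X, ones]))
    y0 = model.predict(np.column_stack([X, zeros]))
    return y1 - y0  # the ATE is the mean of this vector
```

On synthetic data with a known constant effect, the mean of `predict_cate` recovers the true ATE up to estimation noise.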
Which Augmentation Should I Use? An Empirical Investigation of Augmentations for Self-Supervised Phonocardiogram Representation Learning
Despite the recent increase in research activity, deep-learning models have
not yet been widely accepted in several real-world settings, such as medicine.
The shortage of high-quality annotated data often hinders the development of
robust and generalizable models, which do not suffer from degraded
effectiveness when presented with out-of-distribution (OOD) datasets.
Contrastive Self-Supervised Learning (SSL) offers a potential solution to
labeled data scarcity, as it takes advantage of unlabeled data to increase
model effectiveness and robustness. However, the selection of appropriate
transformations during the learning process is not a trivial task and can even
break down the network's ability to extract meaningful information. In
this research, we propose uncovering the optimal augmentations for applying
contrastive learning in 1D phonocardiogram (PCG) classification. We perform an
extensive comparative evaluation of a wide range of audio-based augmentations,
evaluate models on multiple datasets across downstream tasks, and report on the
impact of each augmentation. We demonstrate that depending on its training
distribution, the effectiveness of a fully-supervised model can degrade up to
32%, while SSL models only lose up to 10% or even improve in some cases. We
argue and experimentally demonstrate that contrastive SSL pretraining can
assist in providing robust classifiers which can generalize to unseen, OOD
data, without relying on time- and labor-intensive annotation processes by
medical experts. Furthermore, the proposed evaluation protocol sheds light on
the most promising and appropriate augmentations for robust PCG signal
processing, by calculating their effect size on model training. Finally, we
provide researchers and practitioners with a roadmap towards producing robust
models for PCG classification, in addition to an open-source codebase for
developing novel approaches.
Comment: Preprint; manuscript under review.
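Contrastive pretraining on 1D signals hinges on generating two stochastic views of the same recording. As a hedged sketch, a view pipeline might combine a random time shift, gain change, and additive noise; these are examples of audio augmentations of the kind compared in the study, not the reported optimal set or parameters:

```python
import numpy as np

def random_view(signal, rng, noise_std=0.01, max_shift=0.1, gain_range=(0.8, 1.2)):
    """One stochastic view of a 1D PCG signal for contrastive pretraining.

    Time shift, gain, and additive noise are illustrative choices; the study's
    optimal augmentations and parameters are not reproduced here.
    """
    k = int(max_shift * len(signal))
    x = np.roll(signal, rng.integers(-k, k + 1))          # random time shift
    x = x * rng.uniform(*gain_range)                      # random gain
    return x + rng.normal(0.0, noise_std, size=x.shape)   # additive noise

def contrastive_pair(signal, seed=0):
    rng = np.random.default_rng(seed)
    return random_view(signal, rng), random_view(signal, rng)
```

The two views are then pushed together in embedding space while views of different recordings are pushed apart, as in standard contrastive SSL.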
Sampling Strategies for Mitigating Bias in Face Synthesis Methods
Synthetically generated images can be used to create media content or to
complement datasets for training image analysis models. Several methods have
recently been proposed for the synthesis of high-fidelity face images; however,
the potential biases introduced by such methods have not been sufficiently
addressed. This paper examines the bias introduced by the widely popular
StyleGAN2 generative model trained on the Flickr Faces HQ dataset and proposes
two sampling strategies to balance the representation of selected attributes in
the generated face images. We focus on two protected attributes, gender and
age, and reveal that biases arise in the distribution of randomly sampled
images against very young and very old age groups, as well as against female
faces. These biases are also assessed for different image quality levels based
on the GIQA score. To mitigate bias, we propose two alternative methods for
sampling on selected lines or spheres of the latent space to increase the
number of generated samples from the under-represented classes. The
experimental results show a decrease in bias against underrepresented groups
and a more uniform distribution of the protected features at different levels
of image quality.
Comment: Accepted to the BIAS 2023 ECML-PKDD Workshop.
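Sampling on selected lines of the latent space can be sketched as follows: given a direction assumed to increase an under-represented attribute (e.g. found with a linear probe on labeled latents), draw codes shifted along that line from a center code. The direction, ranges, and probe are assumptions here; the paper's exact line and sphere strategies are not reproduced:

```python
import numpy as np

def sample_along_attribute(z_center, direction, n, lo=1.0, hi=3.0, seed=0):
    """Draw n latent codes shifted along an attribute direction.

    z_center: a latent code, shape (d,); direction: a unit vector in latent
    space assumed to increase the under-represented attribute. Sampling
    offsets in [lo, hi] along this line oversamples the minority class.
    """
    rng = np.random.default_rng(seed)
    alphas = rng.uniform(lo, hi, size=(n, 1))  # how far along the line
    return z_center + alphas * direction
```

Feeding these codes to the generator then yields additional images from the under-represented group, flattening the attribute distribution of the synthetic set.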