A review of Generative Adversarial Networks for Electronic Health Records: applications, evaluation measures and data sources
Electronic Health Records (EHRs) are a valuable asset to facilitate clinical
research and point of care applications; however, many challenges such as data
privacy concerns impede their optimal utilization. Deep generative models,
particularly Generative Adversarial Networks (GANs), show great promise in
generating synthetic EHR data by learning underlying data distributions while
achieving excellent performance and addressing these challenges. This work aims
to review the major developments in various applications of GANs for EHRs and
provides an overview of the proposed methodologies. For this purpose, we
combine perspectives from healthcare applications and machine learning
techniques in terms of source datasets and the fidelity and privacy evaluation
of the generated synthetic datasets. We also compile a list of the metrics and
datasets used by the reviewed works, which can be utilized as benchmarks for
future research in the field. We conclude by discussing challenges in GANs for
EHRs development and proposing recommended practices. We hope that this work
motivates novel research directions at the intersection of
healthcare and machine learning.
Landmarks Augmentation with Manifold-Barycentric Oversampling
The training of Generative Adversarial Networks (GANs) requires a large
amount of data, stimulating the development of new augmentation methods to
alleviate the challenge. Oftentimes, these methods either fail to produce
enough new data or expand the dataset beyond the original manifold. In this
paper, we propose a new augmentation method that is guaranteed to keep the new
data within the original data manifold thanks to optimal transport theory.
The proposed algorithm finds cliques in the nearest-neighbors graph and, at
each sampling iteration, randomly draws one clique to compute the Wasserstein
barycenter with random uniform weights. These barycenters then become the new
natural-looking elements that one could add to the dataset. We apply this
approach to the problem of landmark detection and augment the available
annotation in both unpaired and semi-supervised scenarios. Additionally, the
idea is validated on cardiac data for the task of medical segmentation. Our
approach reduces the overfitting and improves the quality metrics beyond the
original data outcome and beyond the result obtained with popular modern
augmentation methods.
Comment: 11 pages, 4 figures, 3 tables. I.B. and N.B. contributed equally.
D.V.D. is the corresponding author.
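The sampling loop described in the abstract above (find cliques in the nearest-neighbours graph, draw one clique per iteration, compute a Wasserstein barycenter with random weights) can be sketched as follows. This is a minimal illustration and not the authors' implementation. It assumes each sample is a 1-D point set of equal size with uniform mass, in which case the 2-Wasserstein barycenter is the weighted average of the sorted points. All function names here are hypothetical, and the weights are drawn uniformly at random and then normalised, which is only one possible reading of "random uniform weights".

```python
import random


def w2_1d(a, b):
    """2-Wasserstein distance between two equal-size 1-D point sets (uniform mass)."""
    return (sum((x - y) ** 2 for x, y in zip(sorted(a), sorted(b))) / len(a)) ** 0.5


def knn_graph(samples, k, dist):
    """Undirected k-nearest-neighbours graph as an adjacency dict of sets."""
    n = len(samples)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: dist(samples[i], samples[j]))
        for j in order[:k]:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def maximal_cliques(adj):
    """Bron-Kerbosch enumeration of maximal cliques (no pivoting)."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(sorted(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def barycenter_1d(members, weights):
    """Weighted W2 barycenter of equal-size 1-D point sets: average the sorted points."""
    sorted_members = [sorted(m) for m in members]
    return [sum(w * m[j] for w, m in zip(weights, sorted_members))
            for j in range(len(sorted_members[0]))]


def oversample(samples, k, n_new, rng):
    """Draw n_new barycenters, each from one randomly chosen clique of the k-NN graph."""
    adj = knn_graph(samples, k, w2_1d)
    cliques = [c for c in maximal_cliques(adj) if len(c) >= 2]
    new_samples = []
    for _ in range(n_new):
        clique = rng.choice(cliques)
        raw = [rng.random() for _ in clique]       # assumption: uniform draws,
        total = sum(raw)                           # normalised to sum to one
        weights = [r / total for r in raw]
        new_samples.append(barycenter_1d([samples[i] for i in clique], weights))
    return new_samples
```

Because each new sample is a convex combination of mutually close neighbours, it stays between the members of its clique rather than drifting towards distant regions of the dataset. For 2-D landmarks the closed-form 1-D barycenter would have to be replaced by a general optimal transport solver.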
Synthetic Observational Health Data with GANs: from slow adoption to a boom in medical research and ultimately digital twins?
After being collected for patient care, Observational Health Data (OHD) can
further benefit patient well-being by sustaining the development of health
informatics and medical research. Vast potential is unexploited because of the
fiercely private nature of patient-related data and regulations to protect it.
Generative Adversarial Networks (GANs) have recently emerged as a
groundbreaking way to learn generative models that produce realistic synthetic
data. They have revolutionized practices in multiple domains such as
self-driving cars, fraud detection, digital twin simulations in industrial
sectors, and medical imaging.
The digital twin concept could readily apply to modelling and quantifying
disease progression. In addition, GANs possess many capabilities relevant to
common problems in healthcare: lack of data, class imbalance, rare diseases,
and preserving privacy. Unlocking open access to privacy-preserving OHD could
be transformative for scientific research. In the midst of COVID-19, the
healthcare system is facing unprecedented challenges, many of which are
data-related for the reasons stated above.
Considering these facts, publications concerning GANs applied to OHD seemed to
be severely lacking. To uncover the reasons for this slow adoption, we broadly
reviewed the published literature on the subject. Our findings show that the
properties of OHD were initially challenging for the existing GAN algorithms
(unlike medical imaging, for which state-of-the-art models were directly
transferable) and the evaluation of synthetic data lacked clear metrics.
We find more publications on the subject than expected, starting slowly in
2017, and since then at an increasing rate. The difficulties of OHD remain, and
we discuss issues relating to evaluation, consistency, benchmarking, data
modelling, and reproducibility.
Comment: 31 pages (10 in previous version), not including references and
glossary, 51 in total. Inclusion of a large number of recent publications and
expansion of the discussion accordingly.
A survey of generative adversarial networks for synthesizing structured electronic health records
Electronic Health Records (EHRs) are a valuable asset to facilitate clinical research and point of care applications; however, many challenges such as data privacy concerns impede their optimal utilization. Deep generative models, particularly Generative Adversarial Networks (GANs), show great promise in generating synthetic EHR data by learning underlying data distributions while achieving excellent performance and addressing these challenges. This work aims to survey the major developments in various applications of GANs for EHRs and provides an overview of the proposed methodologies. For this purpose, we combine perspectives from healthcare applications and machine learning techniques in terms of source datasets and the fidelity and privacy evaluation of the generated synthetic datasets. We also compile a list of the metrics and datasets used by the reviewed works, which can be utilized as benchmarks for future research in the field. We conclude by discussing challenges in GANs for EHRs development and proposing recommended practices. We hope that this work motivates novel research directions at the intersection of healthcare and machine learning.
Informative sample generation using class aware generative adversarial networks for classification of chest Xrays
Training robust deep learning (DL) systems for disease detection from medical
images is challenging due to limited images covering different disease types
and severity. The problem is especially acute where there is severe class
imbalance. We propose an active learning (AL) framework to select the most
informative samples for training our model using a Bayesian neural network.
Informative samples are then used within a novel class aware generative
adversarial network (CAGAN) to generate realistic chest xray images for data
augmentation by transferring characteristics from one class label to another.
Experiments show our proposed AL framework is able to achieve state-of-the-art
performance by using about of the full dataset, thus saving significant
time and effort over conventional methods.
Open set learning with augmented category by exploiting unlabelled data (open-LACU)
Considering the nature of unlabelled data, it is common for partially
labelled training datasets to contain samples that belong to novel categories.
Although these so-called observed novel categories exist in the training data,
they do not belong to any of the training labels. In contrast, open-sets define
novel categories as those unobserved during training, but present during
testing. This research is the first to generalize between observed and
unobserved novel categories within a new learning policy called open-set
learning with augmented category by exploiting unlabeled data or open-LACU.
This study conducts a high-level review of novelty detection so as to
differentiate between research fields that concern observed novel categories,
and the research fields that concern unobserved novel categories. Open-LACU is
then introduced as a synthesis of the relevant fields to maintain the
advantages of each within a single learning policy. Currently, we are
finalising the first open-LACU network which will be combined with this
pre-print to be sent for publication.
Comment: 11 pages