Privacy Distillation: Reducing Re-identification Risk of Multimodal Diffusion Models
Knowledge distillation in neural networks refers to compressing a large model or dataset into a smaller version of itself. We introduce Privacy Distillation, a framework that allows a text-to-image generative model to teach another model without exposing it to identifiable data. Here, we are interested in the privacy issue faced by a data provider who wishes to share their data via a multimodal generative model. A question that immediately arises is: "How can a data provider ensure that the generative model is not leaking identifiable information about a patient?" Our solution consists of (1) training a first diffusion model on real data; (2) generating a synthetic dataset using this model and filtering it to exclude images with a re-identification risk; and (3) training a second diffusion model on the filtered synthetic data only. We show that datasets sampled from models trained with Privacy Distillation can effectively reduce re-identification risk whilst maintaining downstream performance.
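The filtering in step (2) is the privacy-critical component. As a minimal, hypothetical sketch (the paper's actual re-identification scoring network and threshold are not specified here), one could keep only synthetic images whose nearest real neighbour in an embedding space stays below a similarity threshold:

```python
import numpy as np

def filter_identifiable(synthetic_emb: np.ndarray,
                        real_emb: np.ndarray,
                        threshold: float) -> np.ndarray:
    """Return indices of synthetic samples whose nearest real neighbour
    (by cosine similarity) stays below `threshold`, i.e. the samples kept
    for training the second ("student") diffusion model."""
    s = synthetic_emb / np.linalg.norm(synthetic_emb, axis=1, keepdims=True)
    r = real_emb / np.linalg.norm(real_emb, axis=1, keepdims=True)
    sim = s @ r.T                 # cosine similarity, shape (n_synth, n_real)
    nearest = sim.max(axis=1)     # best real match per synthetic image
    return np.where(nearest < threshold)[0]

# Usage: embeddings would come from a re-identification network;
# random vectors stand in for them here.
rng = np.random.default_rng(0)
keep = filter_identifiable(rng.normal(size=(100, 128)),
                           rng.normal(size=(500, 128)), threshold=0.9)
```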
A 3D generative model of pathological multi-modal MR images and segmentations
Generative modelling and synthetic data can serve as surrogates for real medical imaging datasets, whose scarcity and sharing restrictions hinder the delivery of accurate deep learning models for healthcare applications. In recent
years, there has been an increased interest in using these models for data
augmentation and synthetic data sharing, using architectures such as generative
adversarial networks (GANs) or diffusion models (DMs). Nonetheless, the
application of synthetic data to tasks such as 3D magnetic resonance imaging
(MRI) segmentation remains limited due to the lack of labels associated with
the generated images. Moreover, many of the proposed generative MRI models lack
the ability to generate arbitrary modalities due to the absence of explicit
contrast conditioning. These limitations prevent the user from adjusting the
contrast and content of the images and obtaining more generalisable data for
training task-specific models. In this work, we propose brainSPADE3D, a 3D
generative model for brain MRI and associated segmentations, where the user can
condition on specific pathological phenotypes and contrasts. The proposed joint
imaging-segmentation generative model is shown to generate high-fidelity
synthetic images and associated segmentations, with the ability to combine
pathologies. We demonstrate how the model can alleviate issues with
segmentation model performance when unexpected pathologies are present in the
data.
Comment: Accepted for publication at the 2023 Deep Generative Models (DGM4MICCAI) MICCAI workshop (Vancouver, Canada).
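As an illustration of the kind of conditioning interface the abstract describes (the phenotype and contrast vocabularies below are assumptions, not brainSPADE3D's actual API), combinable pathologies can be encoded as a multi-hot vector alongside a one-hot contrast code:

```python
import torch

# Hypothetical conditioning sketch; the vocabularies and encoding are
# illustrative assumptions, not brainSPADE3D's released code.
PATHOLOGIES = ["healthy", "tumour", "wmh"]   # assumed phenotype vocabulary
CONTRASTS = ["T1", "T2", "FLAIR"]            # assumed MR contrast vocabulary

def condition_vector(pathologies: list[str], contrast: str) -> torch.Tensor:
    """Multi-hot pathology code (so pathologies can be combined) plus a
    one-hot contrast code, concatenated into one conditioning vector."""
    p = torch.zeros(len(PATHOLOGIES))
    for name in pathologies:
        p[PATHOLOGIES.index(name)] = 1.0
    c = torch.zeros(len(CONTRASTS))
    c[CONTRASTS.index(contrast)] = 1.0
    return torch.cat([p, c])

# e.g. a FLAIR volume containing both a tumour and white-matter hyperintensities
cond = condition_vector(["tumour", "wmh"], "FLAIR")
```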
Regional dynamics of the resting brain in amyotrophic lateral sclerosis using fALFF and ReHo analyses
Resting-state functional magnetic resonance imaging (rs-fMRI) has been playing an important role in the study of amyotrophic lateral sclerosis (ALS). Although functional connectivity is widely studied, the patterns of spontaneous neural activity of the resting brain are important mechanisms that have recently been used to study a variety of conditions but remain less explored in ALS. Here we used fractional amplitude of low-frequency fluctuations (fALFF) and regional homogeneity (ReHo) to study the regional dynamics of the resting brain in non-demented ALS patients compared with healthy controls. As expected, we found fALFF and ReHo changes in the sensorimotor network (SMN), but we also found alterations in the default mode (DMN), frontoparietal (FPN), and salience (SN) networks, as well as in the cerebellum, even though no structural differences between ALS patients and controls were found in the regions showing fALFF and ReHo changes. We show an altered pattern of spontaneous low-frequency oscillations that is not confined to the motor areas, revealing a more widespread involvement of non-motor regions, including those responsible for cognition.
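For readers unfamiliar with the two measures, the sketch below shows how fALFF and ReHo are conventionally computed (standard definitions with common defaults such as the 0.01-0.08 Hz band and a tie-free rank formula; this is not the authors' exact pipeline):

```python
import numpy as np

def falff(ts: np.ndarray, tr: float, band=(0.01, 0.08)) -> float:
    """Fractional ALFF: amplitude in the low-frequency band divided by the
    amplitude over the whole frequency range (single voxel time series)."""
    freqs = np.fft.rfftfreq(len(ts), d=tr)
    amp = np.abs(np.fft.rfft(ts - ts.mean()))
    low = (freqs >= band[0]) & (freqs <= band[1])
    return amp[low].sum() / amp[1:].sum()   # skip the DC term

def reho(neigh_ts: np.ndarray) -> float:
    """Regional homogeneity as Kendall's coefficient of concordance across
    a voxel neighbourhood (ties ignored); shape (k_voxels, n_timepoints)."""
    k, n = neigh_ts.shape
    ranks = neigh_ts.argsort(axis=1).argsort(axis=1) + 1  # ranks over time
    col_sums = ranks.sum(axis=0)                          # rank sum per TR
    s = ((col_sums - col_sums.mean()) ** 2).sum()
    return 12.0 * s / (k ** 2 * (n ** 3 - n))

rng = np.random.default_rng(0)
print(falff(rng.normal(size=200), tr=2.0))        # one voxel, TR = 2 s
print(reho(rng.normal(size=(27, 200))))           # 3x3x3 neighbourhood
```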
Federated learning for medical imaging radiology
Federated learning (FL) is gaining wide acceptance across the medical AI domains. FL promises clinically acceptable accuracy, privacy, and generalisability of machine learning models across multiple institutions. However, research on FL for medical imaging AI is still in its early stages. This paper presents a review of recent research to outline the difference between the state of the art [SOTA] (published literature) and the state of the practice [SOTP] (applied research in realistic clinical environments). Furthermore, the review outlines future research directions, considering factors such as data, learning models, system design, governance, and humans in the loop, to translate the SOTA into the SOTP and effectively collaborate across multiple institutions.
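The review is algorithm-agnostic, but the canonical FL aggregation step, Federated Averaging (FedAvg), conveys the core mechanic: institutions share model weights rather than patient data. The sketch below is a toy illustration, not code from the paper:

```python
import numpy as np

def fedavg(client_weights: list[np.ndarray], client_sizes: list[int]) -> np.ndarray:
    """One round of Federated Averaging: the server combines locally trained
    model weights, weighting each institution by its dataset size, so raw
    patient images never leave the hospital."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Toy round with three hospitals holding different amounts of data.
updates = [np.array([1.0, 2.0]), np.array([2.0, 0.0]), np.array([0.0, 4.0])]
global_weights = fedavg(updates, client_sizes=[100, 300, 600])
```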
Can segmentation models be trained with fully synthetically generated data?
In order to achieve good performance and generalisability, medical image
segmentation models should be trained on sizeable datasets with sufficient
variability. Due to ethics and governance restrictions, and the costs
associated with labelling data, scientific development is often stifled, with
models trained and tested on limited data. Data augmentation is often used to
artificially increase the variability in the data distribution and improve
model generalisability. Recent works have explored deep generative models for
image synthesis, as such an approach would enable the generation of an
effectively infinite amount of varied data, addressing the generalisability and
data access problems. However, many proposed solutions limit the user's control
over what is generated. In this work, we propose brainSPADE, a model which
combines a synthetic diffusion-based label generator with a semantic image
generator. Our model can produce fully synthetic brain labels on-demand, with
or without pathology of interest, and then generate a corresponding MRI image
of an arbitrary guided style. Experiments show that brainSPADE synthetic data
can be used to train segmentation models with performance comparable to that of
models trained on real data.
Comment: 12 pages, 2 (+2 App.) figures, 3 tables. Accepted at the Simulation and Synthesis in Medical Imaging workshop (MICCAI 2022).
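To make the two-stage idea concrete, here is a purely illustrative stand-in for the pipeline, with toy generators in place of the diffusion-based label generator and the semantic image generator (every function here is a placeholder, not the brainSPADE code):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_label(with_pathology: bool) -> np.ndarray:
    """Stand-in for the diffusion-based label generator."""
    label = (rng.random((64, 64)) > 0.5).astype(np.int64)   # toy tissue map
    if with_pathology:
        label[20:30, 20:30] = 2                             # lesion class
    return label

def render_image(label: np.ndarray, style: str) -> np.ndarray:
    """Stand-in for the semantic (SPADE-style) image generator."""
    contrast = {"T1": 0.8, "FLAIR": 1.2}[style]             # assumed styles
    return contrast * label + 0.1 * rng.normal(size=label.shape)

def synthetic_batch(n: int, style: str = "FLAIR"):
    """On-demand stream of (image, label) pairs for segmentation training."""
    labels = [sample_label(with_pathology=True) for _ in range(n)]
    return [(render_image(lab, style), lab) for lab in labels]
```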
Papez circuit gray matter and episodic memory in amyotrophic lateral sclerosis and behavioural variant frontotemporal dementia
Amyotrophic lateral sclerosis and behavioural variant frontotemporal dementia are two diseases recognized to overlap in clinical, pathological and genetic characteristics. Both conditions are traditionally known for relative sparing of episodic memory. However, recent studies have disputed this, reporting patients presenting with marked episodic memory impairment. In addition, structural and functional changes in the temporal lobe regions responsible for episodic memory processing are often detected in neuroimaging studies of both conditions. In this study, we investigated the gray matter features associated with the Papez circuit in amyotrophic lateral sclerosis, behavioural variant frontotemporal dementia and healthy controls to further explore similarities and differences between the two conditions. Our non-demented amyotrophic lateral sclerosis patients showed no episodic memory deficits, as measured by a short-term delayed recall test, and no changes in gray matter of the Papez circuit were found. Compared with the amyotrophic lateral sclerosis group, the behavioural variant frontotemporal dementia group had lower performance on the short-term delayed recall test and marked atrophy in gray matter of the Papez circuit. Bilateral atrophy of the entorhinal cortex and mammillary bodies, together with atrophy in the left cingulate, left hippocampus and right parahippocampal gyrus, distinguished behavioural variant frontotemporal dementia from amyotrophic lateral sclerosis patients. Taken together, our results suggest that sub-regions of the Papez circuit may be differently affected in amyotrophic lateral sclerosis and behavioural variant frontotemporal dementia.
Morphology-preserving Autoregressive 3D Generative Modelling of the Brain
Human anatomy, morphology, and associated diseases can be studied using
medical imaging data. However, access to medical imaging data is restricted by
governance and privacy concerns, data ownership, and the cost of acquisition,
thus limiting our ability to understand the human body. A possible solution to
this issue is the creation of a model able to learn and then generate synthetic
images of the human body conditioned on specific characteristics of relevance
(e.g., age, sex, and disease status). Deep generative models, in the form of
neural networks, have been recently used to create synthetic 2D images of
natural scenes. Still, the ability to produce high-resolution 3D volumetric
imaging data with correct anatomical morphology has been hampered by data
scarcity and algorithmic and computational limitations. This work proposes a
generative model that can be scaled to produce anatomically correct,
high-resolution, and realistic images of the human brain, with the necessary
quality to allow further downstream analyses. The ability to generate a
potentially unlimited amount of data not only enables large-scale studies of
human anatomy and pathology without jeopardizing patient privacy, but also
significantly advances research in the field of anomaly detection, modality
synthesis, learning under limited data, and fair and ethical AI. Code and
trained models are available at: https://github.com/AmigoLab/SynthAnatomy.
Comment: 13 pages, 3 figures, 2 tables, accepted at SASHIMI MICCAI 2022.
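The sampling loop of the autoregressive stage can be sketched as follows; the shapes, the dummy model, and the begin-of-sequence convention are illustrative assumptions rather than details of the AmigoLab/SynthAnatomy release:

```python
import torch

@torch.no_grad()
def sample_tokens(model, seq_len: int, temperature: float = 1.0) -> torch.Tensor:
    """Autoregressively sample VQ codebook indices; a pre-trained VQ-VAE
    decoder would map the resulting token grid back to a 3D volume."""
    tokens = torch.zeros(1, 1, dtype=torch.long)      # begin-of-sequence
    for _ in range(seq_len):
        logits = model(tokens)[:, -1, :] / temperature
        nxt = torch.multinomial(torch.softmax(logits, dim=-1), num_samples=1)
        tokens = torch.cat([tokens, nxt], dim=1)
    return tokens[:, 1:]                              # drop the BOS token

class DummyLM(torch.nn.Module):
    """Minimal causal stand-in returning logits of shape (batch, seq, vocab)."""
    def __init__(self, vocab: int = 512, dim: int = 64):
        super().__init__()
        self.emb = torch.nn.Embedding(vocab, dim)
        self.head = torch.nn.Linear(dim, vocab)
    def forward(self, x):
        return self.head(self.emb(x))

codes = sample_tokens(DummyLM(), seq_len=8)   # would be reshaped to a 3D grid
```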
Transformer-based out-of-distribution detection for clinically safe segmentation
In a clinical setting it is essential that deployed image processing systems are robust to the full range of inputs they might encounter and, in particular, do not make confidently wrong predictions. The most popular approach to safe processing is to train networks that can provide a measure of their uncertainty, but these tend to fail for inputs that are far outside the training data distribution. Recently, generative modelling approaches have been proposed as an alternative; these can quantify the likelihood of a data sample explicitly, filtering out any out-of-distribution (OOD) samples before further processing is performed. In this work, we focus on image segmentation and evaluate several approaches to network uncertainty in the far-OOD and near-OOD cases for the task of segmenting haemorrhages in head CTs. We find all of these approaches are unsuitable for safe segmentation as they provide confidently wrong predictions when operating OOD. We propose performing full 3D OOD detection using a VQ-GAN to provide a compressed latent representation of the image and a transformer to estimate the data likelihood. Our approach successfully identifies images in both the far- and near-OOD cases. We find a strong relationship between image likelihood and the quality of a model's segmentation, making this approach viable for filtering images unsuitable for segmentation. To our knowledge, this is the first time transformers have been applied to perform OOD detection on 3D image data.
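Schematically, the proposed filter reduces to thresholding a sequence likelihood; the thresholding rule below is an assumption (e.g. calibrated on held-out in-distribution scans), while the per-token probabilities would come from the transformer over VQ-GAN tokens:

```python
import numpy as np

def sequence_log_likelihood(token_probs: np.ndarray) -> float:
    """Sum of log-probabilities the transformer assigned to each VQ token
    of an image; low values flag likely out-of-distribution inputs."""
    return float(np.log(token_probs).sum())

def ood_filter(all_token_probs: list[np.ndarray], threshold: float) -> list[bool]:
    """True = image deemed in-distribution and safe to pass to segmentation.
    `threshold` would be set from likelihoods of in-distribution scans
    (e.g. a low percentile); the exact rule here is an assumption."""
    return [sequence_log_likelihood(p) >= threshold for p in all_token_probs]
```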
Latent Transformer Models for out-of-distribution detection
Any clinically deployed image-processing pipeline must be robust to the full range of inputs it may be presented with. One popular approach to this challenge is to develop predictive models that can provide a measure of their uncertainty. Another approach is to use generative modelling to quantify the likelihood of inputs: inputs with a low enough likelihood are deemed out-of-distribution and are not presented to the downstream predictive model. In this work, we evaluate several approaches to segmentation with uncertainty for the task of segmenting bleeds in 3D CT of the head. We show that these models can fail catastrophically when operating in the far out-of-distribution domain, often providing predictions that are both highly confident and wrong. We propose to instead perform out-of-distribution detection using the Latent Transformer Model: a VQ-GAN is used to provide a highly compressed latent representation of the input volume, and a transformer is then used to estimate the likelihood of this compressed representation of the input. We demonstrate that this approach can identify images that are both far- and near-out-of-distribution, as well as provide spatial maps that highlight the regions considered to be out-of-distribution. Furthermore, we find a strong relationship between an image's likelihood and the quality of a model's segmentation on it, demonstrating that this approach is viable for filtering out unsuitable images.
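The spatial maps mentioned above can be obtained by keeping the per-token likelihoods instead of summing them into one score; a minimal illustrative version (the latent grid size is assumed):

```python
import numpy as np

def ood_map(token_probs: np.ndarray, latent_shape: tuple[int, ...]) -> np.ndarray:
    """Per-token negative log-likelihood reshaped onto the VQ latent grid;
    upsampled to image resolution, this highlights regions the transformer
    found unlikely. A purely illustrative reconstruction of the idea."""
    nll = -np.log(token_probs)
    return nll.reshape(latent_shape)

# Example: 512 token probabilities arranged on an assumed 8x8x8 latent grid.
rng = np.random.default_rng(1)
heat = ood_map(rng.uniform(0.01, 1.0, size=512), (8, 8, 8))
```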