85 research outputs found
Three-dimensional Bone Image Synthesis with Generative Adversarial Networks
Medical image processing has been highlighted as an area where deep
learning-based models have the greatest potential. However, in the medical
field in particular, problems of data availability and privacy are hampering
research progress and thus rapid implementation in clinical routine. The
generation of synthetic data not only ensures privacy, but also makes it possible to
draw new patients with specific characteristics, enabling the
development of data-driven models on a much larger scale. This work
demonstrates that three-dimensional generative adversarial networks (GANs) can
be efficiently trained to generate high-resolution medical volumes with finely
detailed voxel-based architectures. In addition, GAN inversion is successfully
implemented for the three-dimensional setting and used for extensive research
on model interpretability and applications such as image morphing, attribute
editing and style mixing. The results are comprehensively validated on a
database of three-dimensional HR-pQCT instances representing the bone
micro-architecture of the distal radius. Comment: Submitted to the journal Artificial Intelligence in Medicine
Generating Realistic Counterfactuals for Retinal Fundus and OCT Images using Diffusion Models
Counterfactual reasoning is often used in clinical settings to explain
decisions or weigh alternatives. Therefore, for imaging-based specialties such
as ophthalmology, it would be beneficial to be able to create counterfactual
images, illustrating answers to questions like "If the subject had had diabetic
retinopathy, how would the fundus image have looked?". Here, we demonstrate
that using a diffusion model in combination with an adversarially robust
classifier trained on retinal disease classification tasks enables the
generation of highly realistic counterfactuals of retinal fundus images and
optical coherence tomography (OCT) B-scans. The key to the realism of
counterfactuals is that these classifiers encode salient features indicative
of each disease class and can steer the diffusion model to depict disease
signs or remove disease-related lesions in a realistic way. In a user study,
domain experts also found the counterfactuals generated using our method
significantly more realistic than counterfactuals generated by a previous
method, and even indistinguishable from real images
Conditional Generation of Medical Images via Disentangled Adversarial Inference
Synthetic medical image generation has a huge potential for improving
healthcare through many applications, from data augmentation for training
machine learning systems to preserving patient privacy. Conditional Adversarial
Generative Networks (cGANs) use a conditioning factor to generate images and
have shown great success in recent years. Intuitively, the information in an
image can be divided into two parts: 1) content which is presented through the
conditioning vector and 2) style which is the undiscovered information missing
from the conditioning vector. Current practices in using cGANs for medical
image generation use only a single variable for image generation (i.e.,
content) and therefore provide little flexibility or control over the
generated image. In this work we propose a methodology to learn disentangled
representations of style and content from the image itself, and use this
information to impose control over the generation process. In this framework,
style is learned in a fully unsupervised manner, while content is learned
through both supervised learning (using the conditioning vector) and
unsupervised learning (with the inference mechanism). We apply two novel
regularization steps to ensure content-style disentanglement. First, we
minimize the shared information between content and style by introducing a
novel application of the gradient reverse layer (GRL); second, we introduce a
self-supervised regularization method to further separate information in the
content and style variables. We show that, in general, two-latent-variable
models achieve better performance and give more control over the generated
image. We also show that our proposed model (DRAI) achieves the best
disentanglement score and has the best overall performance. Comment: Published in Medical Image Analysis
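The gradient reversal idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not the DRAI implementation; the class name `GradientReversal` and the `scale` parameter are hypothetical, and the backward pass is written by hand in NumPy rather than with an autodiff framework. The layer leaves activations unchanged in the forward pass and flips the sign of the gradient in the backward pass, so an encoder placed before it is pushed to remove the information that a downstream predictor is trying to extract.

```python
import numpy as np


class GradientReversal:
    """Illustrative gradient reversal layer (hypothetical helper, not from the paper).

    Forward pass: identity.
    Backward pass: multiplies the incoming gradient by -scale.
    """

    def __init__(self, scale=1.0):
        # scale is an assumed hyperparameter controlling the reversal strength
        self.scale = scale

    def forward(self, x):
        # Activations pass through unchanged
        return np.asarray(x, dtype=float)

    def backward(self, grad_output):
        # The gradient flowing back to earlier layers is negated and scaled
        return -self.scale * np.asarray(grad_output, dtype=float)
```

In an adversarial disentanglement setup of this kind, the reversed gradient would typically come from a predictor that tries to recover one latent variable from the other; how DRAI wires this in is not specified in the abstract.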
A Recycling Training Strategy for Medical Image Segmentation with Diffusion Denoising Models
Denoising diffusion models have found applications in image segmentation by
generating segmented masks conditioned on images. Existing studies
predominantly focus on adjusting model architecture or improving inference,
such as test-time sampling strategies. In this work, we focus on improving the
training strategy and propose a novel recycling method. During each training
step, a segmentation mask is first predicted given an image and random noise.
This predicted mask, which replaces the conventional ground truth mask, is used
for the denoising task during training. This approach can be interpreted as
aligning the training strategy with inference by eliminating the dependence on
ground truth masks for generating noisy samples. Our proposed method
significantly outperforms standard diffusion training, self-conditioning, and
existing recycling strategies across multiple medical imaging data sets: muscle
ultrasound, abdominal CT, prostate MR, and brain MR. This holds for two widely
adopted sampling strategies: denoising diffusion probabilistic model and
denoising diffusion implicit model. Importantly, existing diffusion models
often display a declining or unstable performance during inference, whereas our
novel recycling consistently enhances or maintains performance. We show that,
under a fair comparison with the same network architectures and computing
budget, the proposed recycling-based diffusion models achieved on-par
performance with non-diffusion-based supervised training. By ensembling the
proposed diffusion and the non-diffusion models, significant improvements to
the non-diffusion models have been observed across all applications,
demonstrating the value of this novel training method. This paper summarizes
these quantitative results and discusses their value, with a fully
reproducible JAX-based implementation released at
https://github.com/mathpluscode/ImgX-DiffSeg. Comment: Accepted for publication at the Journal of Machine Learning for
Biomedical Imaging (MELBA) https://melba-journal.org/2023:01
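The recycling step described in the abstract above can be sketched roughly as follows. This is an illustration under stated assumptions, not the ImgX-DiffSeg code: the function names `q_sample` and `recycling_inputs`, the `model(image, x, t)` call signature, and the use of a cumulative-alpha noise schedule are all hypothetical choices made for the example. The sketch only shows how the noisy training input is built from a predicted mask instead of the ground truth mask; the loss and the gradient handling of the first prediction are left out because the abstract does not specify them.

```python
import numpy as np


def q_sample(x0, t, alphas_cumprod, noise):
    """Standard forward diffusion: mix a clean sample x0 with Gaussian noise at step t."""
    a = alphas_cumprod[t]
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * noise


def recycling_inputs(model, image, alphas_cumprod, t, rng):
    """Build the noisy input for one training step using a predicted mask.

    model is an assumed callable model(image, x, t) returning a mask
    with the same shape as image.
    """
    last_step = len(alphas_cumprod) - 1
    # Step 1: predict a mask from the image and pure random noise
    pure_noise = rng.standard_normal(image.shape)
    predicted_mask = model(image, pure_noise, last_step)
    # Step 2: noise the predicted mask (not the ground truth) to step t
    noise = rng.standard_normal(image.shape)
    x_t = q_sample(predicted_mask, t, alphas_cumprod, noise)
    return x_t, predicted_mask
```

The returned `x_t` would then be fed to the denoising network in place of a noised ground truth mask, which is what aligns training with inference, where no ground truth is available.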
Data-centric Design and Training of Deep Neural Networks with Multiple Data Modalities for Vision-based Perception Systems
224 p. Advances in computer vision and machine learning have revolutionized the ability to build systems that process and interpret digital data, allowing them to mimic human perception and paving the way for a wide range of applications. In recent years, both disciplines have made significant progress, driven by advances in deep learning techniques. Deep learning is a discipline that uses deep neural networks (DNNs) to teach machines to recognize patterns and make predictions based on data. Perception systems based on deep learning are increasingly common in diverse fields where humans and machines collaborate to combine their strengths. These fields include automotive, industry, and medicine, where improving safety, supporting diagnosis, and automating repetitive tasks are some of the goals pursued. However, data are one of the key factors behind the success of deep learning algorithms. Data dependence strongly limits the creation and success of new DNNs. The availability of quality data to solve a specific problem is essential but difficult to obtain, even impracticable, in most developments. Data-centric artificial intelligence emphasizes the importance of using high-quality data that effectively convey what a model must learn. Motivated by these challenges and the need for data, this thesis formulates and validates five hypotheses on the acquisition and impact of data in the design and training of DNNs. Specifically, we investigate and propose different methodologies for obtaining suitable data to train DNNs in problems with limited access to large-scale data sources.
We explore two possible solutions for obtaining training data, based on synthetic data generation. First, we investigate the generation of synthetic data using 3D graphics and the impact of different design choices on the accuracy of the resulting DNNs. In addition, we propose a methodology to automate the data generation process and produce varied annotated data by replicating a custom 3D environment from an input configuration file. Second, we propose a generative neural network (GAN) that generates annotated images using limited annotated datasets and unannotated data captured in uncontrolled environments
A cortical model of object perception based on Bayesian networks and belief propagation.
Evidence suggests that high-level feedback plays an important role in visual perception by shaping
the response in lower cortical levels (Sillito et al. 2006, Angelucci and Bullier 2003, Bullier
2001, Harrison et al. 2007). A notable example of this is reflected by the retinotopic activation
of V1 and V2 neurons in response to illusory contours, such as Kanizsa figures, which has been
reported in numerous studies (Maertens et al. 2008, Seghier and Vuilleumier 2006, Halgren et al.
2003, Lee 2003, Lee and Nguyen 2001). The illusory contour activity emerges first in lateral
occipital cortex (LOC), then in V2 and finally in V1, strongly suggesting that the response is
driven by feedback connections. Generative models and Bayesian belief propagation have been
suggested to provide a theoretical framework that can account for feedback connectivity, explain
psychophysical and physiological results, and map well onto the hierarchical distributed
cortical connectivity (Friston and Kiebel 2009, Dayan et al. 1995, Knill and Richards 1996,
Geisler and Kersten 2002, Yuille and Kersten 2006, Deneve 2008a, George and Hawkins 2009,
Lee and Mumford 2003, Rao 2006, Litvak and Ullman 2009, Steimer et al. 2009).
The present study explores the role of feedback in object perception, taking as a starting point
the HMAX model, a biologically inspired hierarchical model of object recognition (Riesenhuber
and Poggio 1999, Serre et al. 2007b), and extending it to include feedback connectivity.
A Bayesian network that captures the structure and properties of the HMAX model is
developed, replacing the classical deterministic view with a probabilistic interpretation. The
proposed model approximates the selectivity and invariance operations of the HMAX model
using the belief propagation algorithm. Hence, the model not only achieves successful feedforward
recognition invariant to position and size, but is also able to reproduce modulatory effects
of higher-level feedback, such as illusory contour completion, attention and mental imagery.
Overall, the model provides a biophysiologically plausible interpretation, based on state-of-the-art
probabilistic approaches and supported by current experimental evidence, of the interaction
between top-down global feedback and bottom-up local evidence in the context of hierarchical
object perception
Embodiment and Grammatical Structure: An Approach to the Relation of Experience, Assertion and Truth
In this thesis I address a concern in both existential phenomenology and embodied cognition, namely, the question of how ‘higher’ cognitive abilities such as language and judgements of truth relate to embodied experience. I suggest that although our words are grounded in experience, what makes this grounding and our higher abilities possible is grammatical structure.
The opening chapter contrasts the ‘situated’ approach of embodied cognition and existential phenomenology with Cartesian methodological solipsism. The latter produces a series of dualisms, including that of language and meaning, whereas the former dissolves such dualisms. The second chapter adapts Merleau-Ponty’s arguments against the perceptual constancy hypothesis in order to undermine the dualism of grammar and meaning. This raises the question of what grammar is, which is addressed in the third chapter. I acknowledge the force of Chomsky’s observation that language is structure dependent and briefly introduce a minimal grammatical operation which might be the ‘spark which lit the intellectual forest fire’ (Clark: 2001, 151).
Grammatical relations are argued to make possible the grounding of our symbols in chapters 4 and 5, which attempt to ground the categories of determiner and aspect in spatial deixis and embodied motor processes respectively. Chapter 6 ties the previous three together, arguing that we may understand a given lexeme as an object or as an event by subsuming it within a determiner phrase or aspectualising it respectively. I suggest that such modification of a word’s meaning is possible because determiners and aspect schematise, i.e. determine the temporal structure, of the lexeme. Chapter 7 uses this account to take up Heidegger’s claim that the relation between being and truth be cast in terms of temporality (2006, H349), though falls short of providing a complete account of the ‘origin of truth’. Chapter 8 concludes and notes further avenues of research
Cassirer and structuralism of perception: an application of group theory to Gestalt psychology
Ernst Cassirer's task was to set up an account of perception as objective judgement. We can trace Cassirer's view of perception through three different accounts, each of which aimed to answer how perceptual judgements are possible. In the first account (1900-1923), he presented his view on the basis of a functional-relational analysis of perceptual experience. In the second account (1923-1933), he based his view of perception on a symbolic analysis of perceptual experience. In the third account (1933-1945), the analysis of perceptual phenomena rests on his apprehension of group theory. Cassirer's main aim in the third account was to show that geometry and perception are similar in the ways both disciplines build up their objects. Because the two share the same logical base, Cassirer claimed that the geometrical determination of the object is similar to the perceptual determination of the experienced object. For Cassirer, this similarity is what allows an application of "group theory" to perception. As a result of that claim, Cassirer shifted mathematical terms such as "invariance", "frame of reference" and "transformation" from the province of geometry and reused them in the field of perception to set up what he called the psychology of thought. This thesis discusses Cassirer's first two accounts and focuses on the third by giving examples of how the mathematical concept of "group" can be used as an analogy to provide an intrinsic explanation of the nature of the objects, and their characteristics, that one experiences in the perceptual situation. The explanations of the perceptual phenomena represented in perceptual experience, as given by Cassirer on the basis of Gestalt psychology, reflect this understanding.
The ample examples created by the Gestalt psychologists and used by Cassirer indicate how both understood the object of perceptual experience as constructed and not as a thing or hic et nunc. I will show that in these three accounts there are non-physical elements, defined here as structural elements, involved in perceptual experience. By virtue of these non-physical elements, perceptual judgements are possible. Cassirer and the Gestalt psychologists emphasized that these structural elements are presupposed in every perceptual experience, and this understanding leads to the claim that both Cassirer and the Gestaltists presupposed the constructive unity of mind based on a transcendental analysis of the nature of mind and its cognitive processes