85 research outputs found

    Three-dimensional Bone Image Synthesis with Generative Adversarial Networks

    Medical image processing has been highlighted as an area where deep learning-based models have the greatest potential. However, in the medical field in particular, problems of data availability and privacy are hampering research progress and thus rapid implementation in clinical routine. The generation of synthetic data not only ensures privacy, but also makes it possible to draw new patients with specific characteristics, enabling the development of data-driven models on a much larger scale. This work demonstrates that three-dimensional generative adversarial networks (GANs) can be efficiently trained to generate high-resolution medical volumes with finely detailed voxel-based architectures. In addition, GAN inversion is successfully implemented for the three-dimensional setting and used for extensive research on model interpretability and applications such as image morphing, attribute editing and style mixing. The results are comprehensively validated on a database of three-dimensional HR-pQCT instances representing the bone micro-architecture of the distal radius. Comment: Submitted to the journal Artificial Intelligence in Medicine.

    Generating Realistic Counterfactuals for Retinal Fundus and OCT Images using Diffusion Models

    Counterfactual reasoning is often used in clinical settings to explain decisions or weigh alternatives. Therefore, for imaging-based specialties such as ophthalmology, it would be beneficial to be able to create counterfactual images, illustrating answers to questions like "If the subject had had diabetic retinopathy, how would the fundus image have looked?". Here, we demonstrate that using a diffusion model in combination with an adversarially robust classifier trained on retinal disease classification tasks enables the generation of highly realistic counterfactuals of retinal fundus images and optical coherence tomography (OCT) B-scans. The key to the realism of counterfactuals is that these classifiers encode salient features indicative of each disease class and can steer the diffusion model to depict disease signs or remove disease-related lesions in a realistic way. In a user study, domain experts also found the counterfactuals generated using our method significantly more realistic than counterfactuals generated from a previous method, and even indistinguishable from real images.

    Conditional Generation of Medical Images via Disentangled Adversarial Inference

    Synthetic medical image generation has huge potential for improving healthcare through many applications, from data augmentation for training machine learning systems to preserving patient privacy. Conditional Adversarial Generative Networks (cGANs) use a conditioning factor to generate images and have shown great success in recent years. Intuitively, the information in an image can be divided into two parts: 1) content, which is presented through the conditioning vector, and 2) style, which is the undiscovered information missing from the conditioning vector. Current practices in using cGANs for medical image generation use only a single variable for image generation (i.e., content) and therefore provide little flexibility or control over the generated image. In this work we propose a methodology to learn disentangled representations of style and content from the image itself, and to use this information to impose control over the generation process. In this framework, style is learned in a fully unsupervised manner, while content is learned through both supervised learning (using the conditioning vector) and unsupervised learning (with the inference mechanism). We apply two novel regularization steps to ensure content-style disentanglement. First, we minimize the shared information between content and style by introducing a novel application of the gradient reverse layer (GRL); second, we introduce a self-supervised regularization method to further separate information in the content and style variables. We show that, in general, models with two latent variables achieve better performance and give more control over the generated image. We also show that our proposed model (DRAI) achieves the best disentanglement score and has the best overall performance. Comment: Published in Medical Image Analysis.
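
    The gradient reverse layer mentioned in the abstract above can be illustrated with a minimal sketch. This is not the DRAI implementation: the class `GradientReversal`, the scale `lam`, and the toy linear content predictor are all hypothetical names introduced here. The sketch only shows the general idea, assuming a layer that acts as the identity in the forward pass and flips the sign of the gradient in the backward pass, so that an upstream style encoder is pushed to make content harder to predict from the style code.

```python
class GradientReversal:
    """Identity in the forward pass; multiplies the gradient by -lam in the backward pass."""

    def __init__(self, lam=1.0):
        self.lam = lam  # hypothetical scaling factor for the reversed gradient

    def forward(self, x):
        # The style code passes through unchanged.
        return list(x)

    def backward(self, grad_output):
        # The gradient flowing back to the style encoder has its sign flipped.
        return [-self.lam * g for g in grad_output]


def predictor_grad(style, weights, target):
    """Gradient of 0.5 * (w . s - target)^2 with respect to the style code s."""
    score = sum(w * s for w, s in zip(weights, style))
    return [(score - target) * w for w in weights]


grl = GradientReversal(lam=1.0)
style_code = [1.0, 2.0]
passed = grl.forward(style_code)
# Gradient that would make the (toy) content predictor better.
grad_for_predictor = predictor_grad(passed, [0.5, -1.0], target=1.0)
# Gradient the style encoder actually receives: the opposite direction.
grad_for_encoder = grl.backward(grad_for_predictor)
```

    In an automatic differentiation framework the same behaviour would normally be expressed as a custom gradient rule rather than explicit `forward` and `backward` methods.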

    A Recycling Training Strategy for Medical Image Segmentation with Diffusion Denoising Models

    Denoising diffusion models have found applications in image segmentation by generating segmented masks conditioned on images. Existing studies predominantly focus on adjusting model architecture or improving inference, such as test-time sampling strategies. In this work, we focus on improving the training strategy and propose a novel recycling method. During each training step, a segmentation mask is first predicted given an image and random noise. This predicted mask, which replaces the conventional ground truth mask, is used for the denoising task during training. This approach can be interpreted as aligning the training strategy with inference by eliminating the dependence on ground truth masks for generating noisy samples. Our proposed method significantly outperforms standard diffusion training, self-conditioning, and existing recycling strategies across multiple medical imaging data sets: muscle ultrasound, abdominal CT, prostate MR, and brain MR. This holds for two widely adopted sampling strategies: denoising diffusion probabilistic model and denoising diffusion implicit model. Importantly, existing diffusion models often display declining or unstable performance during inference, whereas our novel recycling consistently enhances or maintains performance. We show that, under a fair comparison with the same network architectures and computing budget, the proposed recycling-based diffusion models achieved on-par performance with non-diffusion-based supervised training. By ensembling the proposed diffusion and the non-diffusion models, significant improvements to the non-diffusion models have been observed across all applications, demonstrating the value of this novel training method. This paper summarizes these quantitative results and discusses their value, with a fully reproducible JAX-based implementation, released at https://github.com/mathpluscode/ImgX-DiffSeg. Comment: Accepted for publication at the Journal of Machine Learning for Biomedical Imaging (MELBA) https://melba-journal.org/2023:01
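
    The difference between standard diffusion training and the recycling idea described above can be sketched as follows. This is a toy illustration, not the ImgX-DiffSeg code: `toy_model`, `add_noise`, and the two `*_step_input` helpers are hypothetical, masks are flat lists of floats, and the noising formula x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps is the usual forward diffusion step, assumed here rather than taken from the abstract. The only point shown is which mask gets noised: the ground truth in standard training, a model prediction from pure noise in recycling.

```python
import math
import random


def add_noise(x0, alpha_bar, rng):
    """Forward diffusion step: sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps."""
    a = math.sqrt(alpha_bar)
    b = math.sqrt(1.0 - alpha_bar)
    return [a * v + b * rng.gauss(0.0, 1.0) for v in x0]


def toy_model(image, noisy_mask):
    """Hypothetical stand-in for the segmentation network; returns values in [0, 1]."""
    return [min(1.0, max(0.0, 0.5 * i + 0.5 * m)) for i, m in zip(image, noisy_mask)]


def standard_step_input(gt_mask, alpha_bar, rng):
    """Standard training: the noisy sample is built from the ground truth mask."""
    return add_noise(gt_mask, alpha_bar, rng)


def recycling_step_input(image, alpha_bar, rng, model=toy_model):
    """Recycling (sketch): predict a mask from the image and pure noise, then noise that."""
    pure_noise = [rng.gauss(0.0, 1.0) for _ in image]
    predicted_mask = model(image, pure_noise)  # assumed to be treated as a constant
    return add_noise(predicted_mask, alpha_bar, rng)


rng = random.Random(0)
image = [0.2, 0.8, 0.5]
gt_mask = [0.0, 1.0, 1.0]
x_t_standard = standard_step_input(gt_mask, 0.5, rng)
x_t_recycled = recycling_step_input(image, 0.5, rng)
```

    In either case the network would then be asked to denoise the resulting sample; that the loss is still computed against the ground truth mask is an assumption of this sketch.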

    Data-centric Design and Training of Deep Neural Networks with Multiple Data Modalities for Vision-based Perception Systems

    224 p. Advances in computer vision and machine learning have revolutionized the ability to build systems that process and interpret digital data, enabling them to mimic human perception and opening the way to a wide range of applications. In recent years, both disciplines have made significant advances, driven by progress in deep learning techniques. Deep learning is a discipline that uses deep neural networks (DNNs) to teach machines to recognize patterns and make predictions based on data. Perception systems based on deep learning are increasingly common in diverse fields where humans and machines collaborate to combine their strengths. These fields include automotive, industry, and medicine, where improving safety, supporting diagnosis, and automating repetitive tasks are among the goals pursued. However, data are one of the key factors behind the success of deep learning algorithms. Data dependence strongly limits the creation and success of new DNNs. The availability of quality data to solve a specific problem is essential but difficult to obtain, even impracticable, in most developments. Data-centric artificial intelligence emphasizes the importance of using high-quality data that effectively convey what a model must learn. Motivated by the challenges and the need for data, this thesis formulates and validates five hypotheses on the acquisition and impact of data in the design and training of DNNs. Specifically, we investigate and propose different methodologies for obtaining suitable data to train DNNs in problems with limited access to large-scale data sources. We explore two possible solutions for obtaining training data, both based on synthetic data generation. First, we investigate the generation of synthetic data using 3D graphics and the impact of different design choices on the accuracy of the resulting DNNs. In addition, we propose a methodology to automate the data generation process and produce varied annotated data by replicating a custom 3D environment from an input configuration file. Second, we propose a generative neural network (GAN) that generates annotated images using limited annotated datasets and unannotated data captured in uncontrolled environments.

    A cortical model of object perception based on Bayesian networks and belief propagation.

    Evidence suggests that high-level feedback plays an important role in visual perception by shaping the response in lower cortical levels (Sillito et al. 2006, Angelucci and Bullier 2003, Bullier 2001, Harrison et al. 2007). A notable example of this is reflected by the retinotopic activation of V1 and V2 neurons in response to illusory contours, such as Kanizsa figures, which has been reported in numerous studies (Maertens et al. 2008, Seghier and Vuilleumier 2006, Halgren et al. 2003, Lee 2003, Lee and Nguyen 2001). The illusory contour activity emerges first in lateral occipital cortex (LOC), then in V2 and finally in V1, strongly suggesting that the response is driven by feedback connections. Generative models and Bayesian belief propagation have been suggested to provide a theoretical framework that can account for feedback connectivity, explain psychophysical and physiological results, and map well onto the hierarchical distributed cortical connectivity (Friston and Kiebel 2009, Dayan et al. 1995, Knill and Richards 1996, Geisler and Kersten 2002, Yuille and Kersten 2006, Deneve 2008a, George and Hawkins 2009, Lee and Mumford 2003, Rao 2006, Litvak and Ullman 2009, Steimer et al. 2009). The present study explores the role of feedback in object perception, taking as a starting point the HMAX model, a biologically inspired hierarchical model of object recognition (Riesenhuber and Poggio 1999, Serre et al. 2007b), and extending it to include feedback connectivity. A Bayesian network that captures the structure and properties of the HMAX model is developed, replacing the classical deterministic view with a probabilistic interpretation. The proposed model approximates the selectivity and invariance operations of the HMAX model using the belief propagation algorithm. Hence, the model not only achieves successful feedforward recognition invariant to position and size, but is also able to reproduce modulatory effects of higher-level feedback, such as illusory contour completion, attention and mental imagery. Overall, the model provides a biophysiologically plausible interpretation, based on state-of-the-art probabilistic approaches and supported by current experimental evidence, of the interaction between top-down global feedback and bottom-up local evidence in the context of hierarchical object perception.
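
    How belief propagation combines top-down feedback with bottom-up evidence can be shown on a toy two-node Bayesian network. This sketch is not the thesis's HMAX-based model: the nodes (an object O and a feature F), the conditional probability table, and all numbers are hypothetical. It assumes only the standard rule that a node's belief is proportional to the product of the message from its parent and the message from the evidence below it.

```python
def normalize(values):
    """Scale non-negative values so that they sum to one."""
    total = sum(values)
    return [v / total for v in values]


def top_down_message(prior_o, cpt):
    """pi(F=f) = sum over o of P(F=f | O=o) * P(O=o)."""
    n_f = len(cpt[0])
    return [sum(prior_o[o] * cpt[o][f] for o in range(len(prior_o))) for f in range(n_f)]


def bottom_up_message(lam_f, cpt):
    """lambda(O=o) = sum over f of P(F=f | O=o) * lambda(F=f)."""
    return [sum(cpt[o][f] * lam_f[f] for f in range(len(lam_f))) for o in range(len(cpt))]


def beliefs(prior_o, cpt, lam_f):
    """Posterior beliefs for O and F given a prior on O and evidence on F."""
    pi_f = top_down_message(prior_o, cpt)
    belief_f = normalize([p * l for p, l in zip(pi_f, lam_f)])
    lam_o = bottom_up_message(lam_f, cpt)
    belief_o = normalize([p * l for p, l in zip(prior_o, lam_o)])
    return belief_o, belief_f


# Hypothetical numbers: O = (square present, absent), F = (contour, no contour).
cpt = [[0.9, 0.1],   # P(F | square present)
       [0.2, 0.8]]   # P(F | square absent)
ambiguous_evidence = [0.5, 0.5]  # the image itself does not favour either state of F

_, contour_with_feedback = beliefs([0.9, 0.1], cpt, ambiguous_evidence)
_, contour_flat_prior = beliefs([0.5, 0.5], cpt, ambiguous_evidence)
```

    With ambiguous local evidence, the belief in a contour is higher under the strong object prior than under the flat prior, which is the kind of feedback-driven completion effect the abstract describes for illusory contours.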

    Embodiment and Grammatical Structure: An Approach to the Relation of Experience, Assertion and Truth

    In this thesis I address a concern in both existential phenomenology and embodied cognition, namely, the question of how ‘higher’ cognitive abilities such as language and judgements of truth relate to embodied experience. I suggest that although our words are grounded in experience, what makes this grounding and our higher abilities possible is grammatical structure. The opening chapter contrasts the ‘situated’ approach of embodied cognition and existential phenomenology with Cartesian methodological solipsism. The latter produces a series of dualisms, including that of language and meaning, whereas the former dissolves such dualisms. The second chapter adapts Merleau-Ponty’s arguments against the perceptual constancy hypothesis in order to undermine the dualism of grammar and meaning. This raises the question of what grammar is, which is addressed in the third chapter. I acknowledge the force of Chomsky’s observation that language is structure dependent and briefly introduce a minimal grammatical operation which might be the ‘spark which lit the intellectual forest fire’ (Clark: 2001, 151). Grammatical relations are argued to make possible the grounding of our symbols in chapters 4 and 5, which attempt to ground the categories of determiner and aspect in spatial deixis and embodied motor processes respectively. Chapter 6 ties the previous three together, arguing that we may understand a given lexeme as an object or as an event by subsuming it within a determiner phrase or aspectualising it respectively. I suggest that such modification of a word’s meaning is possible because determiners and aspect schematise the lexeme, i.e. determine its temporal structure. Chapter 7 uses this account to take up Heidegger’s claim that the relation between being and truth be cast in terms of temporality (2006, H349), though it falls short of providing a complete account of the ‘origin of truth’. Chapter 8 concludes and notes further avenues of research.

    Cassirer and structuralism of perception: an application of group theory to Gestalt psychology

    Ernst Cassirer's task was to set up an account of perception as objective judgement. We can trace Cassirer's view of perception through three different accounts, each of which aimed to answer how perceptual judgements are possible. In the first account (1900-1923), he presented his view through a functional-relational analysis of perceptual experience. In the second account (1923-1933), he presented his view of perception through a symbolic analysis of perceptual experience. In the third account (1933-1945), the analysis of perceptual phenomena rests on his understanding of group theory. Cassirer's main aim in the third account was to show that geometry and perception are similar in the way each discipline builds up its objects. Since both have the same logical base, Cassirer claimed that the geometrical determination of the object is similar to the perceptual determination of the experienced object. For Cassirer, this similarity is what allows an application of "group theory" to perception. As a result of that claim, Cassirer took mathematical terms such as "invariance", "frame of reference" and "transformation" from the province of geometry and reused them in the field of perception to set up what he called the psychology of thought. This thesis discusses Cassirer's first two accounts and focuses on the third by giving examples of how the mathematical concept of "group" can be used as an analogy to provide an intrinsic explanation of the nature of the objects, and their characteristics, that one experiences in the perceptual situation. Cassirer's explanations of the perceptual phenomena represented in perceptual experience, which are based on Gestalt psychology, reflect this understanding. The many examples created by the Gestalt psychologists and used by Cassirer indicate how both understood the object of perceptual experience as constructed and not as a thing or hic et nunc. I will show that these three accounts involve non-physical elements in perceptual experience, defined here as structural elements. By virtue of these non-physical elements, perceptual judgements are possible. Cassirer and the Gestalt psychologists emphasized that these structural elements are presupposed in every perceptual experience, and this understanding leads to the claim that both Cassirer and the Gestaltists presupposed the constructive unity of mind based on a transcendental analysis of the nature of mind and its cognitive processes.