354 research outputs found

    A Survey on Deep Learning in Medical Image Analysis

    Deep learning algorithms, in particular convolutional networks, have rapidly become a methodology of choice for analyzing medical images. This paper reviews the major deep learning concepts pertinent to medical image analysis and summarizes over 300 contributions to the field, most of which appeared in the last year. We survey the use of deep learning for image classification, object detection, segmentation, registration, and other tasks and provide concise overviews of studies per application area. Open challenges and directions for future research are discussed. Comment: Revised survey includes expanded discussion section and reworked introductory section on common deep architectures. Added missed papers from before Feb 1st 201
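
    As a concrete anchor for the convolutional networks surveyed above, the following is a minimal, purely illustrative PyTorch sketch of a 2D CNN for slice-level image classification; the architecture, input size and two-class setup are assumptions made here for illustration, not taken from any surveyed paper.

        import torch
        import torch.nn as nn

        class TinyMedNet(nn.Module):
            """Minimal 2D CNN for illustration: one grayscale slice in, class logits out."""
            def __init__(self, num_classes: int = 2):
                super().__init__()
                self.features = nn.Sequential(
                    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(inplace=True),
                    nn.MaxPool2d(2),
                    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(inplace=True),
                    nn.MaxPool2d(2),
                    nn.AdaptiveAvgPool2d(1),   # global average pooling
                )
                self.classifier = nn.Linear(32, num_classes)

            def forward(self, x: torch.Tensor) -> torch.Tensor:
                return self.classifier(self.features(x).flatten(1))

        # Example: a batch of four single-channel 256x256 slices.
        logits = TinyMedNet()(torch.randn(4, 1, 256, 256))
        print(logits.shape)   # torch.Size([4, 2])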

    End-to-end Prostate Cancer Detection in bpMRI via 3D CNNs: Effects of Attention Mechanisms, Clinical Priori and Decoupled False Positive Reduction

    We present a multi-stage 3D computer-aided detection and diagnosis (CAD) model for automated localization of clinically significant prostate cancer (csPCa) in bi-parametric MR imaging (bpMRI). Deep attention mechanisms drive its detection network, targeting salient structures and highly discriminative feature dimensions across multiple resolutions. Its goal is to accurately identify csPCa lesions from indolent cancer and the wide range of benign pathology that can afflict the prostate gland. Simultaneously, a decoupled residual classifier is used to achieve consistent false positive reduction, without sacrificing high sensitivity or computational efficiency. In order to guide model generalization with domain-specific clinical knowledge, a probabilistic anatomical prior is used to encode the spatial prevalence and zonal distinction of csPCa. Using a large dataset of 1950 prostate bpMRI exams paired with radiologically-estimated annotations, we hypothesize that such CNN-based models can be trained to detect biopsy-confirmed malignancies in an independent cohort. For 486 institutional testing scans, the 3D CAD system achieves 83.69±5.22% and 93.19±2.96% detection sensitivity at 0.50 and 1.46 false positive(s) per patient, respectively, with 0.882±0.030 AUROC in patient-based diagnosis, significantly outperforming four state-of-the-art baseline architectures (U-SEResNet, UNet++, nnU-Net, Attention U-Net) from recent literature. For 296 external biopsy-confirmed testing scans, the ensembled CAD system shares moderate agreement with a consensus of expert radiologists (76.69%; κ = 0.51±0.04) and independent pathologists (81.08%; κ = 0.56±0.06), demonstrating strong generalization to histologically-confirmed csPCa diagnosis. Comment: Accepted to MedIA: Medical Image Analysis. This manuscript incorporates and expands upon our 2020 Medical Imaging Meets NeurIPS Workshop paper (arXiv:2011.00263).
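
    The abstract does not detail its attention mechanism, so the sketch below shows one common way to realize soft attention over a 3D skip connection (additive grid attention in the style of the Attention U-Net baseline listed above); the channel counts, names and shapes are illustrative assumptions, not the paper's implementation. The probabilistic anatomical prior mentioned above is likewise often supplied simply as an extra input channel, though the exact mechanism used in the paper is not reproduced here.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class AttentionGate3D(nn.Module):
            """Additive soft attention over a 3D skip connection (Attention U-Net style)."""
            def __init__(self, ch_skip: int, ch_gate: int, ch_inter: int):
                super().__init__()
                self.theta = nn.Conv3d(ch_skip, ch_inter, kernel_size=1)   # project skip features
                self.phi = nn.Conv3d(ch_gate, ch_inter, kernel_size=1)     # project gating signal
                self.psi = nn.Conv3d(ch_inter, 1, kernel_size=1)           # attention logits

            def forward(self, skip: torch.Tensor, gate: torch.Tensor) -> torch.Tensor:
                g = F.interpolate(self.phi(gate), size=skip.shape[2:],
                                  mode="trilinear", align_corners=False)
                att = torch.sigmoid(self.psi(F.relu(self.theta(skip) + g)))  # (N, 1, D, H, W)
                return skip * att   # suppress irrelevant voxels, keep salient structures

        # Example: 32-channel skip features gated by coarser 64-channel decoder features.
        skip = torch.randn(1, 32, 32, 64, 64)
        gate = torch.randn(1, 64, 16, 32, 32)
        print(AttentionGate3D(32, 64, 16)(skip, gate).shape)   # torch.Size([1, 32, 32, 64, 64])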

    Anatomical Segmentation of CT images for Radiation Therapy planning using Deep Learning

    Radiation therapy is one of the key cancer treatment options. To avoid adverse effects in the tissue surrounding the tumor, the treatment plan needs to be based on accurate anatomical models of the patient. In this thesis, an automatic segmentation solution is constructed for the female breast, the female pelvis and the male pelvis using deep learning. The deep neural networks applied performed as well as current state-of-the-art networks while improving inference speed by a factor of 15 to 45. The speed increase was gained through processing the whole 3D image at once. Segmentations done by clinicians usually take several hours, whereas the automatic segmentation can be done in less than a second. Therefore, the automatic segmentation provides options for adaptive treatment planning.
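
    A minimal sketch of the whole-volume inference described above, assuming a fully convolutional 3D segmentation network that accepts arbitrarily sized inputs (the network itself and any preprocessing are assumptions and are not shown). Running a single forward pass over the full volume avoids per-slice or per-patch looping and gives the network complete 3D context, which is how the thesis attributes its 15- to 45-fold speedup.

        import torch

        @torch.no_grad()
        def segment_volume(model: torch.nn.Module, volume: torch.Tensor) -> torch.Tensor:
            """Segment an entire CT volume in one forward pass.

            volume: (D, H, W) float tensor; returns a (D, H, W) label map.
            """
            model.eval()
            x = volume[None, None]           # add batch and channel dims -> (1, 1, D, H, W)
            logits = model(x)                # (1, C, D, H, W) class scores
            return logits.argmax(dim=1)[0]   # per-voxel class labels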

    Image Processing and Analysis for Preclinical and Clinical Applications

    Radiomics is one of the most successful branches of research in the field of image processing and analysis, as it provides valuable quantitative information for personalized medicine. It has the potential to discover features of the disease that cannot be appreciated with the naked eye in both preclinical and clinical studies. In general, all quantitative approaches based on biomedical images, such as positron emission tomography (PET), computed tomography (CT) and magnetic resonance imaging (MRI), have a positive clinical impact in the detection of biological processes and diseases as well as in predicting response to treatment. This Special Issue, “Image Processing and Analysis for Preclinical and Clinical Applications”, addresses some gaps in this field to improve the quality of research in the clinical and preclinical environment. It consists of fourteen peer-reviewed papers covering a range of topics and applications related to biomedical image processing and analysis.
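
    To make the quantitative side of radiomics concrete, here is a small, illustrative sketch that computes a few first-order features over a region of interest; the definitions follow common first-order conventions and are not the feature set used by any particular paper in this Special Issue.

        import numpy as np

        def first_order_features(image: np.ndarray, mask: np.ndarray) -> dict:
            """A few first-order radiomic features from the voxels inside a binary ROI mask."""
            roi = image[mask > 0].astype(np.float64)
            counts, _ = np.histogram(roi, bins=64)
            p = counts / counts.sum()
            p = p[p > 0]                                   # drop empty bins before the log
            return {
                "mean": float(roi.mean()),
                "std": float(roi.std()),
                "skewness": float(((roi - roi.mean()) ** 3).mean() / (roi.std() ** 3 + 1e-12)),
                "energy": float((roi ** 2).sum()),
                "entropy": float(-(p * np.log2(p)).sum()),
            }

        # Example on a synthetic volume with a cubic region of interest.
        img = np.random.normal(size=(64, 64, 64))
        msk = np.zeros(img.shape, dtype=bool)
        msk[24:40, 24:40, 24:40] = True
        print(first_order_features(img, msk))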

    PWD-3DNet: A Deep Learning-Based Fully-Automated Segmentation of Multiple Structures on Temporal Bone CT Scans

    The temporal bone is a part of the lateral skull surface that contains organs responsible for hearing and balance. Mastering surgery of the temporal bone is challenging because of this complex and microscopic three-dimensional anatomy. Segmentation of intra-temporal anatomy based on computed tomography (CT) images is necessary for applications such as surgical training and rehearsal, amongst others. However, temporal bone segmentation is challenging due to the similar intensities and complicated anatomical relationships among critical structures, undetectable small structures on standard clinical CT, and the amount of time required for manual segmentation. This paper describes a single multi-class deep learning-based pipeline as the first fully automated algorithm for segmenting multiple temporal bone structures from CT volumes, including the sigmoid sinus, facial nerve, inner ear, malleus, incus, stapes, internal carotid artery and internal auditory canal. The proposed fully convolutional network, PWD-3DNet, is a patch-wise densely connected (PWD) three-dimensional (3D) network. The accuracy and speed of the proposed algorithm were shown to surpass current manual and semi-automated segmentation techniques. The experimental results yielded high Dice similarity scores and low Hausdorff distances for all temporal bone structures, with averages of 86% and 0.755 millimeters (mm), respectively. We illustrated that overlapping the inference sub-volumes improves segmentation performance. Moreover, we proposed augmentation layers that use samples with various transformations and image artefacts to increase the robustness of PWD-3DNet against image acquisition protocols, such as smoothing caused by soft tissue scanner settings and larger voxel sizes used for radiation reduction. The proposed algorithm was tested on low-resolution CTs acquired by another center with different scanner parameters than the ones used to create the algorithm, and shows potential for application beyond the particular training data used in the study.
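
    The abstract reports that overlapping the inference sub-volumes improves segmentation. The sketch below shows generic overlapping sliding-window 3D inference with logit averaging; the patch size, overlap fraction, class count and uniform (non-Gaussian) weighting are placeholder assumptions, not PWD-3DNet's actual settings.

        import torch

        def _starts(size: int, patch: int, step: int):
            """Start indices along one axis, ending with a patch flush against the edge."""
            last = max(size - patch, 0)
            starts = list(range(0, last + 1, step))
            if starts[-1] != last:
                starts.append(last)
            return starts

        @torch.no_grad()
        def sliding_window_3d(model, volume, patch=(96, 96, 96), overlap=0.5, num_classes=9):
            """Patch-wise 3D inference; logits from overlapping patches are averaged,
            which smooths predictions at patch borders."""
            D, H, W = volume.shape
            steps = [max(1, int(p * (1 - overlap))) for p in patch]
            logits = torch.zeros(num_classes, D, H, W)
            counts = torch.zeros(1, D, H, W)
            for z in _starts(D, patch[0], steps[0]):
                for y in _starts(H, patch[1], steps[1]):
                    for x in _starts(W, patch[2], steps[2]):
                        sub = volume[z:z + patch[0], y:y + patch[1], x:x + patch[2]]
                        out = model(sub[None, None])[0]                       # (C, d, h, w)
                        logits[:, z:z + patch[0], y:y + patch[1], x:x + patch[2]] += out
                        counts[:, z:z + patch[0], y:y + patch[1], x:x + patch[2]] += 1
            return (logits / counts.clamp_min(1)).argmax(dim=0)              # (D, H, W) labels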

    Graph-based deformable registration: slice-to-volume mapping and contextual methods

    Image registration methods, which aim at aligning two or more images into one coordinate system, are among the oldest and most widely used algorithms in computer vision. Registration methods serve to establish correspondence relationships among images (captured at different times, from different sensors or from different viewpoints) which are not obvious for the human eye. A particular type of registration algorithm, known as graph-based deformable registration, has become popular during the last decade given its robustness, scalability, efficiency and theoretical simplicity. The range of problems to which it can be adapted is particularly broad. In this thesis, we propose several extensions to the theory of graph-based deformable registration, by exploring new application scenarios and developing novel methodological contributions. Our first contribution is an extension of the graph-based deformable registration framework that deals with the challenging slice-to-volume registration problem. Slice-to-volume registration aims at registering a 2D image within a 3D volume, i.e. we seek a mapping function which optimally maps a tomographic slice to the 3D coordinate space of a given volume. We introduce a scalable, modular and flexible formulation accommodating low-rank and high-order terms, which simultaneously selects the plane and estimates the in-plane deformation through a single-shot optimization approach. The proposed framework is instantiated into different variants based on different graph topologies, label space definitions and energy constructions. Experiments on simulated and real data in the context of ultrasound and magnetic resonance registration (where both framework instantiations as well as different optimization strategies are considered) demonstrate the potential of our method. The other two contributions included in this thesis are related to how semantic information can be encompassed within the registration process (independently of the dimensionality of the images). Currently, most methods rely on a single metric function explaining the similarity between the source and target images. We argue that incorporating semantic information to guide the registration process will further improve the accuracy of the results, particularly in the presence of semantic labels that make registration a domain-specific problem. We consider a first scenario where we are given a classifier inferring probability maps for different anatomical structures in the input images. Our method seeks to simultaneously register and segment a set of input images, incorporating this information within the energy formulation. The main idea is to use these estimated maps of semantic labels (provided by an arbitrary classifier) as a surrogate for unlabeled data, and to combine them with population deformable registration to improve both alignment and segmentation. Our last contribution also aims at incorporating semantic information into the registration process, but in a different scenario. In this case, instead of supposing that we have pre-trained arbitrary classifiers at our disposal, we are given a set of accurate ground-truth annotations for a variety of anatomical structures. We present a methodological contribution that aims at learning context-specific matching criteria as an aggregation of standard similarity measures from the aforementioned annotated data, using an adapted version of the latent structured support vector machine (LSSVM) framework.
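
    The graph-based formulation referred to throughout this abstract is conventionally written as a discrete Markov random field over a grid of control points. The LaTeX sketch below gives that generic form; the notation is illustrative rather than copied from the thesis. In the slice-to-volume setting the label additionally encodes the plane-selection parameters, and in the learned-metric contribution the similarity term is replaced by a weighted aggregation of standard measures whose weights are estimated with the LSSVM.

        % Generic discrete (MRF) energy for graph-based deformable registration:
        % control points p lie on a grid graph G = (V, E); each label l_p indexes a
        % candidate displacement d^{l_p} drawn from a quantized search space.
        \begin{align}
          E(\mathbf{l}) &= \sum_{p \in V} g_p(l_p) + \lambda \sum_{(p,q) \in E} f_{pq}(l_p, l_q), \\
          g_p(l_p) &= \int_{\Omega_p} \delta\big(I(x),\, J(x + d^{\,l_p})\big)\, \mathrm{d}x,
          \qquad
          f_{pq}(l_p, l_q) = \big\lVert d^{\,l_p} - d^{\,l_q} \big\rVert,
        \end{align}
        % where I and J are the source and target images, \Omega_p is the image region
        % influenced by control point p, \delta is a (dis)similarity measure, and
        % \lambda balances data fidelity against smoothness of the deformation field.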