48 research outputs found

    Landmark Tracking in Liver US images Using Cascade Convolutional Neural Networks with Long Short-Term Memory

    This study proposed a deep learning-based tracking method for ultrasound (US) image-guided radiation therapy. The proposed cascade deep learning model is composed of an attention network, a mask region-based convolutional neural network (mask R-CNN), and a long short-term memory (LSTM) network. The attention network learns a mapping from a US image to a suspected area of landmark motion in order to reduce the search region. The mask R-CNN then produces multiple region-of-interest (ROI) proposals in the reduced region and identifies the proposed landmark via three network heads: bounding box regression, proposal classification, and landmark segmentation. The LSTM network models the temporal relationship among the successive image frames for bounding box regression and proposal classification. To consolidate the final proposal, a selection method is designed according to the similarities between sequential frames. The proposed method was tested on the liver US tracking datasets used in the Medical Image Computing and Computer Assisted Interventions (MICCAI) 2015 challenges, where the landmarks were annotated by three experienced observers to obtain their mean positions. Five-fold cross-validation on the 24 given US sequences with ground truths shows that the mean tracking error for all landmarks is 0.65 ± 0.56 mm, and the errors of all landmarks are within 2 mm. We further tested the proposed model on 69 landmarks from the testing dataset that has a similar image pattern to the training pattern, resulting in a mean tracking error of 0.94 ± 0.83 mm. Our experimental results have demonstrated the feasibility and accuracy of our proposed method in tracking liver anatomic landmarks using US images, providing a potential solution for real-time liver tracking for active motion management during radiation therapy.

    Transformer Lesion Tracker

    Evaluating lesion progression and treatment response via longitudinal lesion tracking plays a critical role in clinical practice. Automated approaches for this task are motivated by prohibitive labor costs and time consumption when lesion matching is done manually. Previous methods typically lack the integration of local and global information. In this work, we propose a transformer-based approach, termed Transformer Lesion Tracker (TLT). Specifically, we design a Cross Attention-based Transformer (CAT) to capture and combine both global and local information to enhance feature extraction. We also develop a Registration-based Anatomical Attention Module (RAAM) to introduce anatomical information to CAT so that it can focus on useful feature knowledge. A Sparse Selection Strategy (SSS) is presented for selecting features and reducing memory footprint in Transformer training. In addition, we use a global regression to further improve model performance. We conduct experiments on a public dataset to show the superiority of our method and find that our model performance has improved the average Euclidean center error by at least 14.3% (6 mm vs. 7 mm) compared with the state-of-the-art (SOTA). Code is available at https://github.com/TangWen920812/TLT. Comment: Accepted MICCAI 202

    A Survey on Deep Learning in Medical Image Registration: New Technologies, Uncertainty, Evaluation Metrics, and Beyond

    Over the past decade, deep learning technologies have greatly advanced the field of medical image registration. The initial developments, such as ResNet-based and U-Net-based networks, laid the groundwork for deep learning-driven image registration. Subsequent progress has been made in various aspects of deep learning-based registration, including similarity measures, deformation regularizations, and uncertainty estimation. These advancements have not only enriched the field of deformable image registration but have also facilitated its application in a wide range of tasks, including atlas construction, multi-atlas segmentation, motion estimation, and 2D-3D registration. In this paper, we present a comprehensive overview of the most recent advancements in deep learning-based image registration. We begin with a concise introduction to the core concepts of deep learning-based image registration. Then, we delve into innovative network architectures, loss functions specific to registration, and methods for estimating registration uncertainty. Additionally, this paper explores appropriate evaluation metrics for assessing the performance of deep learning models in registration tasks. Finally, we highlight the practical applications of these novel techniques in medical imaging and discuss the future prospects of deep learning-based image registration.

    Label Efficient Deep Learning in Medical Imaging

    Recent state-of-the-art deep learning frameworks require large, fully annotated training datasets that are, depending on the objective, time-consuming to generate. While in most fields, these labelling tasks can be parallelized massively or even outsourced, this is not the case for medical images. Usually, only a highly trained expert is able to generate these datasets. However, since additional manual annotation, especially for the purpose of segmentation or tracking, is typically not part of a radiologist's workflow, large and fully annotated datasets are a rare and scarce good. In this context, a variety of frameworks are proposed in this work to solve the problems that arise due to the lack of annotated training data across different medical imaging tasks and modalities. The first contribution as part of this thesis was to investigate weakly supervised learning on PET/CT data for the task of lesion segmentation. Using only class labels (tumor vs. no tumor), a classifier was first trained and subsequently used to generate Class Activation Maps highlighting regions with lesions. Based on these region proposals, final tumor segmentation could be performed with high accuracy in clinically relevant metrics. This drastically simplifies the process of training data generation, as only class labels have to be assigned to each slice of a scan instead of a full pixel-wise segmentation. To further reduce the time required to prepare training data, two self-supervised methods were investigated for the task of anatomical tissue segmentation and landmark detection. To this end, as a second contribution, a state-of-the-art tracking framework based on contrastive random walks was transferred, adapted and extended to the medical imaging domain. As contrastive learning often lacks real-time capability, a self-supervised template matching network was developed to address the task of real-time anatomical tissue tracking, yielding the third contribution of this work. 
Both of these methods have in common that the object or region of interest is defined only during inference, reducing the number of required labels to as few as one and allowing adaptation to different tasks without having to re-train or access the original training data. Despite the limited amount of labelled data, good results could be achieved both for tracking of organs across subjects and for tissue tracking within time-series. State-of-the-art self-supervised learning in medical imaging is usually performed on 2D slices due to the lack of training data and limited computational resources. To exploit the three-dimensional structure of this type of data, self-supervised contrastive learning was performed on entire volumes using over 40,000 whole-body MRI scans, forming the fourth contribution. Due to this pre-training, a large number of downstream tasks could be successfully addressed using only limited labelled data. Furthermore, the learned representations allow the entire dataset to be visualized in a two-dimensional view. To encourage research in the field of automated lesion segmentation in PET/CT image data, the autoPET challenge was organized, which represents the fifth contribution.

    Probabilistic spatial analysis in quantitative microscopy with uncertainty-aware cell detection using deep Bayesian regression

    The investigation of biological systems with three-dimensional microscopy demands automatic cell identification methods that not only are accurate but also can imply the uncertainty in their predictions. The use of deep learning to regress density maps is a popular successful approach for extracting cell coordinates from local peaks in a postprocessing step, which then, however, hinders any meaningful probabilistic output. We propose a framework that can operate on large microscopy images and output probabilistic predictions (i) by integrating deep Bayesian learning for the regression of uncertainty-aware density maps, where peak detection algorithms generate cell proposals, and (ii) by learning a mapping from prediction proposals to a probabilistic space that accurately represents the chances of a successful prediction. Using these calibrated predictions, we propose a probabilistic spatial analysis with Monte Carlo sampling. We demonstrate this in a bone marrow dataset, where our proposed methods reveal spatial patterns that are otherwise undetectable.

    Towards autonomous diagnostic systems with medical imaging

    Democratizing access to high quality healthcare has highlighted the need for autonomous diagnostic systems that a non-expert can use. Remote communities, first responders and even deep space explorers will come to rely on medical imaging systems that will provide them with Point of Care diagnostic capabilities. This thesis introduces the building blocks that would enable the creation of such a system. Firstly, we present a case study in order to further motivate the need and requirements of autonomous diagnostic systems. This case study primarily concerns deep space exploration where astronauts cannot rely on communication with earth-bound doctors to help them through diagnosis, nor can they make the trip back to earth for treatment. Requirements and possible solutions about the major challenges faced with such an application are discussed. Moreover, this work describes how a system can explore its perceived environment by developing a Multi Agent Reinforcement Learning method that allows for implicit communication between the agents. Under this regime agents can share the knowledge that benefits them all in achieving their individual tasks. Furthermore, we explore how systems can understand the 3D properties of 2D depicted objects in a probabilistic way. In Part II, this work explores how to reason about the extracted information in a causally enabled manner. A critical view on the applications of causality in medical imaging, and its potential uses is provided. It is then narrowed down to estimating possible future outcomes and reasoning about counterfactual outcomes by embedding data on a pseudo-Riemannian manifold and constraining the latent space by using the relativistic concept of light cones. By formalizing an approach to estimating counterfactuals, a computationally lighter alternative to the abduction-action-prediction paradigm is presented through the introduction of Deep Twin Networks. 
Appropriate partial identifiability constraints for categorical variables are derived and the method is applied in a series of medical tasks involving structured data, images and videos. All methods are evaluated in a wide array of synthetic and real life tasks that showcase their abilities, often achieving state-of-the-art performance or matching the existing best performance while requiring a fraction of the computational cost. Open Access

    On-the-fly dense 3D surface reconstruction for geometry-aware augmented reality.

    Augmented Reality (AR) is an emerging technology that makes seamless connections between virtual space and the real world by superimposing computer-generated information onto the real-world environment. AR can provide additional information in a more intuitive and natural way than any other information-delivery method that a human has ever invented. Camera tracking is the enabling technology for AR and has been well studied for the last few decades. Apart from the tracking problems, sensing and perception of the surrounding environment are also very important and challenging problems. Although there are existing hardware solutions such as Microsoft Kinect and HoloLens that can sense and build the environmental structure, they are either too bulky or too expensive for AR. In this thesis, the challenging real-time dense 3D surface reconstruction technologies are studied and reformulated for the reinvention of basic position-aware AR towards geometry-aware and the outlook of context-aware AR. We initially propose to reconstruct the dense environmental surface using the sparse points from Simultaneous Localisation and Mapping (SLAM), but this approach is prone to fail in challenging Minimally Invasive Surgery (MIS) scenes such as the presence of deformation and surgical smoke. We subsequently adopt stereo vision with SLAM for more accurate and robust results. With the success of deep learning technology in recent years, we present learning-based single image reconstruction and achieve state-of-the-art results. Moreover, we propose context-aware AR, one step further from purely geometry-aware AR towards high-level conceptual interaction modelling in complex AR environments for enhanced user experience. Finally, a learning-based smoke removal method is proposed to ensure an accurate and robust reconstruction under extreme conditions such as the presence of surgical smoke.