40 research outputs found

    Landmark Tracking in Liver US Images Using Cascade Convolutional Neural Networks with Long Short-Term Memory

    This study proposed a deep learning-based tracking method for ultrasound (US) image-guided radiation therapy. The proposed cascade deep learning model is composed of an attention network, a mask region-based convolutional neural network (mask R-CNN), and a long short-term memory (LSTM) network. The attention network learns a mapping from a US image to a suspected area of landmark motion to reduce the search region. The mask R-CNN then produces multiple region-of-interest (ROI) proposals in the reduced region and identifies the proposed landmark via three network heads: bounding box regression, proposal classification, and landmark segmentation. The LSTM network models the temporal relationship among successive image frames for bounding box regression and proposal classification. To consolidate the final proposal, a selection method is designed according to the similarities between sequential frames. The proposed method was tested on the liver US tracking datasets used in the Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2015 challenges, where the landmarks were annotated by three experienced observers to obtain their mean positions. Five-fold cross-validation on the 24 given US sequences with ground truths shows that the mean tracking error over all landmarks is 0.65 ± 0.56 mm, and the errors of all landmarks are within 2 mm. We further tested the proposed model on 69 landmarks from the testing dataset, whose image patterns are similar to those of the training data, resulting in a mean tracking error of 0.94 ± 0.83 mm. Our experimental results demonstrate the feasibility and accuracy of the proposed method in tracking liver anatomic landmarks in US images, providing a potential solution for real-time liver tracking for active motion management during radiation therapy.
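
    The temporal modeling step is the most code-like part of the pipeline. Below is a minimal PyTorch sketch of that idea: per-frame ROI features feed an LSTM whose hidden states drive the bounding-box regression and classification heads. All layer sizes and the feature dimension are illustrative assumptions, not the authors' configuration.

        # Hypothetical temporal head: an LSTM refines per-frame ROI features
        # before bounding-box regression and proposal classification.
        import torch
        import torch.nn as nn

        class TemporalROIHead(nn.Module):
            def __init__(self, feat_dim=256, hidden_dim=128, num_classes=2):
                super().__init__()
                self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
                self.bbox_head = nn.Linear(hidden_dim, 4)  # (x, y, w, h) offsets
                self.cls_head = nn.Linear(hidden_dim, num_classes)

            def forward(self, roi_feats):
                # roi_feats: (batch, time, feat_dim), one ROI feature per frame
                h, _ = self.lstm(roi_feats)
                return self.bbox_head(h), self.cls_head(h)

        # Toy usage: 8 consecutive US frames, one 256-d ROI feature per frame.
        model = TemporalROIHead()
        boxes, scores = model(torch.randn(1, 8, 256))
        print(boxes.shape, scores.shape)  # (1, 8, 4) and (1, 8, 2)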

    Learning Tissue Geometries for Photoacoustic Image Analysis

    Photoacoustic imaging (PAI) holds great promise as a novel, non-ionizing imaging modality, allowing insight into both morphological and physiological tissue properties, which are of particular importance in the diagnostics and therapy of various diseases, such as cancer and cardiovascular diseases. However, the estimation of physiological tissue properties with PAI requires the solution of two inverse problems, one of which, in particular, presents challenges in the form of inherent high dimensionality, potential ill-posedness, and non-linearity. Deep learning (DL) approaches show great potential to address these challenges but typically rely on simulated training data providing ground truth labels, as there are no gold standard methods to infer physiological properties in vivo. The current domain gap between simulated and real photoacoustic (PA) images results in poor in vivo performance and a lack of reliability of models trained with simulated data. Consequently, the estimates of these models occasionally fail to match clinical expectations.

    The work conducted within the scope of this thesis aimed to improve the applicability of DL approaches to PAI-based tissue parameter estimation by systematically exploring novel data-driven methods to enhance the realism of PA simulations (learning-to-simulate). This thesis is part of a larger research effort in which different factors contributing to PA image formation are disentangled and individually approached with data-driven methods. The specific research focus was placed on generating tissue geometries covering a variety of tissue types and morphologies, which represent a key component of most PA simulation approaches. Based on in vivo PA measurements (N = 288) obtained in a healthy volunteer study, three data-driven methods were investigated, leveraging (1) semantic segmentation, (2) Generative Adversarial Networks (GANs), and (3) scene graphs that encode prior knowledge about the general tissue composition of an image. The feasibility of all three approaches was successfully demonstrated.

    First, as a basis for the more advanced approaches, it was shown that tissue geometries can be automatically extracted from PA images through semantic segmentation with two types of discriminative networks and supervised training with manual reference annotations. While this method may replace manual annotation in the future, it cannot generate arbitrarily many new tissue geometries. In contrast, the GAN-based approach constitutes a generative model that can produce new tissue geometries closely following the training data distribution. The plausibility of the generated geometries was successfully demonstrated in a comparative assessment of the performance of a downstream quantification task. Finally, a generative model based on scene graphs was developed to gain a deeper understanding of important underlying geometric quantities. Unlike the GAN-based approach, it incorporates prior knowledge about the hierarchical composition of the modeled scene. Nevertheless, it allowed the generation of plausible tissue geometries and, in parallel, the explicit matching of the distributions of the generated and the target geometric quantities. The training was performed either in analogy to the GAN approach, with target reference annotations, or directly with target PA images, circumventing the need for annotations. While this approach has so far been conducted exclusively in silico, its inherent versatility presents a compelling prospect for the generation of tissue geometries from in vivo reference PA images.

    In summary, each of the three approaches for generating tissue geometries exhibits distinct strengths and limitations, making their suitability contingent upon the specific application at hand. By opening a new research direction in the form of learning-to-simulate approaches and significantly improving the realistic modeling of tissue geometries and, thus, ultimately, PA simulations, this work lays a crucial foundation for the future use of DL-based quantitative PAI in the clinical setting.
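
    Of the three approaches, the GAN-based one maps most directly onto code. Below is a minimal PyTorch sketch of the idea: a toy generator that turns noise vectors into multi-class tissue label maps. The layer sizes, the 64x64 output, and the three tissue classes are all illustrative assumptions, not the thesis configuration.

        # Hypothetical GAN generator producing per-pixel tissue labels
        # (e.g. background / skin / vessel); the discriminator and the
        # adversarial training loop are omitted for brevity.
        import torch
        import torch.nn as nn

        class GeometryGenerator(nn.Module):
            def __init__(self, z_dim=64, n_classes=3):
                super().__init__()
                self.net = nn.Sequential(
                    nn.ConvTranspose2d(z_dim, 128, 4, 1, 0), nn.ReLU(),  # 4x4
                    nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(),     # 8x8
                    nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU(),      # 16x16
                    nn.ConvTranspose2d(32, 16, 4, 2, 1), nn.ReLU(),      # 32x32
                    nn.ConvTranspose2d(16, n_classes, 4, 2, 1),          # 64x64 logits
                )

            def forward(self, z):
                logits = self.net(z.view(z.size(0), -1, 1, 1))
                return logits.argmax(dim=1)  # per-pixel tissue class map

        g = GeometryGenerator()
        geometry = g(torch.randn(2, 64))  # two sampled 64x64 tissue geometries
        print(geometry.shape)             # torch.Size([2, 64, 64])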

    Interpretable Deep Learning for Discriminating Pneumonia from Lung Ultrasounds

    Lung ultrasound has shown great promise as a point-of-care test for the diagnosis of COVID-19 because the procedure is easy to perform, requires minimal personal protective equipment, and places modest demands on disinfection. Deep learning (DL) is a robust tool for modeling infection patterns from medical images; however, existing COVID-19 detection models are complex and therefore hard to deploy on the mobile platforms frequently used in point-of-care testing. Moreover, most COVID-19 detection models in the existing DL literature are implemented as black boxes and are hence hard for the healthcare community to interpret or trust. This paper presents a novel interpretable DL framework that discriminates COVID-19 infection from other pneumonia and normal cases using patient ultrasound data. In the proposed framework, novel transformer modules are introduced to model the pathological information in ultrasound frames using an improved window-based multi-head self-attention layer. A convolutional patching module transforms input frames into a latent space rather than partitioning the input into patches. A weighted pooling module scores the embeddings of the disease representations obtained from the transformer modules so that the model attends to the information most valuable for the screening decision. Experimental analysis on the public three-class lung ultrasound dataset (PCUS dataset) demonstrates the discriminative power of the proposed solution (Accuracy: 93.4%, F1-score: 93.1%, AUC: 97.5%), which outperforms competing approaches while maintaining low complexity. More importantly, the model produces explainable outputs and can therefore serve as a candidate tool for empowering the sustainable diagnosis of COVID-19-like diseases in smart healthcare.
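
    Below is a minimal PyTorch sketch of the three described modules, with standard multi-head self-attention standing in for the paper's improved window-based variant. All sizes, the single-channel input, and the 8x8 patch stride are illustrative assumptions.

        # Hypothetical lung-ultrasound classifier: convolutional patching,
        # self-attention over patch tokens, and score-based weighted pooling.
        import torch
        import torch.nn as nn

        class LUSClassifier(nn.Module):
            def __init__(self, dim=64, heads=4, n_classes=3):
                super().__init__()
                self.patch = nn.Conv2d(1, dim, kernel_size=8, stride=8)  # conv patching
                self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
                self.score = nn.Linear(dim, 1)                           # weighted pooling
                self.head = nn.Linear(dim, n_classes)

            def forward(self, x):                              # x: (B, 1, H, W) frame
                t = self.patch(x).flatten(2).transpose(1, 2)   # (B, tokens, dim)
                t, _ = self.attn(t, t, t)
                w = torch.softmax(self.score(t), dim=1)        # token importance
                pooled = (w * t).sum(dim=1)                    # attend to useful tokens
                return self.head(pooled)

        model = LUSClassifier()
        logits = model(torch.randn(2, 1, 64, 64))  # two frames -> 3-class logits
        print(logits.shape)                        # torch.Size([2, 3])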

    Data-driven reconstruction methods for photoacoustic tomography: Learning structures by structured learning

    Photoacoustic tomography (PAT) is an imaging technique with potential applications in various fields of biomedicine. By visualising vascular structures, PAT could help in the detection and diagnosis of diseases related to their dysregulation. In PAT, tissue is illuminated by light. After entering the tissue, the light undergoes scattering and absorption. The absorbed energy is transformed into an initial pressure by the photoacoustic effect, which travels to ultrasound detectors outside the tissue.

    This thesis is concerned with the inverse problem of the described physical process: what was the initial pressure in the tissue that gave rise to the detected pressure outside? The answer to this question is difficult to obtain when light penetration in tissue is insufficient, the measurements are corrupted, or only a small number of detectors can be used in a limited geometry. For decades, the field of variational methods has come up with new approaches to solve these kinds of problems: the combination of new theory and clever algorithms has led to improved numerical results in many image reconstruction problems. In the past five years, previously state-of-the-art results were greatly surpassed by combining variational methods with artificial neural networks, a form of artificial intelligence. In this thesis we investigate several ways of combining data-driven artificial neural networks with model-driven variational methods, joining the topics of photoacoustic tomography, inverse problems and artificial neural networks.

    Chapter 3 treats the variational problem in PAT and provides a framework in which hand-crafted regularisers can easily be compared. Both directional and higher-order total variation methods show improved results over direct methods for PAT with structures resembling vasculature.

    Chapter 4 provides a method to jointly solve the PAT reconstruction and segmentation problem for absorbing structures resembling vasculature. Artificial neural networks are embedded in the algorithmic structure of primal-dual methods, which are a popular way to solve variational problems. It is shown that a diverse training set is of utmost importance for solving multiple problems with one learned algorithm.

    Chapter 5 provides a convergence analysis for data-consistent networks, which combine classical regularisation methods with artificial neural networks. Numerical results are shown for an inverse problem that couples the Radon transform with a saturation problem for biomedical images.

    Chapter 6 explores the idea of fully-learned reconstruction by connecting two nonlinear autoencoders. By enforcing a dimensionality reduction in the artificial neural network, a joint manifold for measurements and images is learned. The method, coined learned SVD, provides advantages over other fully-learned methods in terms of interpretability and generalisation. Numerical results show high-quality reconstructions, even when no information on the forward process is used.

    In this thesis, several ways of combining model-based methods with data-driven artificial neural networks were investigated. The resulting hybrid methods showed improved tomographic reconstructions. By allowing data to improve a structured method, deeper vascular structures could be imaged with photoacoustic tomography.
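
    The embedding of networks in a primal-dual scheme (Chapter 4) is the most algorithmic of these contributions. Below is a minimal PyTorch sketch of a learned primal-dual iteration, with a toy dense matrix standing in for the real PAT forward operator; all sizes and the number of unrolled iterations are illustrative assumptions.

        # Hypothetical learned primal-dual reconstruction: small networks
        # replace the proximal steps of an unrolled primal-dual scheme.
        import torch
        import torch.nn as nn

        class LearnedPrimalDual(nn.Module):
            def __init__(self, A, n_iter=5, hidden=32):
                super().__init__()
                self.A = A                       # forward operator (toy stand-in)
                m, n = A.shape
                self.dual_nets = nn.ModuleList(
                    nn.Sequential(nn.Linear(2 * m, hidden), nn.ReLU(), nn.Linear(hidden, m))
                    for _ in range(n_iter))
                self.primal_nets = nn.ModuleList(
                    nn.Sequential(nn.Linear(2 * n, hidden), nn.ReLU(), nn.Linear(hidden, n))
                    for _ in range(n_iter))

            def forward(self, y):
                x = torch.zeros(y.size(0), self.A.shape[1])  # primal: image estimate
                p = torch.zeros_like(y)                      # dual: measurement space
                for dual, primal in zip(self.dual_nets, self.primal_nets):
                    # learned dual update: compare forward projection with data
                    p = p + dual(torch.cat([x @ self.A.T - y, p], dim=1))
                    # learned primal update: back-project the dual variable
                    x = x + primal(torch.cat([p @ self.A, x], dim=1))
                return x

        A = torch.randn(20, 50)              # 20 measurements of a 50-pixel image
        net = LearnedPrimalDual(A)
        x_hat = net(torch.randn(4, 20))
        print(x_hat.shape)                   # torch.Size([4, 50])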

    Deep learning in food category recognition

    Integrating artificial intelligence with food category recognition has been a field of research interest for the past few decades. It is potentially one of the next steps in revolutionizing human interaction with food. The modern advent of big data and the development of data-oriented fields like deep learning have provided advancements in food category recognition, yet with increasing computational power and ever-larger food datasets, the approach's full potential has yet to be realized. This survey provides an overview of methods that can be applied to various food category recognition tasks, including detecting type, ingredients, quality, and quantity. We survey the core components for constructing a machine learning system for food category recognition, including datasets, data augmentation, hand-crafted feature extraction, and machine learning algorithms. We place a particular focus on the field of deep learning, including the utilization of convolutional neural networks, transfer learning, and semi-supervised learning. We provide an overview of relevant studies to promote further developments in food category recognition for research and industrial applications.
    Funding: MRC (MC_PC_17171), Royal Society (RP202G0230), BHF (AA/18/3/34220), Hope Foundation for Cancer Research (RM60G0680), GCRF (P202PF11), Sino-UK Industrial Fund (RP202G0289), LIAS (P202ED10), Data Science Enhancement Fund (P202RE237), Fight for Sight (24NN201), Sino-UK Education Fund (OP202006), BBSRC (RM32G0178B8).
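
    As a concrete illustration of the transfer-learning component the survey covers, here is a minimal PyTorch/torchvision sketch of the common recipe: freeze an ImageNet-pretrained backbone and retrain only the classification head for food categories. The 101-class head is an assumption (a Food-101-style task), not a dataset taken from the survey; the weights API requires torchvision >= 0.13.

        # Hypothetical transfer-learning setup for food classification.
        import torch
        import torch.nn as nn
        from torchvision import models

        model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        for p in model.parameters():
            p.requires_grad = False                      # freeze pretrained features
        model.fc = nn.Linear(model.fc.in_features, 101)  # new trainable food head

        optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
        logits = model(torch.randn(2, 3, 224, 224))      # two RGB food images
        print(logits.shape)                              # torch.Size([2, 101])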

    Complexity Reduction in Image-Based Breast Cancer Care

    The diversity of malignancies of the breast requires personalized diagnostic and therapeutic decision making in a complex situation. This thesis contributes in three clinical areas: (1) For clinical diagnostic image evaluation, computer-aided detection and diagnosis of mass and non-mass lesions in breast MRI is developed; 4D texture features characterize mass lesions, while for non-mass lesions a combined detection/characterisation method utilizes the bilateral symmetry of the breast's contrast agent uptake. (2) To improve clinical workflows, a breast MRI reading paradigm is proposed, exemplified by a breast MRI reading workstation prototype that is operated with multi-touch gestures instead of mouse and keyboard; the concept is extended to mammography screening, introducing efficient navigation aids. (3) Contributions to finite element modeling of breast tissue deformations tackle two clinical problems: surgery planning and the prediction of breast deformation in an MRI biopsy device.
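
    The bilateral-symmetry idea in (1) can be illustrated in a few lines. Below is a minimal NumPy sketch, assuming the two sides have already been registered (which the real method must handle) and that uptake maps are simple 2D arrays; it is an illustration of the principle, not the thesis method.

        # Hypothetical asymmetry map: mirror the contralateral uptake map and
        # subtract it, so that asymmetric enhancement stands out.
        import numpy as np

        def asymmetry_map(left_uptake: np.ndarray, right_uptake: np.ndarray) -> np.ndarray:
            """Highlight uptake present on one side but not its mirrored counterpart."""
            mirrored_right = np.flip(right_uptake, axis=-1)  # mirror across the midline
            return np.abs(left_uptake - mirrored_right)

        left = np.random.rand(32, 32)   # toy contrast-uptake maps
        right = np.random.rand(32, 32)
        print(asymmetry_map(left, right).max())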

    New approaches for unsupervised transcriptomic data analysis based on Dictionary learning

    The era of high-throughput data generation enables new access to biomolecular profiles and their exploitation. However, the analysis of such biomolecular data, for example transcriptomic data, suffers from the so-called "curse of dimensionality", which arises when a dataset has significantly more variables than data points. As a consequence, overfitting and the unintentional learning of process-independent patterns can occur, leading to results of little practical significance. A common way of counteracting this problem is to apply dimension reduction methods and subsequently analyse the resulting low-dimensional representation, which has a smaller number of variables. In this thesis, two new methods for the analysis of transcriptomic datasets are introduced and evaluated. Our methods are based on the concept of Dictionary learning, an unsupervised dimension reduction approach. Unlike many dimension reduction approaches widely applied in transcriptomic data analysis, Dictionary learning does not impose constraints on the components to be derived, which allows great flexibility when adjusting the representation to the data. Further, Dictionary learning belongs to the class of sparse methods: it yields models with few non-zero coefficients, which are often preferred for their simplicity and ease of interpretation, and it exploits the fact that the analysed datasets are highly structured. Transcriptomic data are particularly structured, owing, for example, to the connections between genes and pathways. Nonetheless, the application of Dictionary learning in medical data analysis has so far been mainly restricted to image analysis. Another advantage of Dictionary learning is its interpretability, which is a necessity in biomolecular data analysis for gaining a holistic understanding of the investigated processes. Our two new transcriptomic data analysis methods are each designed for one main task: (1) identification of subgroups for samples from mixed populations, and (2) temporal ordering of samples from dynamic datasets, also referred to as "pseudotime estimation". Both methods are evaluated on simulated and real-world data and compared to other methods widely applied in transcriptomic data analysis; our methods achieve high performance and overall outperform the comparison methods.
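
    Below is a minimal scikit-learn sketch of task (1) under toy assumptions: random data and illustrative hyperparameters, using the off-the-shelf DictionaryLearning estimator rather than the thesis's own formulations. Samples are encoded as sparse codes over a learned dictionary and then clustered into subgroups.

        # Hypothetical subgroup identification via sparse dictionary codes.
        import numpy as np
        from sklearn.decomposition import DictionaryLearning
        from sklearn.cluster import KMeans

        rng = np.random.default_rng(0)
        X = rng.standard_normal((60, 500))   # 60 samples x 500 genes (toy data)

        dl = DictionaryLearning(n_components=10, alpha=1.0,
                                transform_algorithm="lasso_lars", random_state=0)
        codes = dl.fit_transform(X)          # sparse low-dimensional representation

        subgroups = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(codes)
        print(codes.shape, np.bincount(subgroups))  # (60, 10) and subgroup sizes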