
    Efficient Ultrasound Image Analysis Models with Sonographer Gaze Assisted Distillation.

    Recent automated medical image analysis methods have attained state-of-the-art performance but rely on memory- and compute-intensive deep learning models. Reducing model size without significant loss in performance metrics is crucial for time- and memory-efficient automated image-based decision-making. Traditional deep learning based image analysis uses expert knowledge only in the form of manual annotations. Recently, there has been interest in introducing other forms of expert knowledge into deep learning architecture design. This is the approach considered in this paper, where we propose to combine ultrasound video with the point-of-gaze of expert sonographers, tracked as they scan, to train memory-efficient ultrasound image analysis models. Specifically, we develop teacher-student knowledge transfer models for the exemplar task of frame classification for the fetal abdomen, head, and femur. The best performing memory-efficient models attain performance within 5% of conventional models that are 1000× larger in size.
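The teacher-student transfer described above can be illustrated with the standard knowledge-distillation objective (Hinton-style soft targets). This is a generic sketch, not the paper's implementation; the temperature and weighting values are illustrative assumptions.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax; higher T softens the distribution.
    z = np.asarray(z, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Blend of soft-target KL (teacher -> student) and hard-label cross-entropy."""
    p_t = softmax(teacher_logits, T)      # teacher soft targets
    p_s = softmax(student_logits, T)      # student soft predictions
    # KL(p_t || p_s), scaled by T^2 as is conventional for distillation.
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1)
    # Hard-label cross-entropy on unscaled student logits.
    p_hard = softmax(student_logits, 1.0)
    ce = -np.log(p_hard[np.arange(len(labels)), labels] + 1e-12)
    return float(np.mean(alpha * (T ** 2) * kl + (1 - alpha) * ce))
```

A small student trained with such a loss mimics the teacher's softened class probabilities for the abdomen/head/femur frame classes while still fitting the hard labels.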

    Medical image analysis for simplified ultrasound protocols

    Ultrasound is an imaging tool used in obstetrics to identify high-risk pregnancies. However, ultrasound (US) requires a trained operator, who guides a transducer in response to real-time interpretation of video content. In low- and middle-income countries (LMICs), there is a shortage of trained sonographers. In this thesis, we address this key challenge by combining simple US video sweeps with computational algorithms to provide clinical benefit. The sweeps can be taken by a US novice. First, we design an algorithm that automatically creates an assistive video overlay from a simple video sweep. The overlay assists interpretation of US video to assess placenta location. We describe the design and evaluation of a deep learning-based automatic segmentation model and a statistical data visualisation of 2-D placenta shapes. The data visualisation reveals the spectrum of placenta shapes in this problem space. A probabilistic graphical model is used to improve segmentations with regard to the highly variable placenta shape. From the automatic segmentations, image guidance is created, translating the clinical criteria into assistive visual information. Second, we explore analysis of multiple video sweeps using graphs. A three-node graph models three video sweeps, where the nodes encode binary sequences representing the fetal head frame-level detection across all video frames in a sweep. To better characterise the sweeps, we perform a statistical analysis of large-scale manual annotations of video sweeps in our dataset. This reveals common patterns of frame-level anatomy occurrence for different video sweep trajectories. Particular insight is gained for patterns that correspond to fetal pose. In this regard, we build a graph convolutional network to automatically classify fetal presentation, using graphs that combine complementary video sweep information relating to fetal pose.
Finally, we demonstrate the feasibility of placenta 3-D reconstruction using multiple video sweeps. We pose this challenging problem as spatio-temporal alignment of US video. We first temporally align video sweeps to represent video content at the same temporal scale. Then, we use affine transformations to spatially align images in temporally aligned video. The results in this chapter are exciting as they show the feasibility of placenta 3-D reconstruction in a simple US sweep system.
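The three-node sweep graph described above can be sketched as follows: node features are the binary frame-level detection sequences, and one normalised message-passing step (as in a graph convolutional layer) mixes information across complementary sweeps. The adjacency structure and feature sizes here are illustrative assumptions, not the thesis's actual model.

```python
import numpy as np

def sweep_graph_features(sweeps, adj):
    """One symmetric-normalised message-passing step over a sweep graph.

    sweeps : (3, F) binary matrix, one row per video sweep, where entry f is 1
             if the fetal head is detected in frame f of that sweep.
    adj    : (3, 3) adjacency matrix linking complementary sweeps.
    """
    A = adj + np.eye(adj.shape[0])           # add self-loops
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))   # D^{-1/2} normalisation
    A_hat = D_inv_sqrt @ A @ D_inv_sqrt
    return A_hat @ sweeps                    # aggregated node features
```

A learnable graph convolution would follow this aggregation with a weight matrix and nonlinearity; stacking such layers and pooling the node features yields a graph-level fetal-presentation classifier.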

    Sonography data science

    Fetal sonography remains a highly specialised skill in spite of its necessity and importance. Because of differences in fetal and maternal anatomy, and human psychomotor skills, there is intra- and inter-sonographer variability amongst expert sonographers. By understanding their similarities and differences, we want to build more interpretive models to assist a sonographer who is less experienced in scanning. This thesis’s contributions to the field of fetal sonography can be grouped into two themes. First, I have used data visualisation and machine learning methods to show that a sonographer’s search strategy is anatomical (plane) dependent. Second, I show that a sonographer’s style and human skill of scanning is not easily disentangled. We first examine task-specific spatio-temporal gaze behaviour through the use of data visualisation, where a task is defined as a specific anatomical plane the sonographer is searching for. The qualitative analysis is performed at both a population and individual level, where we show that the task being performed determines the sonographer’s gaze behaviour. In our population-level analysis, we use unsupervised methods to identify meaningful gaze patterns and visualise task-level differences. In our individual-level analysis, we use a deep learning model to provide context to the eye-tracking data with respect to the ultrasound image. We then use an event-based visualisation to understand differences between gaze patterns of sonographers performing the same task. In some instances, sonographers adopt a different search strategy, which is seen in the misclassified instances of an eye-tracking task classification model. Our task classification model supports the qualitative behaviour seen in our population-level analysis, where task-specific gaze behaviour is quantitatively distinct.
We also investigate the use of time-based skill definitions and their appropriateness in fetal ultrasound sonography; a time-based skill definition uses years of clinical experience as an indicator of skill. The developed task-agnostic skill classification model differentiates gaze behaviour between sonographers in training and fully qualified sonographers. The preliminary results also show that fetal sonography scanning remains an operator-dependent skill, where the notion of human skill and individual scanning stylistic differences cannot be easily disentangled. Our work demonstrates how and where sonographers look whilst scanning, which can be used as a stepping stone for building style-agnostic skill models.
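Gaze analyses of the kind described above usually begin by extracting fixation events from the raw point-of-gaze stream; a minimal dispersion-threshold (I-DT style) detector is sketched below. The thresholds and coordinate convention are illustrative assumptions, not values used in the thesis.

```python
import numpy as np

def detect_fixations(gaze, max_dispersion=0.05, min_samples=3):
    """Dispersion-threshold (I-DT style) fixation detection.

    gaze : (N, 2) array of (x, y) points in [0, 1] screen coordinates.
    Returns a list of (start, end) index pairs (end exclusive), one per fixation.
    """
    fixations, start, n = [], 0, len(gaze)
    while start + min_samples <= n:
        end = start + min_samples
        window = gaze[start:end]
        disp = np.ptp(window[:, 0]) + np.ptp(window[:, 1])
        if disp <= max_dispersion:
            # grow the window while dispersion stays under the threshold
            while end < n:
                w = gaze[start:end + 1]
                if np.ptp(w[:, 0]) + np.ptp(w[:, 1]) > max_dispersion:
                    break
                end += 1
            fixations.append((start, end))
            start = end
        else:
            start += 1
    return fixations
```

Fixation sequences produced this way can then feed visualisations or task/skill classifiers such as those the thesis develops.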

    FPUS23: An Ultrasound Fetus Phantom Dataset with Deep Neural Network Evaluations for Fetus Orientations, Fetal Planes, and Anatomical Features

    Ultrasound imaging is one of the most prominent technologies used to evaluate the growth, progression, and overall health of a fetus during its gestation. However, the interpretation of the data obtained from such studies is best left to expert physicians and technicians who are trained and well-versed in analyzing such images. To improve the clinical workflow and potentially develop an at-home ultrasound-based fetal monitoring platform, we present a novel fetus phantom ultrasound dataset, FPUS23, which can be used to identify (1) the correct diagnostic planes for estimating fetal biometric values, (2) fetus orientation, (3) their anatomical features, and (4) bounding boxes of the fetus phantom anatomies at 23 weeks gestation. The entire dataset is composed of 15,728 images, which are used to train four different Deep Neural Network models, built upon a ResNet34 backbone, for detecting the aforementioned fetal features and use-cases. We have also evaluated the models trained using our FPUS23 dataset, to show that the information learned by these models can be used to substantially increase the accuracy on real-world fetal ultrasound datasets. We make the FPUS23 dataset and the pre-trained models publicly accessible at https://github.com/bharathprabakaran/FPUS23, which will further facilitate future research on fetal ultrasound imaging and analysis.
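Evaluating the bounding-box outputs of a detection model like those trained on FPUS23 conventionally relies on intersection-over-union (IoU); the sketch below is a generic implementation, not code from the FPUS23 repository.

```python
def box_iou(a, b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])   # intersection corners
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    iw, ih = max(0.0, ix2 - ix1), max(0.0, iy2 - iy1)
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

A predicted anatomy box is typically counted as correct when its IoU with the annotated box exceeds a threshold such as 0.5.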

    COMFormer: classification of maternal-fetal and brain anatomy using a residual cross-covariance attention guided transformer in ultrasound

    Monitoring the healthy development of a fetus requires accurate and timely identification of different maternal-fetal structures as they grow. To facilitate this objective in an automated fashion, we propose a deep-learning-based image classification architecture called the COMFormer to classify maternal-fetal and brain anatomical structures present in two-dimensional fetal ultrasound images. The proposed architecture classifies the two subcategories separately: maternal-fetal (abdomen, brain, femur, thorax, mother's cervix, and others) and brain anatomical structures (trans-thalamic, trans-cerebellum, trans-ventricular, and non-brain). Our proposed architecture relies on a transformer-based approach that leverages spatial and global features by using a newly designed residual cross-covariance attention (R-XCA) block. This block introduces an advanced cross-covariance attention mechanism to capture a long-range representation from the input using spatial (e.g., shape, texture, intensity) and global features. To build COMFormer, we used a large publicly available dataset (BCNatal) consisting of 12,400 images from 1,792 subjects. Experimental results show that COMFormer outperforms recent CNN and transformer-based models, achieving 95.64% and 96.33% classification accuracy on maternal-fetal and brain anatomy, respectively.
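Cross-covariance attention, the mechanism underlying the R-XCA block, applies the attention map across feature channels (a d×d map) rather than across tokens (an N×N map, as in standard self-attention). The sketch below is a generic single-head version in the style of XCiT, not the COMFormer implementation.

```python
import numpy as np

def _softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_covariance_attention(Q, K, V, tau=1.0):
    """Channel-wise (cross-covariance) attention.

    Q, K, V : (N, d) token matrices.  The attention map is d x d, so the cost
    scales with the channel count rather than the token count.
    """
    Qn = Q / (np.linalg.norm(Q, axis=0, keepdims=True) + 1e-12)  # L2-normalise channels
    Kn = K / (np.linalg.norm(K, axis=0, keepdims=True) + 1e-12)
    A = _softmax((Qn.T @ Kn) / tau, axis=-1)  # (d, d) channel-attention map
    return V @ A.T                            # (N, d) re-weighted features
```

Because the map is channel-by-channel, the mechanism stays cheap for the long token sequences produced by high-resolution ultrasound images.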

    Doctor of Philosophy

    Congenital heart defects are classes of birth defects that affect the structure and function of the heart. These defects are attributed to the abnormal or incomplete development of a fetal heart during the first few weeks following conception. The overall detection rate of congenital heart defects during routine prenatal examination is low. This is attributed to the insufficient number of trained personnel in many local health centers, where many cases of congenital heart defects go undetected. This dissertation presents a system to identify congenital heart defects to improve pregnancy outcomes and increase their detection rates. The system was developed and its performance assessed in identifying the presence of ventricular defects (congenital heart defects that affect the size of the ventricles) using four-dimensional fetal echocardiographic images. The designed system consists of three components: 1) a fetal heart location estimation component, 2) a fetal heart chamber segmentation component, and 3) a detection component that detects congenital heart defects from the segmented chambers. The location estimation component is used to isolate a fetal heart in any four-dimensional fetal echocardiographic image. It uses a hybrid region of interest extraction method that is robust to speckle noise degradation inherent in all ultrasound images. The location estimation method's performance was analyzed on 130 four-dimensional fetal echocardiographic images by comparison with manually identified fetal heart regions of interest. The location estimation method showed good agreement with the manually identified standard using four quantitative indexes: the Jaccard index, Sørenson-Dice index, Sensitivity index, and Specificity index. The average values of these indexes were measured at 80.70%, 89.19%, 91.04%, and 99.17%, respectively.
The fetal heart chamber segmentation component uses velocity vector field estimates computed on frames contained in a four-dimensional image to identify the fetal heart chambers. The velocity vector fields are computed using a histogram-based optical flow technique which is formulated on local image characteristics to reduce the effect of speckle noise and nonuniform echogenicity on the velocity vector field estimates. Features based on the velocity vector field estimates, voxel brightness/intensity values, and voxel Cartesian coordinate positions were extracted and used with the kernel k-means algorithm to identify the individual chambers. The segmentation method's performance was evaluated on 130 images from 31 patients by comparing the segmentation results with manually identified fetal heart chambers. Evaluation was based on the Sørenson-Dice index, the absolute volume difference, and the Hausdorff distance, with each resulting in per patient average values of 69.92%, 22.08%, and 2.82 mm, respectively. The detection component uses the volumes of the identified fetal heart chambers to flag the possible occurrence of hypoplastic left heart syndrome, a type of congenital heart defect. An empirical volume threshold defined on the relative ratio of adjacent fetal heart chamber volumes obtained manually is used in the detection process. The performance of the detection procedure was assessed by comparison with a set of images with confirmed diagnosis of hypoplastic left heart syndrome and a control group of normal fetal hearts. Of the 130 images considered, 18 of 20 (90%) fetal hearts were correctly detected as having hypoplastic left heart syndrome and 84 of 110 (76.36%) fetal hearts were correctly detected as normal in the control group. The results show that the detection system performs better than the overall detection rate for congenital heart defects, which is reported to be between 30% and 60%.
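The Jaccard and Sørensen-Dice indices used throughout this evaluation compare a predicted binary region with a manually identified one; a minimal sketch for binary masks of any shape:

```python
import numpy as np

def overlap_indices(pred, truth):
    """Jaccard and Sørensen-Dice indices for two binary masks."""
    pred, truth = np.asarray(pred, bool), np.asarray(truth, bool)
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    total = pred.sum() + truth.sum()
    jaccard = inter / union if union else 1.0   # both masks empty -> perfect match
    dice = 2 * inter / total if total else 1.0
    return jaccard, dice
```

The two indices are monotonically related (dice = 2·jaccard / (1 + jaccard)), which is why the reported Dice averages sit above the Jaccard averages.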

    Real-time Ultrasound Signals Processing: Denoising and Super-resolution

    Ultrasound acquisition is widespread in the biomedical field, due to its properties of low cost, portability, and non-invasiveness for the patient. The processing and analysis of US signals, such as images, 2D videos, and volumetric images, allow the physician to monitor the evolution of the patient's disease and support diagnosis and treatment (e.g., surgery). US images are affected by speckle noise, generated by the overlap of US waves. Furthermore, low-resolution images are acquired when a high acquisition frequency is applied to accurately characterise the behaviour of anatomical features that quickly change over time. Denoising and super-resolution of US signals are relevant to improve the visual evaluation of the physician and the performance and accuracy of processing methods, such as segmentation and classification. The main requirements for the processing and analysis of US signals are real-time execution, preservation of anatomical features, and reduction of artefacts. In this context, we present a novel framework for the real-time denoising of US 2D images based on deep learning and high-performance computing, which reduces noise while preserving anatomical features in real-time execution. We extend our framework to the denoising of arbitrary US signals, such as 2D videos and 3D images, incorporating denoising algorithms that account for spatio-temporal signal properties into an image-to-image deep learning model. As a building block of this framework, we propose a novel denoising method belonging to the class of low-rank approximations, which learns and predicts the optimal thresholds of the Singular Value Decomposition.
While previous denoising work trades off computational cost against effectiveness, the proposed framework matches the results of the best denoising algorithms in terms of noise removal, anatomical feature preservation, and conservation of geometric and texture properties, in a real-time execution that respects industrial constraints. The framework reduces artefacts (e.g., blurring) and preserves the spatio-temporal consistency among frames/slices; it is also general with respect to the denoising algorithm, anatomical district, and noise intensity. Then, we introduce a novel framework for the real-time reconstruction of non-acquired scan lines through an interpolating method; a deep learning model improves the results of the interpolation to match the target image (i.e., the high-resolution image). We improve the accuracy of the prediction of the reconstructed lines through the design of the network architecture and the loss function. In the context of signal approximation, we introduce our kernel-based sampling method for the reconstruction of 2D and 3D signals defined on regular and irregular grids, with an application to US 2D and 3D images. Our method improves on previous work in terms of sampling quality, approximation accuracy, and geometry reconstruction, with a slightly higher computational cost. For both denoising and super-resolution, we evaluate compliance with the real-time requirement of US applications in the medical domain and provide a quantitative evaluation of denoising and super-resolution methods on US and synthetic images. Finally, we discuss the role of denoising and super-resolution as pre-processing steps for segmentation and predictive analysis of breast pathologies.
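The low-rank building block can be illustrated with plain SVD hard-thresholding; the framework above learns the thresholds, whereas this sketch fixes the retained rank k by hand.

```python
import numpy as np

def svd_denoise(image, k):
    """Rank-k approximation of a 2D image: keep the k largest singular values.

    Speckle-like noise spreads energy across the small singular values, so
    truncating the spectrum suppresses it while retaining dominant structure.
    """
    U, s, Vt = np.linalg.svd(np.asarray(image, dtype=float), full_matrices=False)
    s[k:] = 0.0                       # hard-threshold the singular-value spectrum
    return U @ np.diag(s) @ Vt
```

A learned variant would replace the fixed k with a per-image threshold predicted from the singular-value spectrum itself.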