
    Mask-R2CNN: a distance-field regression version of Mask-RCNN for fetal-head delineation in ultrasound images

    Background and objectives: Fetal head-circumference (HC) measurement from ultrasound (US) images provides useful hints for assessing fetal growth. In current clinical practice, this measurement is performed manually, raising issues of intra- and inter-clinician variability. This work presents a fully automatic, deep-learning-based approach to HC delineation, which we named Mask-R2CNN. It advances our previous work in the field and performs HC distance-field regression in an end-to-end fashion, without requiring a priori HC localization or any postprocessing for outlier removal. Methods: Mask-R2CNN follows the Mask-RCNN architecture, with a backbone inspired by feature-pyramid networks, a region-proposal network and ROI align. The Mask-RCNN segmentation head is modified here to regress the HC distance field. Results: Mask-R2CNN was tested on the HC18 Challenge dataset, which consists of 999 training and 335 testing images. Through a comprehensive ablation study, we showed that Mask-R2CNN achieved a mean absolute difference of 1.95 mm (standard deviation = ± 1.92 mm), outperforming other approaches in the literature. Conclusions: With this work, we proposed an end-to-end model for HC distance-field regression. Our experimental results showed that Mask-R2CNN may be an effective support for clinicians in assessing fetal growth.
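
    A minimal sketch of the head swap described above (PyTorch/torchvision; the layer sizes and head design are our assumptions, not the authors' released code): the standard Mask-RCNN mask predictor is replaced with a single-channel regression head for the HC distance field.

        # Hypothetical sketch: swap the Mask-RCNN mask head for a
        # distance-field regression head (single output channel).
        import torch.nn as nn
        from torchvision.models.detection import maskrcnn_resnet50_fpn

        class DistanceFieldHead(nn.Module):
            """Regresses a continuous distance field instead of a binary mask."""
            def __init__(self, in_channels, hidden=256):
                super().__init__()
                self.deconv = nn.ConvTranspose2d(in_channels, hidden, 2, stride=2)
                self.relu = nn.ReLU(inplace=True)
                self.regress = nn.Conv2d(hidden, 1, 1)  # one channel: distance value

            def forward(self, x):
                return self.regress(self.relu(self.deconv(x)))

        model = maskrcnn_resnet50_fpn(weights=None, num_classes=2)  # background + fetal head
        in_ch = model.roi_heads.mask_predictor.conv5_mask.in_channels
        model.roi_heads.mask_predictor = DistanceFieldHead(in_ch)
        # Training would also require replacing the default binary mask loss
        # with a regression loss (e.g., MSE) on ROI-aligned distance-field targets.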

    Preterm Infants' Pose Estimation with Spatio-Temporal Features

    Objective: Preterm infants' limb monitoring in neonatal intensive care units (NICUs) is of primary importance for assessing infants' health status and motor/cognitive development. Herein, we propose a new approach to preterm infants' limb-pose estimation that exploits spatio-temporal information to detect and track limb joints from depth videos with high reliability. Methods: Limb-pose estimation is performed with a deep-learning framework consisting of a detection and a regression convolutional neural network (CNN) for rough and precise joint localization, respectively. The CNNs encode connectivity in the temporal direction through 3D convolution. The proposed framework is assessed through a comprehensive study with sixteen depth videos acquired from sixteen preterm infants during routine clinical practice (the babyPose dataset). Results: When applied to pose estimation, the median root mean square distance between the estimated and ground-truth pose, computed over all limbs, was 9.06 pixels, outperforming approaches based on spatial features only (11.27 pixels). Conclusion: The results showed that spatio-temporal features had a significant influence on pose-estimation performance, especially in challenging cases (e.g., homogeneous image intensity). Significance: This article significantly enhances the state of the art in automatic assessment of preterm infants' health status by introducing spatio-temporal features for limb detection and tracking, and by being the first study to use depth videos acquired during routine clinical practice for limb-pose estimation. The babyPose dataset has been released as the first annotated dataset for infants' pose estimation.
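
    As a sketch of how temporal connectivity can be encoded with 3D convolutions (PyTorch; the layer sizes, clip length and temporal pooling are illustrative assumptions, not the authors' exact network):

        import torch
        import torch.nn as nn

        class SpatioTemporalDetector(nn.Module):
            """Toy joint detector over clips of T consecutive depth frames."""
            def __init__(self, num_joints=12):
                super().__init__()
                self.features = nn.Sequential(
                    nn.Conv3d(1, 16, kernel_size=3, padding=1),   # kernels span (T, H, W)
                    nn.ReLU(inplace=True),
                    nn.Conv3d(16, 32, kernel_size=3, padding=1),
                    nn.ReLU(inplace=True),
                )
                self.head = nn.Conv2d(32, num_joints, kernel_size=1)

            def forward(self, clip):                # clip: (B, 1, T, H, W)
                x = self.features(clip)
                x = x.mean(dim=2)                   # collapse the temporal axis
                return self.head(x)                 # (B, num_joints, H, W) heatmaps

        heatmaps = SpatioTemporalDetector()(torch.randn(2, 1, 5, 128, 96))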

    Preterm infants' limb-pose estimation from depth images using convolutional neural networks

    Preterm infants' limb-pose estimation is a crucial but challenging task, which may improve patient care and help clinicians monitor infants' movements. Work in the literature either addresses whole-body segmentation and tracking, which has poor clinical value, or retrieves limb pose a posteriori from limb segmentation, increasing computational costs and introducing sources of inaccuracy. In this paper, we address the problem of limb-pose estimation from a different point of view. We propose a 2D fully convolutional neural network for roughly detecting limb joints and joint connections, followed by a regression convolutional neural network for accurate joint and joint-connection position estimation. Joints from the same limb are then connected with a maximum bipartite matching approach. Our analysis requires neither prior modeling of the infants' body structure nor any manual intervention. For developing and testing the proposed approach, we built a dataset of four videos (video length = 90 s) recorded with a depth sensor in a neonatal intensive care unit (NICU) during routine clinical practice, achieving median root mean square distances [pixels] of 10.790 (right arm), 10.542 (left arm), 8.294 (right leg) and 11.270 (left leg) with respect to the ground-truth limb pose. The idea of estimating limb pose directly from depth images may represent a future paradigm for addressing the problem of preterm infants' movement monitoring and offer all possible support to clinicians in NICUs.
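
    The final matching step can be implemented with the Hungarian algorithm; a minimal sketch using SciPy's linear_sum_assignment (the affinity scores below are made up for illustration):

        import numpy as np
        from scipy.optimize import linear_sum_assignment

        # affinity[i, j]: how well joint candidate i fits joint-connection candidate j
        affinity = np.array([[0.9, 0.1, 0.3],
                             [0.2, 0.8, 0.4],
                             [0.1, 0.3, 0.7]])

        # Maximum-weight bipartite matching = minimize the negated affinity.
        rows, cols = linear_sum_assignment(-affinity)
        for i, j in zip(rows, cols):
            print(f"joint {i} -> connection {j} (affinity {affinity[i, j]:.1f})")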

    Deep Learning Based Robotic Tool Detection and Articulation Estimation with Spatio-Temporal Layers

    Surgical-tool joint detection from laparoscopic images is an important but challenging task in computer-assisted minimally invasive surgery. Illumination levels, background variations and the varying number of tools in the field of view all pose difficulties for algorithm and model training. Yet, such challenges could potentially be tackled by exploiting the temporal information in laparoscopic videos to avoid per-frame handling of the problem. In this letter, we propose a novel encoder-decoder architecture for surgical-instrument joint detection and localization that uses three-dimensional convolutional layers to exploit spatio-temporal features from laparoscopic videos. When tested on benchmark and custom-built datasets, a median Dice similarity coefficient of 85.1% with an interquartile range of 4.6% highlights performance better than the state of the art based on single-frame processing. Alongside the novelty of the network architecture, the inclusion of temporal information appears to be particularly useful when processing images with backgrounds unseen during the training phase, which indicates that spatio-temporal features for joint detection help to generalize the solution.
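
    For reference, the Dice similarity coefficient used as the evaluation metric above is the standard overlap measure between binary masks; a small NumPy sketch:

        import numpy as np

        def dice(pred, gt, eps=1e-7):
            """Dice = 2|A n B| / (|A| + |B|) for binary masks."""
            pred, gt = pred.astype(bool), gt.astype(bool)
            inter = np.logical_and(pred, gt).sum()
            return (2.0 * inter + eps) / (pred.sum() + gt.sum() + eps)

        print(dice(np.ones((4, 4)), np.eye(4)))  # 0.4 for this toy pair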

    Supervised CNN strategies for optical image segmentation and classification in interventional medicine

    The analysis of interventional images is a topic of high interest for the medical-image analysis community. Such analysis may provide interventional-medicine professionals with both decision support and context awareness, with the final goal of improving patient safety. The aim of this chapter is to give an overview of some of the most recent approaches (up to 2018) in the field, with a focus on Convolutional Neural Networks (CNNs) for both segmentation and classification tasks. For each approach, summary tables report the dataset used, the anatomical region involved and the performance achieved. Benefits and disadvantages of each approach are highlighted and discussed. Available datasets for algorithm training and testing and commonly used performance metrics are summarized to offer a source of information for researchers approaching the field of interventional-image analysis. The advancements in deep learning for medical-image analysis increasingly involve the interventional-medicine field. However, these advancements are undeniably slower than in other fields (e.g., preoperative-image analysis), and considerable work still needs to be done to provide clinicians with all possible support during interventional-medicine procedures.

    Learning-based screening of endothelial dysfunction from photoplethysmographic signals

    Endothelial-dysfunction (ED) screening is of primary importance for the early diagnosis of cardiovascular diseases. Recently, approaches to ED screening have increasingly focused on photoplethysmography (PPG)-signal analysis, which is commonly performed in a threshold-sensitive way and may not be suitable for tackling the high variability of PPG signals. The goal of this work was to present an innovative machine-learning (ML) approach to ED screening that could tackle such variability. Two research hypotheses guided this work: (H1) ML can support ED screening by classifying PPG features; and (H2) classification performance can be improved when anthropometric features are also included. To investigate H1 and H2, a new dataset was built from 59 subjects, balanced in terms of subjects with and without ED. Support vector machine (SVM), random forest (RF) and k-nearest neighbors (KNN) classifiers were investigated for feature classification. With the leave-one-out evaluation protocol, the best classification results for H1 were obtained with SVM (accuracy = 71%, recall = 59%). When testing H2, the recall further improved to 67%. These results are a promising step toward developing a novel and intelligent PPG device to assist clinicians in performing large-scale, low-cost ED screening.
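
    A minimal sketch of the leave-one-out SVM evaluation described above (scikit-learn; the features below are random placeholders, not the study's PPG or anthropometric data):

        import numpy as np
        from sklearn.model_selection import LeaveOneOut, cross_val_score
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import SVC

        rng = np.random.default_rng(0)
        X = rng.normal(size=(59, 10))      # 59 subjects, 10 placeholder features
        y = rng.integers(0, 2, size=59)    # ED / non-ED labels

        clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
        scores = cross_val_score(clf, X, y, cv=LeaveOneOut())
        print(f"LOO accuracy: {scores.mean():.2f}")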

    Learned and handcrafted features for early-stage laryngeal SCC diagnosis

    Squamous cell carcinoma (SCC) is the most common and malignant laryngeal cancer. An early-stage diagnosis is of crucial importance to lower patient mortality and preserve both the laryngeal anatomy and vocal-fold function. However, this may be challenging, as the initial larynx modifications, mainly concerning the mucosa vascular tree and the epithelium texture and color, are small and can pass unnoticed to the human eye. The primary goal of this paper was to investigate a learning-based approach to early-stage SCC diagnosis, comparing the use of (i) texture-based global descriptors, such as local binary patterns, and (ii) deep-learning-based descriptors. These features, extracted from endoscopic narrow-band images of the larynx, were classified with support vector machines to discriminate healthy, precancerous and early-stage SCC tissues. When tested on a benchmark dataset, a median classification recall of 98% was obtained with the best feature combination, outperforming the state of the art (recall = 95%). Although further investigation is needed (e.g., testing on a larger dataset), the achieved results support the use of the developed methodology in clinical practice to provide accurate early-stage SCC diagnosis.
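
    A minimal sketch of local-binary-pattern feature extraction (scikit-image), i.e., one of the texture-based global descriptors compared above; the image is a random stand-in for a narrow-band laryngeal patch:

        import numpy as np
        from skimage.feature import local_binary_pattern

        patch = (np.random.rand(100, 100) * 255).astype(np.uint8)  # stand-in patch
        lbp = local_binary_pattern(patch, P=8, R=1, method="uniform")
        # The histogram of LBP codes is the global texture descriptor fed to the SVM.
        hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
        print(hist)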

    Uncertainty-Aware Organ Classification for Surgical Data Science Applications in Laparoscopy

    Objective: Surgical data science is evolving into a research field that aims to observe everything occurring within and around the treatment process to provide situation-aware, data-driven assistance. In the context of endoscopic video analysis, the accurate classification of organs in the field of view of the camera poses a technical challenge. Herein, we propose a new approach to anatomical structure classification and image tagging that features an intrinsic measure of confidence to estimate its own performance with high reliability, and which can be applied to both RGB and multispectral imaging (MI) data. Methods: Organ recognition is performed using a superpixel classification strategy based on textural and reflectance information. Classification confidence is estimated by analyzing the dispersion of class probabilities. The proposed technology is assessed through a comprehensive in vivo study with seven pigs. Results: When applied to image tagging, mean accuracy in our experiments increased from 65% (RGB) and 80% (MI) to 90% (RGB) and 96% (MI) with the confidence measure. Conclusion: The results showed that the confidence measure had a significant influence on the classification accuracy, and that MI data are better suited for anatomical structure labeling than RGB data. Significance: This work significantly enhances the state of the art in automatic labeling of endoscopic videos by introducing the confidence metric, and by being the first study to use MI data for in vivo laparoscopic tissue classification. The data of our experiments will be released as the first in vivo MI dataset upon publication of this paper.
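
    One plausible dispersion-based confidence measure in the spirit of the approach (this exact formulation is our illustration; the paper's may differ) is one minus the normalized entropy of the class-probability vector:

        import numpy as np

        def confidence(probs):
            """1 - normalized Shannon entropy: peaked probabilities -> high confidence."""
            p = probs[probs > 0]
            entropy = -(p * np.log(p)).sum() / np.log(len(probs))
            return 1.0 - entropy

        print(confidence(np.array([0.96, 0.02, 0.02])))  # peaked  -> high confidence
        print(confidence(np.array([0.34, 0.33, 0.33])))  # uniform -> near zero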

    The babyPose dataset

    The database described here contains data relevant to preterm infants' movement, acquired in neonatal intensive care units (NICUs). The data consist of 16 depth videos recorded during routine clinical practice. Each video consists of 1000 frames (i.e., 100 s). The dataset was acquired at the NICU of the Salesi Hospital, Ancona (Italy). Each frame was annotated with the limb-joint locations: twelve joints, i.e., left and right shoulder, elbow, wrist, hip, knee and ankle. The database is freely accessible at http://doi.org/10.5281/zenodo.3891404. This dataset represents a unique resource for artificial-intelligence researchers who want to develop algorithms providing healthcare professionals working in NICUs with decision support. The babyPose dataset is thus the first annotated dataset of depth images relevant to preterm infants' movement analysis.
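
    A purely illustrative loading sketch for the annotations described above; the file layout is hypothetical (check the Zenodo record for the real structure), and only the 12-joint annotation scheme follows the description:

        import numpy as np

        SIDES = ("left", "right")
        JOINTS = ("shoulder", "elbow", "wrist", "hip", "knee", "ankle")
        JOINT_NAMES = [f"{s}_{j}" for s in SIDES for j in JOINTS]  # the 12 annotated joints

        # Hypothetical layout: per frame, a depth map plus a (12, 2) array of (x, y) joints,
        # e.g. joints = np.load("video01/frame0000_joints.npy"); dummy data used here.
        joints = np.zeros((12, 2), dtype=int)
        for name, (x, y) in zip(JOINT_NAMES, joints):
            print(f"{name}: ({x}, {y})")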

    Towards realistic laparoscopic image generation using image-domain translation

    Background and objectives: Over the last decade, Deep Learning (DL) has revolutionized data analysis in many areas, including medical imaging. However, the advancement of DL in the surgery field suffers from a shortage of large-scale data, which in turn may be attributed to the lack of a structured and standardized methodology for storing and analyzing surgical images in clinical centres. Furthermore, manually adding accurate annotations is expensive and time-consuming. Great help can come from the synthesis of artificial images; in this context, in recent years, Generative Adversarial Networks (GANs) have achieved promising results in generating photo-realistic images. Methods: In this study, a method for Minimally Invasive Surgery (MIS) image synthesis is proposed. To this aim, the generative adversarial network pix2pix is trained to generate paired annotated MIS images by transforming rough segmentations of surgical instruments and tissues into realistic images. An additional regularization term was added to the original optimization problem in order to enhance the realism of surgical tools with respect to the background. Results: Quantitative and qualitative (i.e., human-based) evaluations of the generated images were carried out to assess the effectiveness of the method. Conclusions: Experimental results show that the proposed method is able to translate MIS segmentations into realistic MIS images, which can in turn be used to augment existing datasets and help overcome the lack of useful images; this allows physicians and algorithms to benefit from new annotated instances for their training.
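
    A sketch of the kind of additional regularization term described above (PyTorch; the weighting scheme and coefficients are our assumptions, as the exact term is not specified here): an extra L1 penalty restricted to surgical-tool pixels, so tools are reconstructed more faithfully than the background.

        import torch.nn.functional as F

        def generator_loss(fake, real, tool_mask, adv_loss,
                           lambda_l1=100.0, lambda_tool=10.0):
            """pix2pix-style generator loss: adversarial + L1 + tool-weighted L1."""
            l1 = F.l1_loss(fake, real)
            # Up-weight the per-pixel error inside the instrument segmentation mask.
            tool_l1 = (tool_mask * (fake - real).abs()).sum() / tool_mask.sum().clamp(min=1)
            return adv_loss + lambda_l1 * l1 + lambda_tool * tool_l1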