Eye Tracking for Tele-robotic Surgery: A Comparative Evaluation of Head-worn Solutions
Purpose: Metrics derived from eye-gaze-tracking and pupillometry show promise
for cognitive load assessment, potentially enhancing training and patient
safety through user-specific feedback in tele-robotic surgery. However, current
eye-tracking solutions' effectiveness in tele-robotic surgery is uncertain
compared to everyday situations due to close-range interactions causing extreme
pupil angles and occlusions. To assess the effectiveness of modern
eye-gaze-tracking solutions in tele-robotic surgery, we compare the Tobii Pro
Glasses 3 and the Pupil Labs Core, evaluating their pupil diameter and gaze stability
when integrated with the da Vinci Research Kit (dVRK). Methods: The study
protocol comprises a nine-point gaze calibration followed by a pick-and-place
task using the dVRK, repeated three times. After a final calibration, users
view a 3x3 grid of AprilTags, focusing on each marker for 10 seconds, to
evaluate gaze stability across dVRK-screen positions with the L2-norm.
Comparing gaze errors under different calibrations assesses the temporal
deterioration of the calibration due to head movements. Pupil diameter stability
is evaluated by applying the FFT to the pupil-diameter signal recorded during
the pick-and-place tasks. Users perform this routine with
both head-worn eye-tracking systems. Results: Data collected from ten users
indicate comparable pupil diameter stability. FFTs of pupil diameters show
similar amplitudes in high-frequency components. The Tobii Glasses show greater
temporal gaze stability than the Pupil Labs Core, though both eye trackers yield
a similar 4 cm gaze-estimation error when the calibration is up to date.
Conclusion: Both eye trackers demonstrate similar stability of the pupil
diameter and gaze when the calibration is up to date, indicating comparable
eye-tracking and pupillometry performance in tele-robotic surgery settings.
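The two stability metrics described above can be sketched in a few lines of numpy. This is a minimal illustration under assumed conventions (gaze points and fixation targets in screen coordinates, pupil diameter sampled at a fixed rate); the function and parameter names are hypothetical, not taken from the study's code.

```python
import numpy as np

def gaze_stability_error(gaze_xy, target_xy):
    """Mean L2-norm distance (e.g. in cm on the dVRK screen) between
    recorded gaze points and the corresponding fixation targets."""
    diffs = np.asarray(gaze_xy, dtype=float) - np.asarray(target_xy, dtype=float)
    return float(np.mean(np.linalg.norm(diffs, axis=1)))

def pupil_fft_amplitudes(pupil_diameter_mm, sample_rate_hz):
    """Amplitude spectrum of the mean-removed pupil-diameter signal;
    high-frequency components reflect measurement noise rather than
    genuine pupillary response."""
    signal = np.asarray(pupil_diameter_mm, dtype=float)
    signal = signal - signal.mean()
    amps = np.abs(np.fft.rfft(signal)) / len(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate_hz)
    return freqs, amps
```

Comparing the high-frequency portion of the two trackers' amplitude spectra then gives a device-agnostic proxy for pupillometry noise.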
A Quantitative Evaluation of Dense 3D Reconstruction of Sinus Anatomy from Monocular Endoscopic Video
Generating accurate 3D reconstructions from endoscopic video is a promising
avenue for longitudinal radiation-free analysis of sinus anatomy and surgical
outcomes. Several methods for monocular reconstruction have been proposed,
yielding visually pleasing 3D anatomical structures by retrieving relative
camera poses with structure-from-motion-type algorithms and fusion of monocular
depth estimates. However, due to the complex properties of the underlying
algorithms and endoscopic scenes, the reconstruction pipeline may perform
poorly or fail unexpectedly. Further, acquiring medical data presents additional
challenges, making it difficult to quantitatively benchmark these models,
understand failure cases, and identify the critical components that
contribute to their precision. In this work, we perform a quantitative analysis
of a self-supervised approach for sinus reconstruction using endoscopic
sequences paired with optical tracking and high-resolution computed tomography
acquired from nine ex-vivo specimens. Our results show that the generated
reconstructions are in high agreement with the anatomy, yielding an average
point-to-mesh error of 0.91 mm between reconstructions and CT segmentations.
However, in a point-to-point matching scenario, relevant for endoscope tracking
and navigation, we found average target registration errors of 6.58 mm. We
identified that pose and depth estimation inaccuracies contribute equally to
this error and that locally consistent sequences with shorter trajectories
generate more accurate reconstructions. These results suggest that achieving
global consistency between relative camera poses and estimated depths with the
anatomy is essential. In doing so, we can ensure proper synergy between all
components of the pipeline for improved reconstructions that will facilitate
clinical application of this innovative technology.
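The two error measures reported above can be sketched as follows. This is an illustrative approximation with hypothetical names: the paper's point-to-mesh error is computed against CT segmentation surfaces, whereas this sketch uses a nearest-vertex shortcut, which upper-bounds the true point-to-triangle distance.

```python
import numpy as np

def target_registration_error(pred_pts, gt_pts):
    """Mean Euclidean distance (mm) between corresponding point pairs,
    the point-to-point measure relevant for endoscope tracking."""
    diffs = np.asarray(pred_pts, dtype=float) - np.asarray(gt_pts, dtype=float)
    return float(np.linalg.norm(diffs, axis=1).mean())

def point_to_mesh_error(points, mesh_vertices):
    """Nearest-vertex approximation of the point-to-mesh error: for each
    reconstructed point, the distance to the closest mesh vertex."""
    pts = np.asarray(points, dtype=float)[:, None, :]          # (N, 1, 3)
    verts = np.asarray(mesh_vertices, dtype=float)[None, :, :]  # (1, M, 3)
    dists = np.linalg.norm(pts - verts, axis=2)                 # (N, M)
    return float(dists.min(axis=1).mean())
```

The gap between the two metrics (0.91 mm vs. 6.58 mm in the paper) arises because surface agreement tolerates drift along the anatomy, while point-to-point matching does not.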
An objective comparison of detection and segmentation algorithms for artefacts in clinical endoscopy
We present a comprehensive analysis of the submissions to the first edition of the Endoscopy Artefact Detection (EAD) challenge. Using crowd-sourcing, this initiative is a step towards understanding the limitations of existing state-of-the-art computer vision methods applied to endoscopy and promoting the development of new approaches suitable for clinical translation. Endoscopy is a routine imaging technique for the detection, diagnosis and treatment of diseases in hollow organs: the esophagus, stomach, colon, uterus and bladder. However, the nature of these organs prevents imaged tissues from being free of imaging artefacts such as bubbles, pixel saturation, organ specularity and debris, all of which pose substantial challenges for any quantitative analysis. Consequently, the potential for improved clinical outcomes through quantitative assessment of abnormal mucosal surfaces observed in endoscopy videos is presently not fully realized. The EAD challenge promotes awareness of, and addresses, this key bottleneck by investigating methods that can accurately classify, localize and segment artefacts in endoscopy frames as critical prerequisite tasks. Using a diverse, curated, multi-institutional, multi-modality, multi-organ dataset of video frames, the accuracy and performance of 23 algorithms were objectively ranked for artefact detection and segmentation. The ability of methods to generalize to unseen datasets was also evaluated. The best-performing methods (top 15%) propose deep learning strategies to reconcile variabilities in artefact appearance with respect to size, modality, occurrence and organ type. However, no single method outperformed the others across all tasks. Detailed analyses reveal the shortcomings of current training strategies and highlight the need for new, optimal metrics to accurately quantify the clinical applicability of methods.
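Detection challenges of this kind typically score predicted bounding boxes against annotations by intersection-over-union (IoU). A minimal sketch of that standard matching criterion (the function name and box convention are illustrative, not taken from the EAD evaluation code):

```python
def box_iou(a, b):
    """Intersection-over-union of two axis-aligned boxes given as
    (x1, y1, x2, y2); a detection is usually counted as a true positive
    when its IoU with an annotated artefact exceeds a threshold."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])  # intersection corners
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)
```

Ranking methods across IoU thresholds, classes, and organs is what makes an objective multi-task comparison like this challenge possible.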
Deep learning for detection and segmentation of artefact and disease instances in gastrointestinal endoscopy
The Endoscopy Computer Vision Challenge (EndoCV) is a crowd-sourcing initiative to address prominent problems in developing reliable computer-aided detection and diagnosis systems for endoscopy and to suggest a pathway for clinical translation of these technologies. Whilst endoscopy is a widely used diagnostic and treatment tool for hollow organs, endoscopists often face several core challenges, mainly: 1) the presence of multi-class artefacts that hinder visual interpretation, and 2) difficulty in identifying subtle precancerous precursors and cancer abnormalities. Artefacts often affect the robustness of deep learning methods applied to the gastrointestinal tract, as they can be confused with tissue of interest. The EndoCV2020 challenges are designed to address research questions in these remits. In this paper, we present a summary of the methods developed by the top 17 teams and provide an objective comparison of state-of-the-art methods and methods designed by the participants for two sub-challenges: i) artefact detection and segmentation (EAD2020), and ii) disease detection and segmentation (EDD2020). Multi-center, multi-organ, multi-class, and multi-modal clinical endoscopy datasets were compiled for both the EAD2020 and EDD2020 sub-challenges. The out-of-sample generalization ability of detection algorithms was also evaluated. Whilst most teams focused on accuracy improvements, only a few methods hold credibility for clinical usability. The best-performing teams provided solutions to tackle class imbalance and variabilities in size, origin, modality and occurrence by exploring data augmentation, data fusion, and optimal class thresholding techniques. [Abstract copyright: Copyright © 2021 The Authors. Published by Elsevier B.V. All rights reserved.]
An uncertainty-driven GCN refinement strategy for organ segmentation
Organ segmentation in CT volumes is an important pre-processing step in many computer-assisted intervention and diagnosis methods. In recent years, convolutional neural networks have dominated the state of the art in this task. However, since this problem presents a challenging environment due to high variability in the organ's shape and similarity between tissues, the generation of false negative and false positive regions in the output segmentation is a common issue. Recent works have shown that the uncertainty analysis of the model can provide us with useful information about potential errors in the segmentation. In this context, we propose a segmentation refinement method based on uncertainty analysis and graph convolutional networks. We employ the uncertainty levels of the convolutional network on a particular input volume to formulate a semi-supervised graph learning problem that is solved by training a graph convolutional network. To test our method we refine the initial output of a 2D U-Net. We validate our framework with the NIH pancreas dataset and the spleen dataset of the Medical Segmentation Decathlon. We show that our method outperforms the state-of-the-art CRF refinement method, improving the Dice score by 1% for the pancreas and 2% for the spleen with respect to the original U-Net's prediction. Finally, we perform a sensitivity analysis on the parameters of our proposal and discuss the applicability to other CNN architectures, the results, and current limitations of the model for future work in this research direction. For reproducibility purposes, we make our code publicly available at https://github.com/rodsom22/gcn_refinement
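The core idea of turning CNN uncertainty into a semi-supervised graph problem can be sketched as follows. This is a simplified illustration with hypothetical names, assuming binary segmentation and predictive entropy from several stochastic forward passes (e.g. Monte Carlo dropout); the actual repository may differ in both uncertainty measure and graph construction.

```python
import numpy as np

def entropy_uncertainty(prob_maps):
    """Voxel-wise predictive entropy from T stochastic forward passes;
    prob_maps has shape (T, ...) with foreground probabilities."""
    p = np.clip(np.mean(prob_maps, axis=0), 1e-7, 1 - 1e-7)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

def split_nodes(probs, uncertainty, thresh):
    """Keep the CNN label on low-uncertainty voxels and mark
    high-uncertainty voxels as unlabeled graph nodes (-1), which the
    GCN then resolves via semi-supervised learning."""
    labels = (probs >= 0.5).astype(int)
    labels[uncertainty > thresh] = -1
    return labels
```

The refinement then only has to decide the contested, high-entropy regions, which is where the false positives and false negatives described above concentrate.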