4 research outputs found

    Anatomically Constrained Video-CT Registration via the V-IMLOP Algorithm

    Functional endoscopic sinus surgery (FESS) is a surgical procedure used to treat acute cases of sinusitis and other sinus diseases. FESS is fast becoming the preferred choice of treatment due to its minimally invasive nature. However, due to the limited field of view of the endoscope, surgeons rely on navigation systems to guide them within the nasal cavity. State of the art navigation systems report registration accuracy of over 1mm, which is large compared to the size of the nasal airways. We present an anatomically constrained video-CT registration algorithm that incorporates multiple video features. Our algorithm is robust in the presence of outliers. We also test our algorithm on simulated and in-vivo data, and test its accuracy against degrading initializations. Comment: 8 pages, 4 figures, MICCA

    Deformable registration using shape statistics with applications in sinus surgery

    Evaluating anatomical variations in structures like the nasal passage and sinuses is challenging because their complexity can often make it difficult to differentiate normal and abnormal anatomy. By statistically modeling these variations and estimating individual patient anatomy using these models, quantitative estimates of similarity or dissimilarity between the patient and the sample population can be made. In order to do this, a spatial alignment, or registration, between patient anatomy and the statistical model must first be computed. In this dissertation, a deformable most likely point paradigm is introduced that incorporates statistical variations into probabilistic feature-based registration algorithms. This paradigm is a variant of the most likely point paradigm, which incorporates feature uncertainty into the registration process. The deformable registration algorithms optimize the probability of feature alignment as well as the probability of model deformation, allowing statistical models of anatomy to estimate, for instance, structures seen in endoscopic video without the need for patient-specific computed tomography (CT) scans. The probabilistic framework also enables the algorithms to assess the quality of registrations produced, allowing users to know when an alignment can be trusted. This dissertation covers three algorithms built within this paradigm and evaluated in simulation and in-vivo experiments.

    Endoscopic Motion Estimation using Video and CT

    Functional Endoscopic Sinus Surgery (FESS) is a surgical procedure that otolaryngologists have adopted to treat sinus diseases. Aiming for accurate treatment and fewer complications, surgeons are usually guided by an endoscopic navigation system when performing the surgery. State-of-the-art navigation systems report submillimeter positioning error. This significantly reduces intraoperative time and improves surgical outcomes. Navigating an endoscope is similar to Visual Odometry (VO) or Simultaneous Localization and Mapping (SLAM), all of which require estimating camera poses and motions in an unknown environment. Feature-based methods and direct methods are two common approaches to motion estimation in VO and Visual SLAM, but both have drawbacks. Feature computation and feature extraction are usually not computationally efficient, while direct methods suffer from local optima. One recent alternative, the semi-direct (or hybrid) method, overcomes these drawbacks by applying the optimization used in direct methods to selected features. In this work, we introduce a novel endoscopic navigation system for FESS which uses both a prescanned CT model and 2D endoscopic video. The system is able to texture map the CT model in real time for visualization and refine the pose estimation of the endoscope from different prior estimates.

    Towards Quantitative Endoscopy with Vision Intelligence

    In this thesis, we work on topics related to quantitative endoscopy with vision-based intelligence. Specifically, our work revolves around video reconstruction in endoscopy, where challenges such as texture scarceness, illumination variation, and multimodality prevent prior methods from working effectively and robustly. To this end, we propose to combine the expressivity of deep learning approaches with the rigor and accuracy of non-linear optimization algorithms to develop a series of methods that confront these challenges towards quantitative endoscopy. We first propose a retrospective sparse reconstruction method that can estimate a high-accuracy, high-density point cloud and a high-completeness camera trajectory from a monocular endoscopic video with state-of-the-art performance. To enable this, a deep image feature descriptor is developed to replace the hand-crafted local descriptor and boost the feature matching performance in a typical sparse reconstruction algorithm. A retrospective surface reconstruction pipeline is then proposed to estimate a textured surface model from a monocular endoscopic video, involving self-supervised depth and descriptor learning and a surface fusion technique. We show that the proposed method outperforms a popular dense reconstruction method and that the estimated reconstructions are in good agreement with the surface models obtained from CT scans. To align video-reconstructed surface models with pre-operative imaging such as CT, we introduce a global point cloud registration algorithm that is robust to the resolution mismatch that often occurs in such multi-modal scenarios. Specifically, a geometric feature descriptor is developed in which a novel network normalization technique helps a 3D network produce more consistent and distinctive geometric features for samples with different resolutions. The proposed geometric descriptor achieves state-of-the-art performance, based on our evaluation. Last but not least, a real-time SLAM system that estimates surface geometry and camera trajectory from a monocular endoscopic video is developed, using deep representations for geometry and appearance and non-linear factor graph optimization. We show that the proposed SLAM system performs favorably compared with a state-of-the-art feature-based SLAM system.