8 research outputs found

    Reconstruction of Three-Dimensional Object from Two-Dimensional Images by Utilizing Distance Regularized Level Algorithm and Mesh Object Generation

    Get PDF
    Three-dimensional (3D) reconstruction from images is a highly beneficial method of object regeneration using photo-realistic imagery that can be applied in many fields. In industrial settings, it can be used to visualize cracks within alloys or walls. In medical settings, it has been used as a 3D scanner to reconstruct human organs, for example the internal nose for plastic surgery or the ear canal for fabricating a hearing aid device. These applications require high accuracy in both detail and measurement, which is the main issue to be taken into consideration; cost, portability, and ease of use are further considerations. This work presents a theoretical and experimental approach for designing and constructing a low-cost three-dimensional object scanner. We propose a 3D canal (ear or nose) reconstruction system that uses 2D images to rebuild the 3D object. A low-cost endoscope is combined with a proposed program based on the Distance Regularized Level Set segmentation algorithm, which segments active edges from the images captured by the endoscope in real time and then generates a mesh object, producing a 3D structure for small canals or cracks. The results show good accuracy of the reconstructed object in both its details and its measurements, which is attributable to the success of the proposed reconstruction algorithm and software in yielding a good three-dimensional mesh of the canal.
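    The pipeline the abstract describes (segment each endoscope frame with an edge-based level-set method, then turn the stacked segmentations into a mesh) can be sketched as follows. This is a minimal illustration only: scikit-image's morphological geodesic active contour stands in for the Distance Regularized Level Set algorithm named in the paper, the slice spacing and pixel size are made-up calibration values, and the function names are not from the authors' software.

```python
# Minimal sketch: segment the canal lumen in each endoscope frame with an edge-based
# level-set method, stack the binary masks into a volume, and extract a triangle mesh.
# Morphological geodesic active contours are used here as a stand-in for the
# Distance Regularized Level Set algorithm described in the paper.
import numpy as np
from skimage import img_as_float
from skimage.color import rgb2gray
from skimage.segmentation import (inverse_gaussian_gradient,
                                  morphological_geodesic_active_contour,
                                  disk_level_set)
from skimage.measure import marching_cubes


def segment_frame(frame_rgb, num_iter=200):
    """Return a binary mask of the canal lumen in one frame (assumed roughly centered)."""
    gray = rgb2gray(img_as_float(frame_rgb))
    edge_map = inverse_gaussian_gradient(gray, alpha=100.0, sigma=3.0)
    init = disk_level_set(gray.shape)            # circular initial contour
    mask = morphological_geodesic_active_contour(
        edge_map, num_iter, init_level_set=init, smoothing=2, balloon=-1)
    return mask.astype(np.uint8)


def frames_to_mesh(frames, slice_spacing_mm=1.0, pixel_size_mm=0.1):
    """Stack per-frame masks into a volume and extract a surface mesh.
    slice_spacing_mm and pixel_size_mm are illustrative calibration values."""
    volume = np.stack([segment_frame(f) for f in frames], axis=0)
    verts, faces, normals, _ = marching_cubes(
        volume.astype(np.float32), level=0.5,
        spacing=(slice_spacing_mm, pixel_size_mm, pixel_size_mm))
    return verts, faces, normals
```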

    Towards Quantitative Endoscopy with Vision Intelligence

    Get PDF
    In this thesis, we work on topics related to quantitative endoscopy with vision-based intelligence. Specifically, our works revolve around video reconstruction in endoscopy, where many challenges exist, such as texture scarceness, illumination variation, and multimodality, which prevent prior works from operating effectively and robustly. To this end, we propose to combine the expressivity of deep learning approaches with the rigor and accuracy of non-linear optimization algorithms to develop a series of methods that confront these challenges towards quantitative endoscopy. We first propose a retrospective sparse reconstruction method that estimates a high-accuracy, high-density point cloud and a high-completeness camera trajectory from a monocular endoscopic video with state-of-the-art performance. To enable this, a deep image feature descriptor is developed to replace hand-crafted local descriptors and boost feature matching performance in a typical sparse reconstruction algorithm. A retrospective surface reconstruction pipeline is then proposed to estimate a textured surface model from a monocular endoscopic video, involving self-supervised depth and descriptor learning and a surface fusion technique. We show that the proposed method performs better than a popular dense reconstruction method and that the estimated reconstructions are in good agreement with surface models obtained from CT scans. To align video-reconstructed surface models with pre-operative imaging such as CT, we introduce a global point cloud registration algorithm that is robust to the resolution mismatch that often occurs in such multi-modal scenarios. Specifically, a geometric feature descriptor is developed in which a novel network normalization technique helps a 3D network produce more consistent and distinctive geometric features for samples with different resolutions. The proposed geometric descriptor achieves state-of-the-art performance in our evaluation. Last but not least, a real-time SLAM system that estimates surface geometry and camera trajectory from a monocular endoscopic video is developed, using deep representations for geometry and appearance together with non-linear factor graph optimization. We show that the proposed SLAM system performs favorably compared with a state-of-the-art feature-based SLAM system.
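    As a rough illustration of the descriptor-matching step mentioned above (a learned dense descriptor replacing a hand-crafted one inside a sparse reconstruction pipeline), the sketch below samples a dense descriptor map at keypoint locations in two frames and keeps mutual nearest neighbours that pass a ratio test. The descriptor network is treated as a black box and every name here is a placeholder, not the model or code from the thesis.

```python
# Illustrative sketch of descriptor-based feature matching as used in sparse
# reconstruction pipelines: sample a (hypothetical) dense descriptor map at keypoint
# locations in two frames, then keep mutual nearest neighbours that pass a ratio test.
import numpy as np


def sample_descriptors(desc_map, keypoints):
    """desc_map: (H, W, C) dense descriptors; keypoints: (N, 2) integer (x, y) pixels."""
    xs, ys = keypoints[:, 0], keypoints[:, 1]
    d = desc_map[ys, xs]                          # (N, C) descriptors at the keypoints
    return d / (np.linalg.norm(d, axis=1, keepdims=True) + 1e-8)


def mutual_nn_matches(desc_a, desc_b, ratio=0.9):
    """Return index pairs (i, j) that are mutual nearest neighbours under cosine similarity."""
    sim = desc_a @ desc_b.T                       # (Na, Nb), descriptors are L2-normalised
    nn_ab = sim.argmax(axis=1)                    # best match in B for each A
    nn_ba = sim.argmax(axis=0)                    # best match in A for each B
    matches = []
    for i, j in enumerate(nn_ab):
        if nn_ba[j] != i:
            continue                              # not mutual
        row = np.sort(sim[i])[::-1]
        if len(row) > 1 and row[1] / (row[0] + 1e-8) > ratio:
            continue                              # ambiguous match (Lowe-style ratio test)
        matches.append((i, j))
    return np.array(matches)
```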

    A comprehensive survey on recent deep learning-based methods applied to surgical data

    Full text link
    Minimally invasive surgery is highly operator dependent, with lengthy procedural times causing fatigue to the surgeon and risks to patients such as injury to organs, infection, bleeding, and complications of anesthesia. To mitigate such risks, real-time systems that provide intra-operative guidance to surgeons need to be developed. For example, an automated system for tool localization, tool (or tissue) tracking, and depth estimation can enable a clear understanding of surgical scenes, preventing miscalculations during surgical procedures. In this work, we present a systematic review of recent machine learning-based approaches, including surgical tool localization, segmentation, tracking, and 3D scene perception. Furthermore, we provide a detailed overview of publicly available benchmark datasets widely used for surgical navigation tasks. While recent deep learning architectures have shown promising results, there are still several open research problems, such as the lack of annotated datasets, the presence of artifacts in surgical scenes, and non-textured surfaces that hinder 3D reconstruction of anatomical structures. Based on our comprehensive review, we present a discussion on current gaps and the steps needed to improve the adaptation of technology in surgery. Comment: This paper is to be submitted to the International Journal of Computer Vision.
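    For readers unfamiliar with the tasks the survey covers, the toy sketch below shows the input/output structure of binary surgical-tool segmentation: an endoscopic frame goes in, a per-pixel tool probability map comes out. The tiny encoder-decoder is invented for illustration and is not an architecture from any of the surveyed papers.

```python
# Toy illustration of binary surgical-tool segmentation: a tiny encoder-decoder
# producing a per-pixel tool/background probability map. Purely schematic; not an
# architecture from the surveyed literature.
import torch
import torch.nn as nn


class TinyToolSegNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 2, stride=2))

    def forward(self, x):                  # x: (B, 3, H, W) endoscopic frames
        return torch.sigmoid(self.decoder(self.encoder(x)))  # (B, 1, H, W) tool probability


if __name__ == "__main__":
    net = TinyToolSegNet()
    frames = torch.rand(2, 3, 256, 256)    # dummy batch standing in for surgical video frames
    masks = (net(frames) > 0.5).float()    # thresholded tool masks
    print(masks.shape)                     # torch.Size([2, 1, 256, 256])
```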

    Surgical Subtask Automation for Intraluminal Procedures using Deep Reinforcement Learning

    Get PDF
    Intraluminal procedures have opened up a new sub-field of minimally invasive surgery that uses flexible instruments to navigate through complex luminal structures of the body, resulting in reduced invasiveness and improved patient benefits. One of the major challenges in this field is the accurate and precise control of the instrument inside the human body. Robotics has emerged as a promising solution to this problem. However, to achieve successful robotic intraluminal interventions, the control of the instrument needs to be automated to a large extent. The thesis first examines the state of the art in intraluminal surgical robotics and identifies the key challenges in this field, which include the need for safe and effective tool manipulation and the ability to adapt to unexpected changes in the luminal environment. To address these challenges, the thesis proposes several levels of autonomy that enable the robotic system to perform individual subtasks autonomously while still allowing the surgeon to retain overall control of the procedure. The approach facilitates the development of specialized algorithms, such as Deep Reinforcement Learning (DRL), for subtasks like navigation and tissue manipulation to produce robust surgical gestures. Additionally, the thesis proposes a safety framework that provides formal guarantees to prevent risky actions. The presented approaches are evaluated through a series of experiments using simulation and robotic platforms. The experiments demonstrate that subtask automation can improve the accuracy and efficiency of tool positioning and tissue manipulation, while also reducing the cognitive load on the surgeon. The results of this research have the potential to improve the reliability and safety of intraluminal surgical interventions, ultimately leading to better outcomes for patients and surgeons.
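    The agent-environment loop behind the subtask automation described above can be illustrated with a deliberately simple example: tabular Q-learning on an invented 1D "keep the tip on the lumen centreline" task. The environment, reward, and hyperparameters are made up for illustration and stand in for the deep reinforcement learning methods and simulation platforms used in the thesis.

```python
# Minimal sketch of the reinforcement-learning loop behind subtask automation, on an
# invented toy task: keep a flexible tip centred in a 1D "lumen" that drifts randomly.
# Tabular Q-learning stands in for the deep RL methods used in the thesis.
import random

N_POSITIONS = 11                 # discretised offset of the tip from the lumen centreline
ACTIONS = (-1, 0, +1)            # steer left, hold, steer right
q_table = [[0.0] * len(ACTIONS) for _ in range(N_POSITIONS)]
alpha, gamma, epsilon = 0.1, 0.95, 0.1


def step(pos, action_idx):
    """Toy dynamics: the chosen action moves the tip, and the lumen drifts occasionally."""
    drift = random.choice((-1, 0, 0, 1))
    new_pos = min(max(pos + ACTIONS[action_idx] + drift, 0), N_POSITIONS - 1)
    reward = -abs(new_pos - N_POSITIONS // 2)     # penalise distance from the centreline
    return new_pos, reward


for episode in range(2000):
    pos = random.randrange(N_POSITIONS)
    for _ in range(50):
        if random.random() < epsilon:
            a = random.randrange(len(ACTIONS))    # explore
        else:
            a = max(range(len(ACTIONS)), key=lambda i: q_table[pos][i])  # exploit
        new_pos, r = step(pos, a)
        best_next = max(q_table[new_pos])
        q_table[pos][a] += alpha * (r + gamma * best_next - q_table[pos][a])
        pos = new_pos

# After training, the greedy policy steers the tip back towards the centre.
print([max(range(len(ACTIONS)), key=lambda i: q_table[s][i]) for s in range(N_POSITIONS)])
```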

    Scene Reconstruction Beyond Structure-from-Motion and Multi-View Stereo

    Get PDF
    Image-based 3D reconstruction has become a robust technology for recovering accurate and realistic models of real-world objects and scenes. A common pipeline for 3D reconstruction is to first apply Structure-from-Motion (SfM), which recovers relative poses for the input images and sparse geometry for the scene, and then apply Multi-view Stereo (MVS), which estimates a dense depthmap for each image. While this two-stage process is quite effective in many 3D modeling scenarios, there are limits to what can be reconstructed. This dissertation focuses on three particular scenarios where the SfM+MVS pipeline fails and introduces new approaches to accomplish each reconstruction task. First, I introduce a novel method to recover dense surface reconstructions of endoscopic video. In this setting, SfM can generally provide sparse surface structure, but the lack of surface texture as well as complex, changing illumination often causes MVS to fail. To overcome these difficulties, I introduce a method that utilizes SfM both to guide surface reflectance estimation and to regularize shading-based depth reconstruction. I also introduce models of reflectance and illumination that improve the final result. Second, I introduce an approach for augmenting 3D reconstructions from large-scale Internet photo-collections by recovering the 3D position of transient objects --- specifically, people --- in the input imagery. Since no two images can be assumed to capture the same person in the same location, the typical triangulation constraints enjoyed by SfM and MVS cannot be directly applied. I introduce an alternative method to approximately triangulate people who stood in similar locations, aided by a height distribution prior and visibility constraints provided by SfM. The scale of the scene, gravity direction, and per-person ground-surface normals are also recovered. Finally, I introduce the concept of using crowd-sourced imagery to create living 3D reconstructions --- visualizations of real places that include dynamic representations of transient objects. A key difficulty here is that SfM+MVS pipelines often poorly reconstruct ground surfaces given Internet images. To address this, I introduce a volumetric reconstruction approach that leverages scene scale and person placements. Crowd simulation is then employed to add virtual pedestrians to the space and bring the reconstruction "to life."
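    The second contribution relies on single-view geometry combined with a prior over human heights. A much-simplified version of that building block is sketched below for a level camera at known height over a flat ground plane: the foot and head pixel rows give a depth and an apparent height, and the apparent height can be scored against a Gaussian height prior. The numbers and the level-camera assumption are illustrative only; the dissertation's actual formulation also uses SfM-derived scale, gravity, and visibility constraints.

```python
# Geometric building block behind placing people with a height prior: for a level
# pinhole camera at known height above a flat ground plane, the foot and head pixel
# rows give both the person's depth and an apparent height, which can be scored
# against a prior over human heights. Simplified case, not the paper's pipeline.
import math


def place_person(v_foot, v_head, v_horizon, focal_px, cam_height_m):
    """Depth to the person's feet and their apparent height (level camera, flat ground)."""
    depth = focal_px * cam_height_m / (v_foot - v_horizon)
    height = cam_height_m * (v_foot - v_head) / (v_foot - v_horizon)
    return depth, height


def height_prior_logp(height_m, mean=1.70, std=0.10):
    """Log-probability of the apparent height under a Gaussian prior over adult heights."""
    return -0.5 * ((height_m - mean) / std) ** 2 - math.log(std * math.sqrt(2 * math.pi))


# Example with made-up detections: feet at image row 620, head at row 380, horizon at
# row 360, focal length 1000 px, camera 1.6 m above the ground.
depth, height = place_person(620, 380, 360, 1000, 1.6)
print(f"depth ~ {depth:.1f} m, apparent height ~ {height:.2f} m, "
      f"prior log-prob {height_prior_logp(height):.2f}")
```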

    Proceedings of the 2018 Canadian Society for Mechanical Engineering (CSME) International Congress

    Get PDF
    Published proceedings of the 2018 Canadian Society for Mechanical Engineering (CSME) International Congress, hosted by York University, 27-30 May 2018.