
    Optical techniques for 3D surface reconstruction in computer-assisted laparoscopic surgery

    One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-operative morphology and motion of soft tissues. This information is a prerequisite to the registration of multi-modal patient-specific data, both for enhancing the surgeon’s navigation capabilities by observing beyond exposed tissue surfaces and for providing intelligent control of robotic-assisted instruments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This paper reviews the state-of-the-art methods for optical intra-operative 3D reconstruction in laparoscopic surgery and discusses the technical challenges and future perspectives towards clinical translation. With the recent paradigm shift of surgical practice towards MIS and new developments in 3D optical imaging, this is a timely discussion about technologies that could facilitate complex CAS procedures in dynamic and deformable anatomical regions.

    Model Adaptation with Synthetic and Real Data for Semantic Dense Foggy Scene Understanding

    This work addresses the problem of semantic scene understanding under dense fog. Although considerable progress has been made in semantic scene understanding, it is mainly related to clear-weather scenes. Extending recognition methods to adverse weather conditions such as fog is crucial for outdoor applications. In this paper, we propose a novel method, named Curriculum Model Adaptation (CMAda), which gradually adapts a semantic segmentation model from light synthetic fog to dense real fog in multiple steps, using both synthetic and real foggy data. In addition, we present three other main stand-alone contributions: 1) a novel method to add synthetic fog to real, clear-weather scenes using semantic input; 2) a new fog density estimator; 3) the Foggy Zurich dataset comprising 3808 real foggy images, with pixel-level semantic annotations for 16 images with dense fog. Our experiments show that 1) our fog simulation slightly outperforms a state-of-the-art competing simulation with respect to the task of semantic foggy scene understanding (SFSU); 2) CMAda improves the performance of state-of-the-art models for SFSU significantly by leveraging unlabeled real foggy data. The datasets and code are publicly available. Comment: final version, ECCV 201

    How to Train a CAT: Learning Canonical Appearance Transformations for Direct Visual Localization Under Illumination Change

    Direct visual localization has recently enjoyed a resurgence in popularity with the increasing availability of cheap mobile computing power. The competitive accuracy and robustness of these algorithms compared to state-of-the-art feature-based methods, as well as their natural ability to yield dense maps, make them an appealing choice for a variety of mobile robotics applications. However, direct methods remain brittle in the face of appearance change due to their underlying assumption of photometric consistency, which is commonly violated in practice. In this paper, we propose to mitigate this problem by training deep convolutional encoder-decoder models to transform images of a scene such that they correspond to a previously seen canonical appearance. We validate our method in multiple environments and illumination conditions using high-fidelity synthetic RGB-D datasets, and integrate the trained models into a direct visual localization pipeline, yielding improvements in visual odometry (VO) accuracy through time-varying illumination conditions, as well as improved metric relocalization performance under illumination change, where conventional methods normally fail. We further provide a preliminary investigation of transfer learning from synthetic to real environments in a localization context. An open-source implementation of our method using PyTorch is available at https://github.com/utiasSTARS/cat-net. Comment: In IEEE Robotics and Automation Letters (RA-L) and presented at the IEEE International Conference on Robotics and Automation (ICRA'18), Brisbane, Australia, May 21-25, 201

    Visual Perception and Cognition in Image-Guided Intervention

    Surgical image visualization and interaction systems can dramatically affect the efficacy and efficiency of surgical training, planning, and interventions. This is even more profound in the case of minimally-invasive surgery, where restricted access to the operative field in conjunction with limited field of view necessitates a visualization medium to provide patient-specific information at any given moment. Unfortunately, little research has been devoted to studying human factors associated with medical image displays, and the need for robust, intuitive visualization and interaction interfaces has remained largely unfulfilled to this day. Failure to engineer efficient medical solutions and design intuitive visualization interfaces is argued to be one of the major barriers to the meaningful transfer of innovative technology to the operating room. This thesis was, therefore, motivated by the need to study various cognitive and perceptual aspects of human factors in surgical image visualization systems, to increase the efficiency and effectiveness of medical interfaces, and ultimately to improve patient outcomes. To this end, we chose four different minimally-invasive interventions in the realm of surgical training, planning, training for planning, and navigation. The first chapter involves the use of stereoendoscopes to reduce morbidity in endoscopic third ventriculostomy. The results of this study suggest that, compared with conventional endoscopes, the detection of the basilar artery on the surface of the third ventricle can be facilitated with the use of stereoendoscopes, increasing the safety of targeting in third ventriculostomy procedures. In the second chapter, a contour enhancement technique is described to improve preoperative planning of arteriovenous malformation interventions. The proposed method, particularly when combined with stereopsis, is shown to increase the speed and accuracy of understanding the spatial relationship between vascular structures. In the third chapter, an augmented-reality system is proposed to facilitate the training of planning brain tumour resection. The results of our user study indicate that the proposed system improves subjects' performance, particularly novices', in formulating the optimal point of entry and surgical path independent of the sensorimotor tasks performed. In the last chapter, the role of fully-immersive simulation environments on the surgeons' non-technical skills in performing vertebroplasty procedures is investigated. Our results suggest that while training surgeons may increase their technical skills, the introduction of crisis scenarios significantly disturbs the performance, emphasizing the need for realistic simulation environments as part of the training curriculum.

    Design and evaluation of a haptically enable virtual environment for object assembly training


    Depth-Assisted Semantic Segmentation, Image Enhancement and Parametric Modeling

    This dissertation addresses the problem of employing 3D depth information to solve a number of traditionally challenging computer vision/graphics problems. Humans are able to perceive depth information in the 3D world, which enables them to reconstruct layouts, recognize objects and understand the geometric space and semantic meanings of the visual world. It is therefore important to explore how 3D depth information can be utilized by computer vision systems to mimic these human abilities. This dissertation aims at employing 3D depth information to solve vision/graphics problems in the following aspects: scene understanding, image enhancement, and 3D reconstruction and modeling. In addressing the scene understanding problem, we present a framework for semantic segmentation and object recognition on urban video sequences using only dense depth maps recovered from the video. Five view-independent 3D features that vary with object class are extracted from dense depth maps and used for segmenting and recognizing different object classes in street scene images. We demonstrate a scene parsing algorithm that uses only dense 3D depth information and outperforms approaches using sparse 3D or 2D appearance features. In addressing the image enhancement problem, we present a framework to overcome the imperfections of personal photographs of tourist sites using the rich information provided by large-scale internet photo collections (IPCs). By augmenting personal 2D images with 3D information reconstructed from IPCs, we address a number of traditionally challenging image enhancement techniques and achieve high-quality results using simple and robust algorithms. In addressing the 3D reconstruction and modeling problem, we focus on parametric modeling of flower petals, the most distinctive part of a plant. The complex structure, severe occlusions and wide variations make the reconstruction of their 3D models a challenging task. We overcome these challenges by combining data-driven modeling techniques with domain knowledge from botany. Taking a 3D point cloud of an input flower scanned from a single view, each segmented petal is fitted with a scale-invariant morphable petal shape model, which is constructed from individually scanned 3D exemplar petals. Novel constraints based on botany studies are incorporated into the fitting process for realistically reconstructing occluded regions and maintaining correct 3D spatial relations. The main contribution of the dissertation is the intelligent use of 3D depth information to solve traditionally challenging vision/graphics problems. By developing advanced algorithms that run either automatically or with minimal user interaction, this dissertation aims to demonstrate that the 3D depth computed from multiple images contains rich information about the visual world and can therefore be intelligently utilized to recognize/understand the semantic meanings of scenes, efficiently enhance and augment single 2D images, and reconstruct high-quality 3D models.

    Hand-eye calibration, constraints and source synchronisation for robotic-assisted minimally invasive surgery

    In robotic-assisted minimally invasive surgery (RMIS), the robotic system allows surgeons to remotely control articulated instruments to perform surgical interventions and introduces the potential to implement computer-assisted interventions (CAI). However, the information in the camera must be correctly transformed into the robot coordinate frame, as its movement is controlled by the robot kinematics. Therefore, determining the rigid transformation connecting the coordinate frames is necessary. This process is called hand-eye calibration. One of the challenges in solving the hand-eye problem in the RMIS setup is data asynchronicity, which occurs when tracking equipment is integrated into a robotic system and creates temporal misalignment. For the calibration itself, noise in the robot and camera motions can propagate to the calibrated result and, as a result of a limited motion range, the error cannot be fully suppressed. Finally, the calibration procedure must be adaptive and simple so that disruption to the surgical workflow is minimal, since any change in the setup may require another calibration procedure. We propose solutions to deal with the asynchronicity, noise sensitivity, and limited motion range. We also explore the potential of using a surgical instrument as the calibration target to reduce the complexity of the calibration procedure. The proposed algorithms are validated through extensive experiments with synthetic and real data from the da Vinci Research Kit and the KUKA robot arms. The calibration performance is compared with existing hand-eye algorithms and shows promising results. Although calibration using a surgical instrument as the calibration target still requires further development, results indicate that the proposed methods increase the calibration performance and contribute to finding an optimal solution to the hand-eye problem in robotic surgery.

    GRACE: Online Gesture Recognition for Autonomous Camera-Motion Enhancement in Robot-Assisted Surgery

    Camera navigation in minimally invasive surgery has changed significantly since the introduction of robotic assistance. Robotic surgeons are subjected to a cognitive workload increase due to the asynchronous control over tools and camera, which also leads to interruptions in the workflow. Camera motion automation has been addressed as a possible solution, but still lacks situation awareness. We propose an online surgical Gesture Recognition for Autonomous Camera-motion Enhancement (GRACE) system to introduce situation awareness in autonomous camera navigation. A recurrent neural network is used in combination with a tool tracking system to offer gesture-specific camera motion during a robotic-assisted suturing task. GRACE was integrated with a research version of the da Vinci surgical system and a user study (involving 10 participants) was performed to evaluate the benefits introduced by situation awareness in camera motion, both with respect to a state-of-the-art autonomous system (S) and the current clinical approach (P). Results show GRACE improving completion time by a median reduction of 18.9 s (8.1%) with respect to S and 65.1 s (21.1%) with respect to P. Also, workload reduction was confirmed by a statistically significant difference in the NASA Task Load Index with respect to S (p < 0.05). Reduction of motion sickness, a common issue related to continuous camera motion of autonomous systems, was assessed by a post-experiment survey (p < 0.01).