904 research outputs found

    Gesture Recognition in Robotic Surgery: a Review

    Get PDF
    OBJECTIVE: Surgical activity recognition is a fundamental step in computer-assisted interventions. This paper reviews the state-of-the-art in methods for automatic recognition of fine-grained gestures in robotic surgery focusing on recent data-driven approaches and outlines the open questions and future research directions. METHODS: An article search was performed on 5 bibliographic databases with combinations of the following search terms: robotic, robot-assisted, JIGSAWS, surgery, surgical, gesture, fine-grained, surgeme, action, trajectory, segmentation, recognition, parsing. Selected articles were classified based on the level of supervision required for training and divided into different groups representing major frameworks for time series analysis and data modelling. RESULTS: A total of 52 articles were reviewed. The research field is showing rapid expansion, with the majority of articles published in the last 4 years. Deep-learning-based temporal models with discriminative feature extraction and multi-modal data integration have demonstrated promising results on small surgical datasets. Currently, unsupervised methods perform significantly less well than the supervised approaches. CONCLUSION: The development of large and diverse open-source datasets of annotated demonstrations is essential for development and validation of robust solutions for surgical gesture recognition. While new strategies for discriminative feature extraction and knowledge transfer, or unsupervised and semi-supervised approaches, can mitigate the need for data and labels, they have not yet been demonstrated to achieve comparable performance. Important future research directions include detection and forecast of gesture-specific errors and anomalies. SIGNIFICANCE: This paper is a comprehensive and structured analysis of surgical gesture recognition methods aiming to summarize the status of this rapidly evolving field

    ADVANCED MOTION MODELS FOR RIGID AND DEFORMABLE REGISTRATION IN IMAGE-GUIDED INTERVENTIONS

    Get PDF
    Image-guided surgery (IGS) has been a major area of interest in recent decades that continues to transform surgical interventions and enable safer, less invasive procedures. In the preoperative contexts, diagnostic imaging, including computed tomography (CT) and magnetic resonance (MR) imaging, offers a basis for surgical planning (e.g., definition of target, adjacent anatomy, and the surgical path or trajectory to the target). At the intraoperative stage, such preoperative images and the associated planning information are registered to intraoperative coordinates via a navigation system to enable visualization of (tracked) instrumentation relative to preoperative images. A major limitation to such an approach is that motions during surgery, either rigid motions of bones manipulated during orthopaedic surgery or brain soft-tissue deformation in neurosurgery, are not captured, diminishing the accuracy of navigation systems. This dissertation seeks to use intraoperative images (e.g., x-ray fluoroscopy and cone-beam CT) to provide more up-to-date anatomical context that properly reflects the state of the patient during interventions to improve the performance of IGS. Advanced motion models for inter-modality image registration are developed to improve the accuracy of both preoperative planning and intraoperative guidance for applications in orthopaedic pelvic trauma surgery and minimally invasive intracranial neurosurgery. Image registration algorithms are developed with increasing complexity of motion that can be accommodated (single-body rigid, multi-body rigid, and deformable) and increasing complexity of registration models (statistical models, physics-based models, and deep learning-based models). For orthopaedic pelvic trauma surgery, the dissertation includes work encompassing: (i) a series of statistical models to model shape and pose variations of one or more pelvic bones and an atlas of trajectory annotations; (ii) frameworks for automatic segmentation via registration of the statistical models to preoperative CT and planning of fixation trajectories and dislocation / fracture reduction; and (iii) 3D-2D guidance using intraoperative fluoroscopy. For intracranial neurosurgery, the dissertation includes three inter-modality deformable registrations using physic-based Demons and deep learning models for CT-guided and CBCT-guided procedures

    Computational Modeling Approaches For Task Analysis In Robotic-Assisted Surgery

    Get PDF
    Surgery is continuously subject to technological innovations including the introduction of robotic surgical devices. The ultimate goal is to program the surgical robot to perform certain difficult or complex surgical tasks in an autonomous manner. The feasibility of current robotic surgery systems to record quantitative motion and video data motivates developing descriptive mathematical models to recognize, classify and analyze surgical tasks. Recent advances in machine learning research for uncovering concealed patterns in huge data sets, like kinematic and video data, offer a possibility to better understand surgical procedures from a system point of view. This dissertation focuses on bridging the gap between these two lines of the research by developing computational models for task analysis in robotic-assisted surgery. The key step for advance study in robotic-assisted surgery and autonomous skill assessment is to develop techniques that are capable of recognizing fundamental surgical tasks intelligently. Surgical tasks and at a more granular level, surgical gestures, need to be quantified to make them amenable for further study. To answer to this query, we introduce a new framework, namely DTW-kNN, to recognize and classify three important surgical tasks including suturing, needle passing and knot tying based on kinematic data captured using da Vinci robotic surgery system. Our proposed method needs minimum preprocessing that results in simple, straightforward and accurate framework which can be applied for any autonomous control system. We also propose an unsupervised gesture segmentation and recognition (UGSR) method which has the ability to automatically segment and recognize temporal sequence of gestures in RMIS task. We also extent our model by applying soft boundary segmentation (Soft-UGSR) to address some of the challenges that exist in the surgical motion segmentation. The proposed algorithm can effectively model gradual transitions between surgical activities. Additionally, surgical training is undergoing a paradigm shift with more emphasis on the development of technical skills earlier in training. Thus metrics for the skills, especially objective metrics, become crucial. One field of surgery where such techniques can be developed is robotic surgery, as here all movements are already digitalized and therefore easily susceptible to analysis. Robotic surgery requires surgeons to perform a much longer and difficult training process which create numerous new challenges for surgical training. Hence, a new method of surgical skill assessment is required to ensure that surgeons have adequate skill level to be allowed to operate freely on patients. Among many possible approaches, those that provide noninvasive monitoring of expert surgeon and have the ability to automatically evaluate surgeon\u27s skill are of increased interest. Therefore, in this dissertation we develop a predictive framework for surgical skill assessment to automatically evaluate performance of surgeon in RMIS. Our classification framework is based on the Global Movement Features (GMFs) which extracted from kinematic movement data. The proposed method addresses some of the limitations in previous work and gives more insight about underlying patterns of surgical skill levels

    Deep learning in medical image registration: introduction and survey

    Full text link
    Image registration (IR) is a process that deforms images to align them with respect to a reference space, making it easier for medical practitioners to examine various medical images in a standardized reference frame, such as having the same rotation and scale. This document introduces image registration using a simple numeric example. It provides a definition of image registration along with a space-oriented symbolic representation. This review covers various aspects of image transformations, including affine, deformable, invertible, and bidirectional transformations, as well as medical image registration algorithms such as Voxelmorph, Demons, SyN, Iterative Closest Point, and SynthMorph. It also explores atlas-based registration and multistage image registration techniques, including coarse-fine and pyramid approaches. Furthermore, this survey paper discusses medical image registration taxonomies, datasets, evaluation measures, such as correlation-based metrics, segmentation-based metrics, processing time, and model size. It also explores applications in image-guided surgery, motion tracking, and tumor diagnosis. Finally, the document addresses future research directions, including the further development of transformers

    Crepuscular Rays for Tumor Accessibility Planning

    Get PDF

    Gesture Recognition in Robotic Surgery with Multimodal Attention

    Get PDF
    Automatically recognising surgical gestures from surgical data is an important building block of automated activity recognition and analytics, technical skill assessment, intra-operative assistance and eventually robotic automation. The complexity of articulated instrument trajectories and the inherent variability due to surgical style and patient anatomy make analysis and fine-grained segmentation of surgical motion patterns from robot kinematics alone very difficult. Surgical video provides crucial information from the surgical site with context for the kinematic data and the interaction between the instruments and tissue. Yet sensor fusion between the robot data and surgical video stream is non-trivial because the data have different frequency, dimensions and discriminative capability. In this paper, we integrate multimodal attention mechanisms in a two-stream temporal convolutional network to compute relevance scores and weight kinematic and visual feature representations dynamically in time, aiming to aid multimodal network training and achieve effective sensor fusion. We report the results of our system on the JIGSAWS benchmark dataset and on a new in vivo dataset of suturing segments from robotic prostatectomy procedures. Our results are promising and obtain multimodal prediction sequences with higher accuracy and better temporal structure than corresponding unimodal solutions. Visualization of attention scores also gives physically interpretable insights on network understanding of strengths and weaknesses of each sensor

    Toward Image-Guided Automated Suture Grasping Under Complex Environments: A Learning-Enabled and Optimization-Based Holistic Framework

    Get PDF
    To realize a higher-level autonomy of surgical knot tying in minimally invasive surgery (MIS), automated suture grasping, which bridges the suture stitching and looping procedures, is an important yet challenging task needs to be achieved. This paper presents a holistic framework with image-guided and automation techniques to robotize this operation even under complex environments. The whole task is initialized by suture segmentation, in which we propose a novel semi-supervised learning architecture featured with a suture-aware loss to pertinently learn its slender information using both annotated and unannotated data. With successful segmentation in stereo-camera, we develop a Sampling-based Sliding Pairing (SSP) algorithm to online optimize the suture's 3D shape. By jointly studying the robotic configuration and the suture's spatial characteristics, a target function is introduced to find the optimal grasping pose of the surgical tool with Remote Center of Motion (RCM) constraints. To compensate for inherent errors and practical uncertainties, a unified grasping strategy with a novel vision-based mechanism is introduced to autonomously accomplish this grasping task. Our framework is extensively evaluated from learning-based segmentation, 3D reconstruction, and image-guided grasping on the da Vinci Research Kit (dVRK) platform, where we achieve high performances and successful rates in perceptions and robotic manipulations. These results prove the feasibility of our approach in automating the suture grasping task, and this work fills the gap between automated surgical stitching and looping, stepping towards a higher-level of task autonomy in surgical knot tying
    • …
    corecore