29 research outputs found

    A recurrent convolutional neural network approach for sensorless force estimation in robotic surgery

    Providing force feedback as relevant information in current Robot-Assisted Minimally Invasive Surgery systems constitutes a technological challenge due to the constraints imposed by the surgical environment. In this context, force estimation techniques represent a potential solution, enabling the sensing of interaction forces between surgical instruments and soft tissues. Specifically, if visual feedback is available for observing soft-tissue deformation, it can be used to estimate the forces applied to these tissues. To this end, a force estimation model based on Convolutional Neural Networks and Long Short-Term Memory networks is proposed in this work. The model is designed to process both the spatiotemporal information present in video sequences and the temporal structure of tool data (the surgical tool-tip trajectory and its grasping status). A series of analyses is carried out to reveal the advantages of the proposal and the challenges that remain for real applications. This work focuses on two surgical task scenarios, referred to as pushing and pulling tissue. For these two scenarios, different input data modalities and their effect on force estimation quality are investigated: tool data alone, video sequences alone, and a combination of both. The results suggest that force estimation quality is best when both the tool data and the video sequences are processed by the neural network model. Moreover, this study reveals the need for a loss function designed to promote the modeling of both the smooth and the sharp details found in force signals. Finally, the results show that modeling the forces involved in pulling tasks is more challenging than for the simpler pushing actions.
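
    Concretely, a minimal sketch of this kind of CNN + LSTM estimator, assuming PyTorch, might look like the following; the layer sizes, the tool-data dimension, and the smooth/sharp loss weighting are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class CnnLstmForceEstimator(nn.Module):
    def __init__(self, tool_dim=4, hidden=128):
        super().__init__()
        # Per-frame CNN encoder for the video stream.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # The LSTM fuses visual features with tool data at every time step.
        self.lstm = nn.LSTM(32 + tool_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 3)  # 3-axis interaction force

    def forward(self, frames, tool):
        # frames: (B, T, 3, H, W); tool: (B, T, tool_dim)
        B, T = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(B, T, -1)
        out, _ = self.lstm(torch.cat([feats, tool], dim=-1))
        return self.head(out)  # (B, T, 3) force sequence

def smooth_sharp_loss(pred, target, alpha=0.5):
    # Hypothetical loss in the spirit of the abstract's conclusion: MSE for
    # the smooth trend plus a first-difference L1 term rewarding sharp details.
    mse = nn.functional.mse_loss(pred, target)
    sharp = nn.functional.l1_loss(pred.diff(dim=1), target.diff(dim=1))
    return mse + alpha * sharp
```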

    DaFoEs: Mixing Datasets Towards the Generalization of Vision-State Deep-Learning Force Estimation in Minimally Invasive Robotic Surgery

    Precisely determining the contact force during safe interaction in Minimally Invasive Robotic Surgery (MIRS) is still an open research challenge. Inspired by post-operative qualitative analysis of surgical videos, cross-modality data-driven deep neural network models have become one of the newest approaches to predicting sensorless force trends. However, these methods require large and variable datasets, which are not currently available. In this paper, we present a new vision-haptic dataset (DaFoEs) with variable soft environments for training deep neural models. To reduce the bias from a single dataset, we present a pipeline to generalize different vision and state data inputs for mixed-dataset training, using a previously validated dataset with a different setup. Finally, we present a variable encoder-decoder architecture to predict the forces exerted by the laparoscopic tool from a single input or a sequence of inputs. For input sequences, we use a recurrent decoder, denoted by the prefix R, and a new temporal sampling to represent the acceleration of the tool. During training, we demonstrate that single-dataset training tends to overfit to the training data domain and has difficulty transferring its results to new domains. Dataset mixing, in contrast, transfers well, with a mean relative estimated force error of 5% and 12% for the recurrent and non-recurrent models, respectively. Our method also marginally increases the effectiveness of transformers for force estimation, up to a maximum of ≃15%, as the volume of available data is increased by 150%. In conclusion, we demonstrate that mixing experimental setups for vision-state force estimation in MIRS is a possible approach towards a general solution of the problem.
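
    A minimal sketch of such a vision-state encoder paired with an "R"-style recurrent decoder, assuming PyTorch; the module names, layer sizes, and state dimension are illustrative guesses, not the DaFoEs release.

```python
import torch
import torch.nn as nn

class VisionStateEncoder(nn.Module):
    def __init__(self, state_dim=6, latent=64):
        super().__init__()
        self.vision = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.fuse = nn.Linear(32 + state_dim, latent)

    def forward(self, img, state):
        # img: (B, 3, H, W); state: (B, state_dim) -> latent: (B, latent)
        return torch.relu(self.fuse(torch.cat([self.vision(img), state], dim=-1)))

class RForceDecoder(nn.Module):
    # Recurrent decoder (the "R" prefix): maps a latent sequence to one force.
    def __init__(self, latent=64, hidden=64):
        super().__init__()
        self.gru = nn.GRU(latent, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 3)

    def forward(self, latents):
        # latents: (B, T, latent), e.g. frames encoded at t, t-k, t-2k so the
        # temporal sampling spacing k carries velocity/acceleration information.
        _, h = self.gru(latents)
        return self.out(h[-1])  # (B, 3)
```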

    Toward Force Estimation in Robot-Assisted Surgery using Deep Learning with Vision and Robot State

    Knowledge of interaction forces during teleoperated robot-assisted surgery could be used to enable force feedback to human operators and to evaluate tissue-handling skill. However, direct force sensing at the end-effector is challenging because it requires biocompatible, sterilizable, and cost-effective sensors. Vision-based deep learning using convolutional neural networks is a promising approach for providing useful force estimates, though questions remain about generalization to new scenarios and about real-time inference. We present a force estimation neural network that uses RGB images and robot state as inputs. Using a self-collected dataset, we compared the network to variants that included only a single input type and evaluated how they generalized to new viewpoints, workspace positions, materials, and tools. We found that vision-based networks were sensitive to shifts in viewpoint, while state-only networks were robust to changes in workspace. The network with both state and vision inputs had the highest accuracy for an unseen tool and was moderately robust to changes in viewpoint. Through feature-removal studies, we found that using only position features produced better accuracy than using only force features as input. The network with both state and vision inputs outperformed a physics-based baseline model in accuracy. It showed comparable accuracy but faster computation times than a baseline recurrent neural network, making it better suited for real-time applications.
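
    The late-fusion design and the feature-removal study could be sketched as follows (a hypothetical PyTorch outline; the state layout and layer sizes are assumptions, not the paper's network):

```python
import torch
import torch.nn as nn

# Assumed robot-state layout for the ablation: [x, y, z, fx, fy, fz].
POSITION_IDX = slice(0, 3)
FORCE_IDX = slice(3, 6)

class VisionStateNet(nn.Module):
    def __init__(self, state_dim=3, hidden=64):
        super().__init__()
        self.backbone = nn.Sequential(  # stand-in for the RGB feature extractor
            nn.Conv2d(3, 16, 7, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.mlp = nn.Sequential(
            nn.Linear(16 + state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),
        )

    def forward(self, img, state):
        # Late fusion: image features and robot state are concatenated.
        return self.mlp(torch.cat([self.backbone(img), state], dim=-1))

# Feature-removal study: feed only position features, which the abstract
# reports works better than feeding only force features.
# model = VisionStateNet(state_dim=3)
# pred = model(img, full_state[:, POSITION_IDX])
```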

    Supervised Deep Learning with Finite Element Synthetic Data for Force Estimation in Robotic-assisted Surgery

    The prevalence of robot-assisted minimally invasive surgery on the liver has increased exponentially. Having accurate, real-time knowledge of force during robot-assisted surgical procedures is vital for safe surgery. Many techniques have been proposed in the literature to tackle this concern, from deploying force sensors to physics-based modeling of the robot and, more recently, learning-based force prediction. For high-fidelity force measurement, sensors should be integrated at the instrument's tip, close to the surgical site, which raises sterilization, biocompatibility, and MRI-compatibility concerns. Dynamic robot modeling, on the other hand, may be precise in a specific setting, but it suffers from a lack of generalization when encountering unseen settings. Considering the drawbacks of these methods, indirect force estimation from deflection measured through imaging techniques, generally performed via machine learning methods, is investigated as an alternative solution. Almost all previous studies use either supervised learning, where data are labeled with ex-vivo ground truth, or unsupervised or semi-supervised learning, whose outcomes are promising but not yet adequate. This study investigated indirect force prediction for the human liver through a deep autoencoder model, a supervised deep learning method trained on synthetic data generated by finite element (FE) simulation. This method took advantage of various patient-specific liver parameters and geometries extracted from CT images. Hyperelastic modeling of the soft tissue is considered and assessed with various hyperelastic models. The uncertainty due to the surgical tool's occlusion is addressed in this model, and a novel state vector is proposed to improve the accuracy and generalizability of the prediction. In addition, the impact of the bounded region on the model's accuracy was evaluated. It was shown that the proposed method can predict the external force on an unseen tissue with different geometry and mechanical properties. The accuracy of force prediction considering tool-occlusion noise diminishes by 4.2 percent, which is in an acceptable range. The accuracy of the presented model for various scenarios ranges from 95 to 88 percent. The model's results have been evaluated by predicting the force from the surface deformation of an unseen liver geometry and constitutive model, where the mean absolute error of prediction is 0.249 N.
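
    A rough sketch of such a supervised pipeline, assuming PyTorch: an encoder plus regressor trained on synthetic FE samples, with tool occlusion simulated by masking part of the deflection field. All names, dimensions, and the masking scheme are illustrative assumptions, not the paper's model.

```python
import torch
import torch.nn as nn

class DeflectionForceNet(nn.Module):
    def __init__(self, n_nodes=512, state_dim=8, latent=32):
        super().__init__()
        # Encoder compresses the observed surface-deflection field.
        self.enc = nn.Sequential(
            nn.Linear(n_nodes, 128), nn.ReLU(), nn.Linear(128, latent))
        # Head regresses the external force from the latent code plus a
        # state vector (e.g. tool pose and tissue descriptors).
        self.dec = nn.Sequential(
            nn.Linear(latent + state_dim, 64), nn.ReLU(), nn.Linear(64, 3))

    def forward(self, deflection, state):
        return self.dec(torch.cat([self.enc(deflection), state], dim=-1))

def occlude(deflection, frac=0.1):
    # Simulate the tool hiding a contiguous band of surface measurements,
    # mimicking the occlusion noise the abstract evaluates. Training pairs
    # would come from FE simulation: model(occlude(u), s) vs. FE force label.
    d = deflection.clone()
    n = d.shape[-1]
    width = int(frac * n)
    start = torch.randint(0, n - width, (1,)).item()
    d[..., start:start + width] = 0.0
    return d
```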

    3 Dimensional Dense Reconstruction: A Review of Algorithms and Dataset

    3D dense reconstruction refers to the process of obtaining the complete shape and texture features of 3D objects from 2D planar images. 3D reconstruction is an important and extensively studied problem, but it is far from solved. This work systematically introduces classical methods of 3D dense reconstruction based on geometric and optical models, as well as methods based on deep learning. It also introduces the datasets used for deep learning, along with the performance, advantages, and disadvantages that deep learning methods demonstrate on these datasets.
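
    As a worked example of one classical geometric building block covered by such reviews, the following self-contained NumPy function performs standard linear (DLT) triangulation of a 3D point from two calibrated views; it is illustrative and not taken from the paper.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    # P1, P2: 3x4 camera projection matrices; x1, x2: matching pixel coords.
    # Stack the four linear constraints x ~ P X into A X = 0, take the
    # null-space direction via SVD, then dehomogenize to get the 3D point.
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]
```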

    Deep Causal Learning for Robotic Intelligence

    This invited review discusses causal learning in the context of robotic intelligence. The paper first introduces the psychological findings on causal learning in human cognition and the traditional statistical solutions for causal discovery and causal inference. It then reviews recent deep causal learning algorithms, with a focus on their architectures and the benefits of using deep nets, and discusses the gap between deep causal learning and the needs of robotic intelligence.
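
    To make the statistical side concrete, here is a toy NumPy illustration of backdoor adjustment, one of the traditional causal inference tools the review contrasts with deep methods; the data-generating numbers are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.integers(0, 2, 10_000)                      # confounder
t = rng.random(10_000) < (0.2 + 0.6 * z)            # treatment depends on z
y = 1.0 * t + 2.0 * z + rng.normal(0, 0.1, 10_000)  # true effect of t is 1.0

# Naive contrast is biased because z drives both treatment and outcome.
naive = y[t].mean() - y[~t].mean()

# Backdoor adjustment: stratify on z, then average over its distribution.
ate = sum((y[t & (z == v)].mean() - y[~t & (z == v)].mean()) * (z == v).mean()
          for v in (0, 1))

print(f"naive={naive:.2f}  adjusted={ate:.2f}")     # adjusted is close to 1.0
```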