daVinciNet: Joint Prediction of Motion and Surgical State in Robot-Assisted Surgery
This paper presents a technique to concurrently and jointly predict the
future trajectories of surgical instruments and the future state(s) of surgical
subtasks in robot-assisted surgeries (RAS) using multiple input sources. Such
predictions are a necessary first step towards shared control and supervised
autonomy of surgical subtasks. Minute-long surgical subtasks, such as suturing
or ultrasound scanning, often have distinguishable tool kinematics and visual
features, and can be described as a series of fine-grained states with
transition schematics. We propose daVinciNet - an end-to-end dual-task model
for robot motion and surgical state predictions. daVinciNet performs concurrent
end-effector trajectory and surgical state predictions using features extracted
from multiple data streams, including robot kinematics, endoscopic vision, and
system events. We evaluate our proposed model on an extended Robotic
Intra-Operative Ultrasound (RIOUS+) imaging dataset collected on a da Vinci Xi
surgical system and the JHU-ISI Gesture and Skill Assessment Working Set
(JIGSAWS). Our model achieves up to 93.85% short-term (0.5s) and 82.11%
long-term (2s) state prediction accuracy, as well as 1.07mm short-term and
5.62mm long-term trajectory prediction error. Comment: Accepted to IROS 202
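The dual-task idea described above can be sketched as a shared encoding of the fused kinematics, vision, and event streams feeding two output heads, one regressing the end-effector trajectory and one classifying the surgical state. This is only an illustrative forward pass, not the paper's architecture: all dimensions, weights, and the single dense encoder are invented for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative feature dimensions (not the paper's): T frames, three streams.
T, d_kin, d_vis, d_evt, d_h, n_states = 10, 7, 16, 4, 32, 5
kin = rng.normal(size=(T, d_kin))   # robot-kinematics features
vis = rng.normal(size=(T, d_vis))   # endoscopic-vision features
evt = rng.normal(size=(T, d_evt))   # system-event features

x = np.concatenate([kin, vis, evt], axis=1)  # fused input per frame
W_enc = rng.normal(size=(x.shape[1], d_h))
h = np.tanh(x @ W_enc)                       # shared encoding for both tasks

W_traj = rng.normal(size=(d_h, 3))           # trajectory head (xyz)
W_state = rng.normal(size=(d_h, n_states))   # state-classification head
traj_pred = h @ W_traj                       # predicted end-effector positions
state_pred = (h @ W_state).argmax(axis=1)    # per-frame state estimate
print(traj_pred.shape, state_pred.shape)     # (10, 3) (10,)
```

The point of the sketch is only the structure: one encoder, two heads trained (in the real model) on the two losses jointly.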
Temporal Segmentation of Surgical Sub-tasks through Deep Learning with Multiple Data Sources
Many tasks in robot-assisted surgeries (RAS) can be represented by finite-state machines (FSMs), where each state represents either an action (such as picking up a needle) or an observation (such as bleeding). A crucial step towards the automation of such surgical tasks is the temporal perception of the current surgical scene, which requires a real-time estimation of the states in the FSMs. The objective of this work is to estimate the current state of the surgical task based on the actions performed or events that occur as the task progresses. We propose Fusion-KVE, a unified surgical state estimation model that incorporates multiple data sources including the Kinematics, Vision, and system Events. Additionally, we examine the strengths and weaknesses of different state estimation models in segmenting states with different representative features or levels of granularity. We evaluate our model on the JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS), as well as a more complex dataset involving robotic intra-operative ultrasound (RIOUS) imaging, created using the da Vinci® Xi surgical system. Our model achieves a superior frame-wise state estimation accuracy of up to 89.4%, improving on state-of-the-art surgical state estimation models on both the JIGSAWS suturing dataset and our RIOUS dataset
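The finite-state-machine view described above can be sketched concretely. The state names and transition table below are hypothetical, and the simple "snap to an allowed transition" rule is only a minimal stand-in for a learned frame-wise estimator:

```python
# Toy FSM for a suturing-like subtask; states and transitions are illustrative.
TRANSITIONS = {
    "reach":  {"reach", "grasp"},
    "grasp":  {"grasp", "insert"},
    "insert": {"insert", "pull"},
    "pull":   {"pull", "reach"},
}

def constrain(raw):
    """Snap raw frame-wise predictions to transitions the FSM allows."""
    out, cur = [], raw[0]
    for lab in raw:
        cur = lab if lab in TRANSITIONS[cur] else cur  # reject invalid jumps
        out.append(cur)
    return out

# An invalid reach->insert jump is suppressed until grasp is observed.
print(constrain(["reach", "insert", "grasp", "grasp", "insert"]))
# ['reach', 'reach', 'grasp', 'grasp', 'insert']
```

This captures why transition structure helps: a per-frame classifier can emit physically impossible state sequences, while the FSM restricts estimates to valid progressions.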
Robot Planning with Constrained Markov Decision Processes
Robotic technologies have advanced significantly, improving the capabilities of robots. Such robots operate in complicated environments and are exposed to multiple sources of uncertainty. These uncertainties cause robot actions to be non-deterministic. Robot planning in non-deterministic environments is a challenging problem that has been extensively discussed in the literature. In this dissertation, we tackle this class of problems and are particularly interested in finding an optimal solution while the robot faces several constraints. To do so, we leverage Constrained Markov Decision Processes (CMDPs), which extend Markov Decision Processes (MDPs) by supporting multiple costs and constraints. Despite all their capabilities, CMDPs are not very popular in robot planning. One of our goals in this work is to show that CMDPs can also be used in robot planning. In the first part of this dissertation, we focus on optimizing CMDPs to solve large problems in a timely manner. We propose a hierarchical approach that significantly reduces the computational time of solving a CMDP instance while preserving the existence of a valid solution. In other words, the Hierarchical CMDP (HCMDP) is guaranteed to find a valid solution for a specific problem if the non-hierarchical CMDP is able to find one. Although the experimental evaluation shows that the HCMDP and the non-hierarchical CMDP generate comparable results, we do not provide any guarantees in terms of optimality. In the second part, we aim for more complicated constraints represented as tasks. Tasks are usually specified by Linear Temporal Logic (LTL) properties and determine a desired temporal sequence of states to be visited by the robot. For instance, an autonomous forklift may be tasked to go to a pick-up station, load an object, drive toward a delivery point, and drop the object off. As this example shows, the order of states is critical.
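A CMDP of the kind described above is commonly solved as a linear program over occupancy measures: maximize expected discounted reward subject to the flow-balance equations and a budget on expected discounted cost. The toy problem below is invented for illustration (two states, a "safe" and a "risky" action, a cost budget of 1.0), and SciPy's `linprog` stands in for whatever solver the dissertation uses:

```python
import numpy as np
from scipy.optimize import linprog

# Toy CMDP (all numbers illustrative): the risky action reaches the
# rewarding absorbing state s1 quickly but incurs a unit cost in s0.
nS, nA, gamma = 2, 2, 0.9
P = np.zeros((nS, nA, nS))               # P[s, a, s'] transition probabilities
P[0, 0] = [1.0, 0.0]                     # safe: stay in s0
P[0, 1] = [0.2, 0.8]                     # risky: likely move to s1
P[1, 0] = [0.0, 1.0]                     # s1 is absorbing
P[1, 1] = [0.0, 1.0]
r = np.array([[0.0, 0.0], [1.0, 1.0]])   # reward for being in s1
c = np.array([[0.0, 1.0], [0.0, 0.0]])   # risky action in s0 costs 1
d = 1.0                                  # discounted-cost budget
mu0 = np.array([1.0, 0.0])               # start in s0

# Occupancy-measure LP:
#   max_x  sum_{s,a} r(s,a) x(s,a)
#   s.t.   sum_a x(s',a) - gamma * sum_{s,a} P(s'|s,a) x(s,a) = mu0(s')
#          sum_{s,a} c(s,a) x(s,a) <= d,   x >= 0
A_eq = np.zeros((nS, nS * nA))
for sp in range(nS):
    for s in range(nS):
        for a in range(nA):
            A_eq[sp, s * nA + a] = (sp == s) - gamma * P[s, a, sp]
res = linprog(-r.flatten(), A_ub=[c.flatten()], b_ub=[d],
              A_eq=A_eq, b_eq=mu0, bounds=(0, None))
x = res.x.reshape(nS, nA)
policy = x / x.sum(axis=1, keepdims=True)  # state-wise action probabilities
print("optimal discounted reward:", -res.fun)
```

Because the budget binds, the optimal policy is randomized in s0, mixing the safe and risky actions. This is the characteristic feature of CMDPs that plain MDP value iteration cannot express.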
Thus, we propose a planner that finds a plan satisfying multiple tasks with given probabilities while respecting various constraints on its cost functions. The proposed solver utilizes the theory of LTL properties to define tasks, and the theory of CMDPs to find an optimal solution. We also present a special form of product operation between LTL properties and CMDPs that is repeatable. This repeatability lets us apply the product operation several times to take all of the tasks into account. The proposed approach is extensively tested in MATLAB, in robot simulation, and on a real robot. This solver runs the product operation many times, which results in an increasing number of states. Therefore, it is crucial to reduce the number of states in order to have a faster solver. In the third part of this thesis, we aim to optimize the solver from part two. We propose two improvements. The first improvement considers the order of product operations. Although the product operation is commutative and the order of operations does not influence the final result, it does affect the computational time. Thus, we present an algorithm to find the best order of operations. The second improvement runs a pruning algorithm that reduces the number of states by removing states that play little or no role in the final product. As opposed to the first improvement, it may change the final solution. However, we analyze the different cases that may appear and show the effects
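The product operation underlying such planners can be illustrated with the forklift example: an automaton tracks task progress ("nothing done" → "picked up" → "delivered"), and the product state pairs a world state with an automaton state. The sketch below uses a deterministic toy model with invented names, not the dissertation's actual construction:

```python
# Toy product: world transitions x automaton for "visit pickup, then dropoff".
def dfa_step(q, labels):
    """Automaton states: 0 = nothing done, 1 = picked up, 2 = delivered."""
    if q == 0 and "pickup" in labels:
        return 1
    if q == 1 and "dropoff" in labels:
        return 2
    return q

def product(transitions, label_of):
    """transitions: dict (s, a) -> s' (deterministic for brevity)."""
    prod = {}
    for (s, a), s2 in transitions.items():
        for q in range(3):
            prod[((s, q), a)] = (s2, dfa_step(q, label_of.get(s2, set())))
    return prod

world = {("A", "go_pick"): "P", ("P", "go_drop"): "D", ("D", "go_pick"): "P"}
labels = {"P": {"pickup"}, "D": {"dropoff"}}
prod = product(world, labels)

# Executing go_pick then go_drop from (A, 0) reaches accepting state 2;
# the reverse order would not, since dropoff before pickup is ignored.
s = ("A", 0)
for a in ["go_pick", "go_drop"]:
    s = prod[(s, a)]
print(s)  # ('D', 2)
```

Repeating this construction once per task is what makes the product "repeatable" in the abstract's sense, and it is also why the state count grows multiplicatively, motivating the ordering and pruning improvements of part three.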