    Recurrent Neural Networks for Representing, Segmenting, and Classifying Surgical Activities

    Robot-assisted surgery has enabled scalable, transparent capture of high-quality data during surgery, and this has in turn led to many new research opportunities. Among these are opportunities to improve the objectivity and efficiency of surgical training, which include making performance assessment and feedback more objective and consistent; providing more specific, localized assessment and feedback; delegating this responsibility to machines, which have the potential to provide feedback in any desired abundance; and having machines go even further, for example by optimizing practice routines in the form of a virtual coach. In this thesis, we focus on a foundation that serves all of these objectives: automated surgical activity recognition, or in other words the ability to automatically determine which activities a surgeon is performing and when those activities are taking place. First, we introduce the use of recurrent neural networks (RNNs) for localizing and classifying surgical activities from motion data. Here, we show for the first time that this task is possible at the level of maneuvers, which, unlike the activities considered in prior work, are already part of surgical training curricula. Second, we study the ability of RNNs to learn dependencies over extremely long time periods, which we posit are present in surgical motion data, and we introduce MIST RNNs, a new RNN architecture that is capable of capturing these extremely long-term dependencies. Third, we investigate unsupervised learning from surgical motion data: we show that predicting future motion from past motion with RNNs, using motion data alone, leads to meaningful and useful representations of surgical motion. This approach enables the discovery of surgical activities from unannotated data, and it achieves state-of-the-art performance for querying a database of surgical activity with motion-based queries. Finally, we depart from a common yet limiting assumption in nearly all prior work on surgical activity recognition: that annotated training data, which is difficult and expensive to acquire, is available in abundance. We demonstrate for the first time that both gesture recognition and maneuver recognition are feasible even when very few annotated sequences are available, and that future-prediction-based representation learning, applied prior to the recognition phase, yields significant performance improvements when annotated data is scarce.
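
    The recognition problem described above is, at its core, per-frame sequence labeling: each time step of a kinematic signal is mapped to an activity label. The following is a minimal sketch of such a setup in PyTorch; the feature dimension, number of activity classes, and single-layer LSTM are illustrative assumptions, not details taken from the thesis.

    import torch
    import torch.nn as nn

    class ActivityRecognizer(nn.Module):
        """Map each frame of motion data to a distribution over activities."""
        def __init__(self, n_features=38, n_classes=10, hidden=64):
            super().__init__()
            self.rnn = nn.LSTM(n_features, hidden, batch_first=True)
            self.head = nn.Linear(hidden, n_classes)

        def forward(self, x):          # x: (batch, time, n_features)
            h, _ = self.rnn(x)         # h: (batch, time, hidden)
            return self.head(h)        # per-frame class logits

    model = ActivityRecognizer()
    x = torch.randn(2, 500, 38)                # two synthetic kinematic sequences
    labels = torch.randint(0, 10, (2, 500))    # frame-level activity annotations
    logits = model(x)                          # (batch, time, n_classes)
    loss = nn.CrossEntropyLoss()(logits.reshape(-1, 10), labels.reshape(-1))
    loss.backward()

    One natural way to realize the unsupervised variant discussed in the abstract is to replace the classification head with a regression onto the next frames of motion, so the same recurrent backbone can be pretrained without any annotations.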

    Automated Virtual Coach for Surgical Training

    Surgical educators have recommended individualized coaching for the acquisition, retention, and improvement of expertise in technical skills. Such one-on-one coaching is limited to institutions that can afford surgical coaches and is certainly not feasible at national and global scales. We hypothesize that automated methods that model intraoperative video, the surgeon's hand and instrument motion, and sensor data can provide effective and efficient individualized coaching. With the advent of instrumented operating rooms and training laboratories, access to such large-scale intraoperative data has become feasible. Previous methods for automated skill assessment present surgeons with an overall evaluation at the task/global level, without directed feedback or error analysis. Demonstration, if present at all, takes the form of fixed instructional videos, while deliberate practice is completely absent from automated training platforms. We believe that an effective coach should: demonstrate expert behavior (how do I do it correctly), evaluate trainee performance (how did I do) at the task and segment levels, critique errors and deficits (where and why was I wrong), recommend deliberate practice (what do I do to improve), and monitor skill progress (when do I become proficient). In this thesis, we present new methods and solutions toward these coaching interventions in different training settings, namely virtual reality simulation, bench-top simulation, and the operating room. First, we outline a summarization-based approach for surgical phase modeling using various sources of intraoperative procedural data, such as system events (sensors) and crowdsourced surgical activity context. We validate a crowdsourced approach to obtaining context summaries of intraoperative surgical activity. Second, we develop a new scoring method that evaluates task segments using rankings derived from pairwise comparisons of performances obtained via crowdsourcing. We show that reliable and valid crowdsourced pairwise comparisons can be obtained across multiple training task settings. Additionally, we present preliminary results comparing inter-rater agreement for relative ratings and absolute ratings on a crowdsourced assessment data set from an endoscopic sinus surgery training task. Third, we implement a real-time feedback and teaching framework using virtual reality simulation to present teaching cues and deficit metrics that target the critical learning elements of a task. We compare the effectiveness of this real-time coach to independent, self-driven learning on a needle-passing task in a pilot randomized controlled trial. Finally, we integrate the above components of task progress detection, segment-level evaluation, and real-time feedback into the first end-to-end automated virtual coach for surgical training.
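
    To make the second contribution concrete: a standard way to turn pairwise comparisons into per-performance scores is the Bradley-Terry model, fit with the classic minorization-maximization updates. The sketch below is illustrative; the thesis derives rankings from crowdsourced pairwise comparisons, but its exact scoring model is not reproduced here.

    import numpy as np

    def bradley_terry(n_items, comparisons, iters=100):
        """Estimate skill scores from (winner, loser) index pairs."""
        wins = np.zeros(n_items)
        counts = np.zeros((n_items, n_items))   # n_ij: times i was compared with j
        for w, l in comparisons:
            wins[w] += 1
            counts[w, l] += 1
            counts[l, w] += 1
        p = np.ones(n_items)                    # initial scores
        for _ in range(iters):
            # MM update: p_i <- W_i / sum_j n_ij / (p_i + p_j)
            denom = (counts / (p[:, None] + p[None, :])).sum(axis=1)
            p = wins / denom
            p /= p.sum()                        # fix the arbitrary scale
        return p

    # Example: four performances of a task segment, six crowdsourced duels.
    duels = [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3), (3, 1)]
    scores = bradley_terry(4, duels)
    ranking = np.argsort(-scores)               # best-to-worst ordering

    Only relative judgments enter this model, which is why crowdsourced pairwise comparisons, rather than absolute ratings, suffice as input.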