    Recurrent Neural Networks for Representing, Segmenting, and Classifying Surgical Activities

    Robot-assisted surgery has enabled scalable, transparent capture of high-quality data during surgery, and this has in turn led to many new research opportunities. Among these are opportunities to improve the objectivity and efficiency of surgical training, which include making performance assessment and feedback more objective and consistent; providing more specific, localized assessment and feedback; delegating this responsibility to machines, which have the potential to provide feedback in any desired abundance; and having machines go even further, for example by optimizing practice routines in the form of a virtual coach. In this thesis, we focus on a foundation that serves all of these objectives: automated surgical activity recognition, or in other words the ability to automatically determine which activities a surgeon is performing and when those activities are taking place. First, we introduce the use of recurrent neural networks (RNNs) for localizing and classifying surgical activities from motion data. Here, we show for the first time that this task is possible at the level of maneuvers, which, unlike the activities considered in prior work, are already part of surgical training curricula. Second, we study the ability of RNNs to learn dependencies over extremely long time periods, which we posit are present in surgical motion data, and we introduce MIST RNNs, a new RNN architecture that is capable of capturing these extremely long-term dependencies. Third, we investigate unsupervised learning from surgical motion data: we show that predicting future motion from past motion with RNNs, using motion data alone, leads to meaningful and useful representations of surgical motion. This approach enables the discovery of surgical activities from unannotated data, and it achieves state-of-the-art performance for querying a database of surgical activity with motion-based queries. Finally, we depart from a common yet limiting assumption in nearly all prior work on surgical activity recognition: that annotated training data, which is difficult and expensive to acquire, is available in abundance. We demonstrate for the first time that both gesture recognition and maneuver recognition are feasible even when very few annotated sequences are available, and that future-prediction-based representation learning, applied prior to the recognition phase, yields significant performance improvements when annotated data is scarce.
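
    The recognition problem described above is, at its core, per-frame sequence labeling: each time step of a kinematic signal is mapped to an activity label. The following is a minimal sketch of such a setup in PyTorch; the feature dimension, number of activity classes, and single-layer LSTM are illustrative assumptions, not details taken from the thesis.

    import torch
    import torch.nn as nn

    class ActivityRecognizer(nn.Module):
        """Map each frame of motion data to a distribution over activities."""
        def __init__(self, n_features=38, n_classes=10, hidden=64):
            super().__init__()
            self.rnn = nn.LSTM(n_features, hidden, batch_first=True)
            self.head = nn.Linear(hidden, n_classes)

        def forward(self, x):          # x: (batch, time, n_features)
            h, _ = self.rnn(x)         # h: (batch, time, hidden)
            return self.head(h)        # per-frame class logits

    model = ActivityRecognizer()
    x = torch.randn(2, 500, 38)                # two synthetic kinematic sequences
    labels = torch.randint(0, 10, (2, 500))    # frame-level activity annotations
    logits = model(x)                          # (batch, time, n_classes)
    loss = nn.CrossEntropyLoss()(logits.reshape(-1, 10), labels.reshape(-1))
    loss.backward()

    One natural way to realize the unsupervised variant discussed in the abstract is to replace the classification head with a regression onto the next frames of motion, so the same recurrent backbone can be pretrained without any annotations.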

    Automated Virtual Coach for Surgical Training

    Surgical educators have recommended individualized coaching for the acquisition, retention, and improvement of expertise in technical skills. Such one-on-one coaching is limited to institutions that can afford surgical coaches and is certainly not feasible at national and global scales. We hypothesize that automated methods that model intraoperative video, the surgeon's hand and instrument motion, and sensor data can provide effective and efficient individualized coaching. With the advent of instrumented operating rooms and training laboratories, access to such large-scale intraoperative data has become feasible. Previous methods for automated skill assessment present surgeons with an overall evaluation at the task/global level, without directed feedback or error analysis. Demonstration, if present at all, takes the form of fixed instructional videos, while deliberate practice is completely absent from automated training platforms. We believe that an effective coach should: demonstrate expert behavior (how do I do it correctly), evaluate trainee performance (how did I do) at the task and segment levels, critique errors and deficits (where and why was I wrong), recommend deliberate practice (what do I do to improve), and monitor skill progress (when do I become proficient). In this thesis, we present new methods and solutions toward these coaching interventions in different training settings, namely virtual reality simulation, bench-top simulation, and the operating room. First, we outline a summarization-based approach for surgical phase modeling using various sources of intraoperative procedural data, such as system events (sensors) and crowdsourced surgical activity context. We validate a crowdsourced approach to obtaining context summaries of intraoperative surgical activity. Second, we develop a new scoring method that evaluates task segments using rankings derived from pairwise comparisons of performances obtained via crowdsourcing. We show that reliable and valid crowdsourced pairwise comparisons can be obtained across multiple training task settings. Additionally, we present preliminary results comparing inter-rater agreement for relative ratings and absolute ratings on a crowdsourced assessment data set from an endoscopic sinus surgery training task. Third, we implement a real-time feedback and teaching framework using virtual reality simulation to present teaching cues and deficit metrics that target the critical learning elements of a task. We compare the effectiveness of this real-time coach to independent, self-driven learning on a needle-passing task in a pilot randomized controlled trial. Finally, we integrate the above components of task progress detection, segment-level evaluation, and real-time feedback into the first end-to-end automated virtual coach for surgical training.
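
    To make the second contribution concrete: a standard way to turn pairwise comparisons into per-performance scores is the Bradley-Terry model, fit with the classic minorization-maximization updates. The sketch below is illustrative; the thesis derives rankings from crowdsourced pairwise comparisons, but its exact scoring model is not reproduced here.

    import numpy as np

    def bradley_terry(n_items, comparisons, iters=100):
        """Estimate skill scores from (winner, loser) index pairs."""
        wins = np.zeros(n_items)
        counts = np.zeros((n_items, n_items))   # n_ij: times i was compared with j
        for w, l in comparisons:
            wins[w] += 1
            counts[w, l] += 1
            counts[l, w] += 1
        p = np.ones(n_items)                    # initial scores
        for _ in range(iters):
            # MM update: p_i <- W_i / sum_j n_ij / (p_i + p_j)
            denom = (counts / (p[:, None] + p[None, :])).sum(axis=1)
            p = wins / denom
            p /= p.sum()                        # fix the arbitrary scale
        return p

    # Example: four performances of a task segment, six crowdsourced duels.
    duels = [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3), (3, 1)]
    scores = bradley_terry(4, duels)
    ranking = np.argsort(-scores)               # best-to-worst ordering

    Only relative judgments enter this model, which is why crowdsourced pairwise comparisons, rather than absolute ratings, suffice as input.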