969 research outputs found

    Who's Better? Who's Best? Pairwise Deep Ranking for Skill Determination

    We present a method for assessing skill from video, applicable to a variety of tasks ranging from surgery to drawing and rolling pizza dough. We formulate the problem as pairwise (who's better?) and overall (who's best?) ranking of video collections, using supervised deep ranking. We propose a novel loss function that learns discriminative features when a pair of videos exhibits a difference in skill, and learns shared features when a pair of videos exhibits comparable skill levels. Results demonstrate that our method is applicable across tasks, with the percentage of correctly ordered pairs of videos ranging from 70% to 83% across four datasets. We demonstrate the robustness of our approach via a sensitivity analysis of its parameters. We see this work as a step toward the automated organization of how-to video collections and, more broadly, generic skill determination in video. Comment: CVPR 2018
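
    The loss described above pairs a ranking objective (for pairs with a clear skill gap) with a similarity objective (for pairs of comparable skill). A minimal PyTorch sketch of that idea follows; the margin, the similarity term, and all names are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch of a pairwise skill-ranking loss of the kind described
# above (PyTorch). The margin, the similarity term, and all names are
# illustrative assumptions, not the paper's exact formulation.
import torch
import torch.nn.functional as F

def skill_pair_loss(score_better, score_worse, feat_a, feat_b,
                    comparable, margin=1.0):
    """score_*: scalar skill scores from a shared network, shape (B,).
    feat_a, feat_b: embeddings of the two videos, shape (B, D).
    comparable: bool tensor (B,), True when the pair shows similar skill."""
    # Ranking term: push the higher-skill video's score above the
    # other's by at least `margin` (pairs with a clear skill gap).
    rank = F.relu(margin - (score_better - score_worse))
    # Similarity term: pull features together for comparable-skill pairs.
    sim = F.mse_loss(feat_a, feat_b, reduction="none").mean(dim=-1)
    return torch.where(comparable, sim, rank).mean()
```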

    The Pros and Cons: Rank-aware Temporal Attention for Skill Determination in Long Videos

    We present a new model to determine relative skill from long videos, through learnable temporal attention modules. Skill determination is formulated as a ranking problem, making it suitable for common and generic tasks. However, in long videos, parts of the video are irrelevant for assessing skill, and the skill exhibited may vary throughout a video. We therefore propose a method that assesses the relative overall level of skill in a long video by attending to its skill-relevant parts. Our approach trains temporal attention modules, learned with only video-level supervision, using a novel rank-aware loss function. In addition to attending to task-relevant video parts, our proposed loss jointly trains two attention modules to separately attend to video parts that are indicative of higher (pros) and lower (cons) skill. We evaluate our approach on the EPIC-Skills dataset and additionally annotate a larger dataset of YouTube videos for skill determination with five previously unexplored tasks. Our method outperforms previous approaches and classic softmax attention on both datasets by over 4% pairwise accuracy, and by as much as 12% on individual tasks. We also demonstrate our model's ability to attend to rank-aware parts of the video. Comment: CVPR 2019
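
    To make the attention mechanism concrete, here is an illustrative PyTorch sketch of temporal attention pooling with separate "pros" and "cons" heads; the module sizes, and the rank-aware loss that would train it, are assumptions, not taken from the paper.

```python
# Illustrative sketch of temporal attention pooling with separate
# "pros" and "cons" heads, loosely following the idea above (PyTorch;
# feature dimensions and head design are assumptions).
import torch
import torch.nn as nn

class ProsConsAttention(nn.Module):
    def __init__(self, feat_dim=1024):
        super().__init__()
        # One scoring head per attention module: evidence of higher
        # skill (pros) and of lower skill (cons).
        self.pros = nn.Linear(feat_dim, 1)
        self.cons = nn.Linear(feat_dim, 1)

    def forward(self, segments):          # segments: (T, feat_dim)
        w_pros = torch.softmax(self.pros(segments), dim=0)  # (T, 1)
        w_cons = torch.softmax(self.cons(segments), dim=0)
        # Attention-weighted video-level features, one per module.
        v_pros = (w_pros * segments).sum(dim=0)
        v_cons = (w_cons * segments).sum(dim=0)
        return v_pros, v_cons
```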

    Advances in automated surgery skills evaluation

    Training a surgeon to be skilled and competent in a given surgical procedure is an important step in providing high-quality care and reducing the risk of complications. Traditional surgical training is carried out by expert surgeons who observe and assess trainees directly during a procedure. However, these traditional methods are time-consuming, subjective, and costly, and do not offer an overall criterion for evaluating surgical expertise. One answer to this subjectivity is a sensor-based methodology able to objectively assess a surgeon's skill level. Advances in sensor technologies enable capturing and studying the information produced by complex surgical procedures. If the surgical activities that occur during a procedure are captured using a set of sensors, then skill evaluation can be framed as a motion and time-series analysis problem. This work develops machine learning approaches for automated surgical skill assessment based on hand motion analysis. Specifically, it presents several contributions to objective surgical skill evaluation using multi-dimensional time series: 1) a new distance measure for surgical activities based on the alignment of two multi-dimensional time series; 2) an automated classification framework that identifies a surgeon's proficiency level using wrist-worn sensors; 3) a classification technique that identifies elementary surgical tasks: suturing, needle passing, and knot tying; 4) a new surgeme mean-feature-reduction technique that improves the machine learning algorithms; 5) a framework for surgical gesture classification employing the mean-feature-reduction method; and 6) an unsupervised method to identify the surgemes in a given procedure.
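
    Contribution 1 above is an alignment-based distance for multi-dimensional time series. As a point of reference, a standard multi-dimensional dynamic time warping (DTW) distance is sketched below in NumPy; the thesis proposes a new variant, which this sketch does not reproduce.

```python
# Standard multi-dimensional DTW distance (NumPy sketch), given here
# only as background for the alignment-based measure described above.
import numpy as np

def dtw_distance(x, y):
    """x: (n, d), y: (m, d) multi-dimensional time series."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(x[i - 1] - y[j - 1])  # frame distance
            # Best alignment extends a match, insertion, or deletion.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```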

    Skill Determination from Long Videos

    Computational Modeling Approaches For Task Analysis In Robotic-Assisted Surgery

    Surgery is continuously subject to technological innovation, including the introduction of robotic surgical devices. The ultimate goal is to program the surgical robot to perform certain difficult or complex surgical tasks autonomously. The ability of current robotic surgery systems to record quantitative motion and video data motivates the development of descriptive mathematical models to recognize, classify, and analyze surgical tasks. Recent advances in machine learning for uncovering concealed patterns in large data sets, such as kinematic and video data, offer a way to better understand surgical procedures from a systems point of view. This dissertation focuses on bridging the gap between these two lines of research by developing computational models for task analysis in robotic-assisted surgery. The key step toward advanced study of robotic-assisted surgery and autonomous skill assessment is to develop techniques capable of intelligently recognizing fundamental surgical tasks. Surgical tasks, and at a more granular level surgical gestures, need to be quantified to make them amenable to further study. To this end, we introduce a new framework, DTW-kNN, to recognize and classify three important surgical tasks (suturing, needle passing, and knot tying) based on kinematic data captured using the da Vinci robotic surgery system. Our proposed method needs minimal preprocessing, resulting in a simple, straightforward, and accurate framework that can be applied in any autonomous control system. We also propose an unsupervised gesture segmentation and recognition (UGSR) method that automatically segments and recognizes the temporal sequence of gestures in a robot-assisted minimally invasive surgery (RMIS) task. We extend this model with soft boundary segmentation (Soft-UGSR) to address some of the challenges in surgical motion segmentation; the resulting algorithm can effectively model gradual transitions between surgical activities. Additionally, surgical training is undergoing a paradigm shift, with more emphasis on developing technical skills earlier in training, so metrics for these skills, especially objective metrics, become crucial. Robotic surgery is one field where such techniques can be developed, since all movements are already digitized and therefore readily amenable to analysis. Robotic surgery requires surgeons to complete a much longer and more difficult training process, which creates numerous new challenges for surgical training. Hence, a new method of surgical skill assessment is required to ensure that surgeons have an adequate skill level before being allowed to operate freely on patients. Among many possible approaches, those that provide noninvasive monitoring of the surgeon and can automatically evaluate the surgeon's skill are of particular interest. Therefore, in this dissertation we develop a predictive framework for surgical skill assessment that automatically evaluates a surgeon's performance in RMIS. Our classification framework is based on Global Movement Features (GMFs) extracted from kinematic movement data. The proposed method addresses some of the limitations of previous work and gives more insight into the underlying patterns of surgical skill levels.
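
    As a rough illustration of the DTW-kNN idea, the sketch below labels a kinematic trial by majority vote over its nearest training trials under an alignment distance such as the dtw_distance sketch earlier in this listing; the dissertation's actual pipeline and preprocessing are not reproduced, and all names are illustrative.

```python
# Bare-bones DTW-kNN classification in the spirit of the framework
# named above (illustrative sketch, not the dissertation's pipeline).
# `dist` can be an alignment distance such as the dtw_distance sketch
# shown earlier in this listing.
import numpy as np
from collections import Counter

def knn_classify(query, train_series, train_labels, dist, k=3):
    """Label a kinematic trial by majority vote over its k nearest
    training trials under a DTW-style distance."""
    dists = [dist(query, s) for s in train_series]
    nearest = np.argsort(dists)[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]  # e.g. "suturing"
```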

    Automated Virtual Coach for Surgical Training

    Surgical educators have recommended individualized coaching for the acquisition, retention, and improvement of expertise in technical skills. Such one-on-one coaching is limited to institutions that can afford surgical coaches and is certainly not feasible at national and global scales. We hypothesize that automated methods that model intraoperative video, the surgeon's hand and instrument motion, and sensor data can provide effective and efficient individualized coaching. With the advent of instrumented operating rooms and training laboratories, access to such large-scale intraoperative data has become feasible. Previous methods for automated skill assessment present an overall evaluation at the task/global level without directed feedback or error analysis. Demonstration, if present at all, takes the form of fixed instructional videos, while deliberate practice is completely absent from automated training platforms. We believe that an effective coach should: demonstrate expert behavior (how do I do it correctly), evaluate trainee performance (how did I do) at the task and segment level, critique errors and deficits (where and why was I wrong), recommend deliberate practice (what do I do to improve), and monitor skill progress (when do I become proficient). In this thesis, we present new methods and solutions toward these coaching interventions in different training settings, namely virtual reality simulation, bench-top simulation, and the operating room. First, we outline a summarization-based approach to surgical phase modeling using various sources of intraoperative procedural data, such as system events (from sensors) as well as crowdsourced surgical activity context. We validate a crowdsourced approach to obtaining context summarizations of intraoperative surgical activity. Second, we develop a new scoring method to evaluate task segments using rankings derived from pairwise comparisons of performances obtained via crowdsourcing. We show that reliable and valid crowdsourced pairwise comparisons can be obtained across multiple training task settings. Additionally, we present preliminary results comparing inter-rater agreement for relative versus absolute ratings in crowdsourced assessments of an endoscopic sinus surgery training task data set. Third, we implement a real-time feedback and teaching framework using virtual reality simulation to present teaching cues and deficit metrics targeted at the critical learning elements of a task. We compare the effectiveness of this real-time coach to independent self-driven learning on a needle-passing task in a pilot randomized controlled trial. Finally, we integrate the above components of task progress detection, segment-level evaluation, and real-time feedback into the first end-to-end automated virtual coach for surgical training.
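
    The second contribution above derives segment scores from crowdsourced pairwise comparisons. A toy Python illustration using simple win rates follows; the thesis's actual scoring method is more involved, and the identifiers here are hypothetical.

```python
# Toy illustration of turning crowdsourced pairwise comparisons into
# per-segment scores via win rates (the thesis's actual scoring method
# is more involved; names are illustrative).
from collections import defaultdict

def win_rate_scores(comparisons):
    """comparisons: iterable of (winner_id, loser_id) judgments."""
    wins, total = defaultdict(int), defaultdict(int)
    for winner, loser in comparisons:
        wins[winner] += 1
        total[winner] += 1
        total[loser] += 1
    # Score = fraction of comparisons won; rank segments by this score.
    return {seg: wins[seg] / total[seg] for seg in total}

scores = win_rate_scores([("A", "B"), ("A", "C"), ("B", "C")])
ranking = sorted(scores, key=scores.get, reverse=True)  # ["A", "B", "C"]
```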

    Gesture Recognition in Robotic Surgery: a Review

    OBJECTIVE: Surgical activity recognition is a fundamental step in computer-assisted interventions. This paper reviews the state of the art in methods for automatic recognition of fine-grained gestures in robotic surgery, focusing on recent data-driven approaches, and outlines open questions and future research directions. METHODS: An article search was performed on five bibliographic databases with combinations of the following search terms: robotic, robot-assisted, JIGSAWS, surgery, surgical, gesture, fine-grained, surgeme, action, trajectory, segmentation, recognition, parsing. Selected articles were classified by the level of supervision required for training and divided into groups representing the major frameworks for time-series analysis and data modelling. RESULTS: A total of 52 articles were reviewed. The research field is expanding rapidly, with the majority of articles published in the last 4 years. Deep-learning-based temporal models with discriminative feature extraction and multi-modal data integration have demonstrated promising results on small surgical datasets. Currently, unsupervised methods perform significantly worse than supervised approaches. CONCLUSION: The development of large and diverse open-source datasets of annotated demonstrations is essential for the development and validation of robust solutions for surgical gesture recognition. While new strategies for discriminative feature extraction and knowledge transfer, or unsupervised and semi-supervised approaches, can mitigate the need for data and labels, they have not yet been demonstrated to achieve comparable performance. Important future research directions include the detection and forecasting of gesture-specific errors and anomalies. SIGNIFICANCE: This paper is a comprehensive and structured analysis of surgical gesture recognition methods, aiming to summarize the status of this rapidly evolving field.
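
    As a loose illustration of the deep temporal models this review highlights, the sketch below applies a small 1-D temporal convolutional network to a kinematic sequence to produce per-frame gesture logits; the architecture, channel sizes, input dimension, and gesture count are assumptions, not drawn from any specific reviewed method.

```python
# Loose sketch of a temporal convolutional gesture recognizer over
# kinematic data (PyTorch). Channel sizes, the input dimension, and
# the number of gesture classes are illustrative assumptions.
import torch
import torch.nn as nn

class TemporalGestureNet(nn.Module):
    def __init__(self, in_dim=76, n_gestures=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_dim, 64, kernel_size=9, padding=4),
            nn.ReLU(),
            nn.Conv1d(64, 64, kernel_size=9, padding=4),
            nn.ReLU(),
            nn.Conv1d(64, n_gestures, kernel_size=1),  # per-frame logits
        )

    def forward(self, kinematics):       # (batch, in_dim, T)
        return self.net(kinematics)      # (batch, n_gestures, T)

# Usage: logits = TemporalGestureNet()(torch.randn(1, 76, 300))
```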