449 research outputs found

    Evaluating surgical skills from kinematic data using convolutional neural networks

    The need for automatic surgical skills assessment is increasing, especially because manual feedback from senior surgeons observing junior surgeons is subjective and time-consuming. Automating surgical skills evaluation is therefore an important step towards improving surgical practice. In this paper, we designed a Convolutional Neural Network (CNN) to evaluate surgeon skill by extracting patterns from the surgeon's motions in robotic surgery. The proposed method is validated on the JIGSAWS dataset and achieved very competitive results, with 100% accuracy on the suturing and needle-passing tasks. While we leveraged the CNN's efficiency, we also mitigated its black-box effect using a class activation map. This feature allows our method to automatically highlight which parts of the surgical task influenced the skill prediction, and can be used to explain the classification and to provide personalized feedback to the trainee.
    Comment: Accepted at MICCAI 201
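The class activation map idea described above can be sketched for a 1D CNN over kinematic time series: weight the last convolutional layer's feature maps by the classifier weights of the predicted class. The shapes, random values, and function names below are illustrative stand-ins, not the paper's implementation.

```python
import numpy as np

# Assume a 1D CNN whose last conv layer outputs C feature maps of length T,
# followed by global average pooling and a linear classifier (hypothetical shapes).
rng = np.random.default_rng(0)
C, T, n_classes = 8, 100, 2                 # channels, time steps, skill levels
feature_maps = rng.random((C, T))           # last conv layer activations
class_weights = rng.random((n_classes, C))  # linear classifier weights

def class_activation_map(features, weights, cls):
    """Class-weighted sum of feature maps: highlights which time steps
    of the surgical trial drove the skill prediction."""
    cam = weights[cls] @ features            # (T,)
    cam -= cam.min()
    return cam / (cam.max() + 1e-12)         # normalize to [0, 1]

cam = class_activation_map(feature_maps, class_weights, cls=1)
print(cam.shape)  # (100,)
```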

    Estimating position & velocity in 3D space from monocular video sequences using a deep neural network

    This work describes a regression model based on Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks for tracking objects in monocular video sequences. The target application being pursued is Vision-Based Sensor Substitution (VBSS). In particular, the tool-tip position and velocity in 3D space of a pair of surgical robotic instruments (SRI) are estimated for three surgical tasks, namely suturing, needle-passing and knot-tying. The CNN extracts features from individual video frames and the LSTM network processes these features over time, continuously outputting a 12-dimensional vector with the estimated position and velocity values. A series of analyses and experiments are carried out on the regression model to reveal the benefits and drawbacks of different design choices. First, the impact of the loss function is investigated by adequately weighting the Root Mean Squared Error (RMSE) and Gradient Difference Loss (GDL), using the VGG16 neural network for feature extraction. Second, this analysis is extended to a Residual Neural Network designed for feature extraction, which has fewer parameters than the VGG16 model, resulting in a ~96.44% reduction in neural network size. Third, the impact of the number of time steps used to model the temporal information processed by the LSTM network is investigated. Finally, the capability of the regression model to generalize to data from "unseen" surgical tasks (unavailable in the training set) is evaluated. The aforesaid analyses are experimentally validated on the public JIGSAWS dataset. These analyses provide guidelines for the design of a regression model in the context of VBSS, specifically when the objective is to estimate a set of 1D time series signals from video sequences.
    Peer reviewed. Postprint (author's final draft).
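The weighted combination of RMSE and GDL can be sketched as follows. The `alpha` weighting, the finite-difference form of GDL, and the array shapes are assumptions for illustration; the paper's exact formulation may differ.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root Mean Squared Error over all time steps and output dimensions."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def gradient_difference_loss(y_true, y_pred):
    """Penalize mismatch between temporal gradients (finite differences
    along the time axis), encouraging smooth, well-shaped trajectories."""
    return np.mean(np.abs(np.diff(y_true, axis=0) - np.diff(y_pred, axis=0)))

def combined_loss(y_true, y_pred, alpha=0.5):
    """Weigh RMSE against GDL; `alpha` is an illustrative hyperparameter.
    Inputs: (T, 12) arrays of position/velocity estimates over T frames."""
    return alpha * rmse(y_true, y_pred) + (1 - alpha) * gradient_difference_loss(y_true, y_pred)
```

A constant offset between prediction and target incurs pure RMSE (the GDL term vanishes because the temporal gradients match), which is why the GDL term specifically targets trajectory shape rather than absolute position.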

    Who's Better? Who's Best? Pairwise Deep Ranking for Skill Determination

    We present a method for assessing skill from video, applicable to a variety of tasks, ranging from surgery to drawing and rolling pizza dough. We formulate the problem as pairwise (who's better?) and overall (who's best?) ranking of video collections, using supervised deep ranking. We propose a novel loss function that learns discriminative features when a pair of videos exhibits a difference in skill, and learns shared features when a pair of videos exhibits comparable skill levels. Results demonstrate our method is applicable across tasks, with the percentage of correctly ordered pairs of videos ranging from 70% to 83% for four datasets. We demonstrate the robustness of our approach via sensitivity analysis of its parameters. We see this work as an effort toward the automated organization of how-to video collections and, more generally, skill determination in video.
    Comment: CVPR 201
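The intuition behind the loss above can be sketched on scalar skill scores: a margin ranking term when one video is known to be more skilled, and a similarity term when the pair is comparable. This is a simplified stand-in for the paper's deep ranking loss, with an illustrative margin.

```python
def pairwise_skill_loss(score_a, score_b, comparable, margin=1.0):
    """score_a / score_b: scalar skill scores for two videos,
    where video a is labeled at least as skilled as video b.
    comparable: True when the pair shows similar skill."""
    if comparable:
        # pull the two scores together (shared features)
        return (score_a - score_b) ** 2
    # rank a above b by at least `margin` (discriminative features)
    return max(0.0, margin - (score_a - score_b))
```

With this shape, correctly ordered pairs separated by more than the margin contribute zero loss, so training focuses on ambiguous or misordered pairs.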

    Deep learning for time series classification

    Time series analysis is a field of data science concerned with analyzing sequences of numerical values ordered in time. Time series are particularly interesting because they allow us to visualize and understand the evolution of a process over time. Their analysis can reveal trends, relationships and similarities across the data. Numerous fields contain data in the form of time series: health care (electrocardiograms, blood sugar, etc.), activity recognition, remote sensing, finance (stock market prices), industry (sensors), etc. Time series classification consists of constructing algorithms dedicated to automatically labeling time series data. The sequential nature of time series requires algorithms able to harness this temporal property, making existing off-the-shelf machine learning models for traditional tabular data suboptimal for the underlying task. In this context, deep learning has emerged in recent years as one of the most effective methods for tackling the supervised classification task, particularly in the field of computer vision. The main objective of this thesis was to study and develop deep neural networks specifically constructed for the classification of time series data. We thus carried out the first large-scale experimental study comparing the existing deep methods and positioning them relative to other state-of-the-art methods not based on deep learning. Subsequently, we made numerous contributions in this area, notably in the context of transfer learning, data augmentation, ensembling and adversarial attacks. Finally, we also proposed a novel architecture, based on the well-known Inception network (Google), which ranks among the most efficient to date.
    Comment: PhD thesis
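The core Inception idea the thesis builds on, parallel convolutions with different receptive fields applied to the same time series and concatenated, can be sketched with random kernels standing in for learned filters. This is a toy illustration of the multi-scale principle, not the proposed architecture; kernel sizes are assumptions.

```python
import numpy as np

def conv1d_same(x, kernel):
    """1D convolution with 'same' padding, so output length equals input length."""
    return np.convolve(x, kernel, mode="same")

def inception_module_1d(x, kernel_sizes=(9, 19, 39)):
    """Apply parallel convolutions with different receptive fields to one
    univariate series and stack the results. In a trained network the
    kernels are learned; random kernels stand in here."""
    rng = np.random.default_rng(0)
    branches = [conv1d_same(x, rng.standard_normal(k) / k) for k in kernel_sizes]
    return np.stack(branches)  # (n_branches, len(x))

out = inception_module_1d(np.sin(np.linspace(0.0, 10.0, 128)))
```

The different kernel lengths let one module detect both short motifs and long trends in a single pass, which is what makes the multi-scale design effective for time series.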

    Technical skill assessment in minimally invasive surgery using artificial intelligence: a systematic review.

    BACKGROUND Technical skill assessment in surgery relies on expert opinion. Therefore, it is time-consuming, costly, and often lacks objectivity. Analysis of intraoperative data by artificial intelligence (AI) has the potential for automated technical skill assessment. The aim of this systematic review was to analyze the performance, external validity, and generalizability of AI models for technical skill assessment in minimally invasive surgery. METHODS A systematic search of Medline, Embase, Web of Science, and IEEE Xplore was performed to identify original articles reporting the use of AI in the assessment of technical skill in minimally invasive surgery. Risk of bias (RoB) and quality of the included studies were analyzed according to the Quality Assessment of Diagnostic Accuracy Studies criteria and the modified Joanna Briggs Institute checklists, respectively. Findings were reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses statement. RESULTS In total, 1958 articles were identified, of which 50 met the eligibility criteria and were analyzed. Motion data extracted from surgical videos (n = 25) or kinematic data from robotic systems or sensors (n = 22) were the most frequent input data for AI. Most studies used deep learning (n = 34) and predicted technical skills using an ordinal assessment scale (n = 36) with good accuracy in simulated settings. However, all proposed models were at the development stage; only 4 studies were externally validated and 8 showed a low RoB. CONCLUSION AI showed good performance in technical skill assessment in minimally invasive surgery. However, models often lacked external validity and generalizability. Therefore, models should be benchmarked using predefined performance metrics and tested in clinical implementation studies.

    ์ž„์ƒ์ˆ ๊ธฐ ํ–ฅ์ƒ์„ ์œ„ํ•œ ๋”ฅ๋Ÿฌ๋‹ ๊ธฐ๋ฒ• ์—ฐ๊ตฌ: ๋Œ€์žฅ๋‚ด์‹œ๊ฒฝ ์ง„๋‹จ ๋ฐ ๋กœ๋ด‡์ˆ˜์ˆ  ์ˆ ๊ธฐ ํ‰๊ฐ€์— ์ ์šฉ

    ํ•™์œ„๋…ผ๋ฌธ (๋ฐ•์‚ฌ) -- ์„œ์šธ๋Œ€ํ•™๊ต ๋Œ€ํ•™์› : ๊ณต๊ณผ๋Œ€ํ•™ ํ˜‘๋™๊ณผ์ • ์˜์šฉ์ƒ์ฒด๊ณตํ•™์ „๊ณต, 2020. 8. ๊น€ํฌ์ฐฌ.This paper presents deep learning-based methods for improving performance of clinicians. Novel methods were applied to the following two clinical cases and the results were evaluated. In the first study, a deep learning-based polyp classification algorithm for improving clinical performance of endoscopist during colonoscopy diagnosis was developed. Colonoscopy is the main method for diagnosing adenomatous polyp, which can multiply into a colorectal cancer and hyperplastic polyps. The classification algorithm was developed using convolutional neural network (CNN), trained with colorectal polyp images taken by a narrow-band imaging colonoscopy. The proposed method is built around an automatic machine learning (AutoML) which searches for the optimal architecture of CNN for colorectal polyp image classification and trains the weights of the architecture. In addition, gradient-weighted class activation mapping technique was used to overlay the probabilistic basis of the prediction result on the polyp location to aid the endoscopists visually. To verify the improvement in diagnostic performance, the efficacy of endoscopists with varying proficiency levels were compared with or without the aid of the proposed polyp classification algorithm. The results confirmed that, on average, diagnostic accuracy was improved and diagnosis time was shortened in all proficiency groups significantly. In the second study, a surgical instruments tracking algorithm for robotic surgery video was developed, and a model for quantitatively evaluating the surgeons surgical skill based on the acquired motion information of the surgical instruments was proposed. The movement of surgical instruments is the main component of evaluation for surgical skill. 
Therefore, the focus of this study was develop an automatic surgical instruments tracking algorithm, and to overcome the limitations presented by previous methods. The instance segmentation framework was developed to solve the instrument occlusion issue, and a tracking framework composed of a tracker and a re-identification algorithm was developed to maintain the type of surgical instruments being tracked in the video. In addition, algorithms for detecting the tip position of instruments and arm-indicator were developed to acquire the movement of devices specialized for the robotic surgery video. The performance of the proposed method was evaluated by measuring the difference between the predicted tip position and the ground truth position of the instruments using root mean square error, area under the curve, and Pearsons correlation analysis. Furthermore, motion metrics were calculated from the movement of surgical instruments, and a machine learning-based robotic surgical skill evaluation model was developed based on these metrics. These models were used to evaluate clinicians, and results were similar in the developed evaluation models, the Objective Structured Assessment of Technical Skill (OSATS), and the Global Evaluative Assessment of Robotic Surgery (GEARS) evaluation methods. In this study, deep learning technology was applied to colorectal polyp images for a polyp classification, and to robotic surgery videos for surgical instruments tracking. The improvement in clinical performance with the aid of these methods were evaluated and verified.๋ณธ ๋…ผ๋ฌธ์€ ์˜๋ฃŒ์ง„์˜ ์ž„์ƒ์ˆ ๊ธฐ ๋Šฅ๋ ฅ์„ ํ–ฅ์ƒ์‹œํ‚ค๊ธฐ ์œ„ํ•˜์—ฌ ์ƒˆ๋กœ์šด ๋”ฅ๋Ÿฌ๋‹ ๊ธฐ๋ฒ•๋“ค์„ ์ œ์•ˆํ•˜๊ณ  ๋‹ค์Œ ๋‘ ๊ฐ€์ง€ ์‹ค๋ก€์— ๋Œ€ํ•ด ์ ์šฉํ•˜์—ฌ ๊ทธ ๊ฒฐ๊ณผ๋ฅผ ํ‰๊ฐ€ํ•˜์˜€๋‹ค. 
์ฒซ ๋ฒˆ์งธ ์—ฐ๊ตฌ์—์„œ๋Š” ๋Œ€์žฅ๋‚ด์‹œ๊ฒฝ์œผ๋กœ ๊ด‘ํ•™ ์ง„๋‹จ ์‹œ, ๋‚ด์‹œ๊ฒฝ ์ „๋ฌธ์˜์˜ ์ง„๋‹จ ๋Šฅ๋ ฅ์„ ํ–ฅ์ƒ์‹œํ‚ค๊ธฐ ์œ„ํ•˜์—ฌ ๋”ฅ๋Ÿฌ๋‹ ๊ธฐ๋ฐ˜์˜ ์šฉ์ข… ๋ถ„๋ฅ˜ ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ๊ฐœ๋ฐœํ•˜๊ณ , ๋‚ด์‹œ๊ฒฝ ์ „๋ฌธ์˜์˜ ์ง„๋‹จ ๋Šฅ๋ ฅ ํ–ฅ์ƒ ์—ฌ๋ถ€๋ฅผ ๊ฒ€์ฆํ•˜๊ณ ์ž ํ•˜์˜€๋‹ค. ๋Œ€์žฅ๋‚ด์‹œ๊ฒฝ ๊ฒ€์‚ฌ๋กœ ์•”์ข…์œผ๋กœ ์ฆ์‹ํ•  ์ˆ˜ ์žˆ๋Š” ์„ ์ข…๊ณผ ๊ณผ์ฆ์‹์„ฑ ์šฉ์ข…์„ ์ง„๋‹จํ•˜๋Š” ๊ฒƒ์€ ์ค‘์š”ํ•˜๋‹ค. ๋ณธ ์—ฐ๊ตฌ์—์„œ๋Š” ํ˜‘๋Œ€์—ญ ์˜์ƒ ๋‚ด์‹œ๊ฒฝ์œผ๋กœ ์ดฌ์˜ํ•œ ๋Œ€์žฅ ์šฉ์ข… ์˜์ƒ์œผ๋กœ ํ•ฉ์„ฑ๊ณฑ ์‹ ๊ฒฝ๋ง์„ ํ•™์Šตํ•˜์—ฌ ๋ถ„๋ฅ˜ ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ๊ฐœ๋ฐœํ•˜์˜€๋‹ค. ์ œ์•ˆํ•˜๋Š” ์•Œ๊ณ ๋ฆฌ์ฆ˜์€ ์ž๋™ ๊ธฐ๊ณ„ํ•™์Šต (AutoML) ๋ฐฉ๋ฒ•์œผ๋กœ, ๋Œ€์žฅ ์šฉ์ข… ์˜์ƒ์— ์ตœ์ ํ™”๋œ ํ•ฉ์„ฑ๊ณฑ ์‹ ๊ฒฝ๋ง ๊ตฌ์กฐ๋ฅผ ์ฐพ๊ณ  ์‹ ๊ฒฝ๋ง์˜ ๊ฐ€์ค‘์น˜๋ฅผ ํ•™์Šตํ•˜์˜€๋‹ค. ๋˜ํ•œ ๊ธฐ์šธ๊ธฐ-๊ฐ€์ค‘์น˜ ํด๋ž˜์Šค ํ™œ์„ฑํ™” ๋งตํ•‘ ๊ธฐ๋ฒ•์„ ์ด์šฉํ•˜์—ฌ ๊ฐœ๋ฐœํ•œ ํ•ฉ์„ฑ๊ณฑ ์‹ ๊ฒฝ๋ง ๊ฒฐ๊ณผ์˜ ํ™•๋ฅ ์  ๊ทผ๊ฑฐ๋ฅผ ์šฉ์ข… ์œ„์น˜์— ์‹œ๊ฐ์ ์œผ๋กœ ๋‚˜ํƒ€๋‚˜๋„๋ก ํ•จ์œผ๋กœ ๋‚ด์‹œ๊ฒฝ ์ „๋ฌธ์˜์˜ ์ง„๋‹จ์„ ๋•๋„๋ก ํ•˜์˜€๋‹ค. ๋งˆ์ง€๋ง‰์œผ๋กœ, ์ˆ™๋ จ๋„ ๊ทธ๋ฃน๋ณ„๋กœ ๋‚ด์‹œ๊ฒฝ ์ „๋ฌธ์˜๊ฐ€ ์šฉ์ข… ๋ถ„๋ฅ˜ ์•Œ๊ณ ๋ฆฌ์ฆ˜์˜ ๊ฒฐ๊ณผ๋ฅผ ์ฐธ๊ณ ํ•˜์˜€์„ ๋•Œ ์ง„๋‹จ ๋Šฅ๋ ฅ์ด ํ–ฅ์ƒ๋˜์—ˆ๋Š”์ง€ ๋น„๊ต ์‹คํ—˜์„ ์ง„ํ–‰ํ•˜์˜€๊ณ , ๋ชจ๋“  ๊ทธ๋ฃน์—์„œ ์œ ์˜๋ฏธํ•˜๊ฒŒ ์ง„๋‹จ ์ •ํ™•๋„๊ฐ€ ํ–ฅ์ƒ๋˜๊ณ  ์ง„๋‹จ ์‹œ๊ฐ„์ด ๋‹จ์ถ•๋˜์—ˆ์Œ์„ ํ™•์ธํ•˜์˜€๋‹ค. ๋‘ ๋ฒˆ์งธ ์—ฐ๊ตฌ์—์„œ๋Š” ๋กœ๋ด‡์ˆ˜์ˆ  ๋™์˜์ƒ์—์„œ ์ˆ˜์ˆ ๋„๊ตฌ ์œ„์น˜ ์ถ”์  ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ๊ฐœ๋ฐœํ•˜๊ณ , ํš๋“ํ•œ ์ˆ˜์ˆ ๋„๊ตฌ์˜ ์›€์ง์ž„ ์ •๋ณด๋ฅผ ๋ฐ”ํƒ•์œผ๋กœ ์ˆ˜์ˆ ์ž์˜ ์ˆ™๋ จ๋„๋ฅผ ์ •๋Ÿ‰์ ์œผ๋กœ ํ‰๊ฐ€ํ•˜๋Š” ๋ชจ๋ธ์„ ์ œ์•ˆํ•˜์˜€๋‹ค. ์ˆ˜์ˆ ๋„๊ตฌ์˜ ์›€์ง์ž„์€ ์ˆ˜์ˆ ์ž์˜ ๋กœ๋ด‡์ˆ˜์ˆ  ์ˆ™๋ จ๋„๋ฅผ ํ‰๊ฐ€ํ•˜๊ธฐ ์œ„ํ•œ ์ฃผ์š”ํ•œ ์ •๋ณด์ด๋‹ค. ๋”ฐ๋ผ์„œ ๋ณธ ์—ฐ๊ตฌ๋Š” ๋”ฅ๋Ÿฌ๋‹ ๊ธฐ๋ฐ˜์˜ ์ž๋™ ์ˆ˜์ˆ ๋„๊ตฌ ์ถ”์  ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ๊ฐœ๋ฐœํ•˜์˜€์œผ๋ฉฐ, ๋‹ค์Œ ๋‘๊ฐ€์ง€ ์„ ํ–‰์—ฐ๊ตฌ์˜ ํ•œ๊ณ„์ ์„ ๊ทน๋ณตํ•˜์˜€๋‹ค. 
์ธ์Šคํ„ด์Šค ๋ถ„ํ•  (Instance Segmentation) ํ”„๋ ˆ์ž„์›์„ ๊ฐœ๋ฐœํ•˜์—ฌ ํ์ƒ‰ (Occlusion) ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•˜์˜€๊ณ , ์ถ”์ ๊ธฐ (Tracker)์™€ ์žฌ์‹๋ณ„ํ™” (Re-Identification) ์•Œ๊ณ ๋ฆฌ์ฆ˜์œผ๋กœ ๊ตฌ์„ฑ๋œ ์ถ”์  ํ”„๋ ˆ์ž„์›์„ ๊ฐœ๋ฐœํ•˜์—ฌ ๋™์˜์ƒ์—์„œ ์ถ”์ ํ•˜๋Š” ์ˆ˜์ˆ ๋„๊ตฌ์˜ ์ข…๋ฅ˜๊ฐ€ ์œ ์ง€๋˜๋„๋ก ํ•˜์˜€๋‹ค. ๋˜ํ•œ ๋กœ๋ด‡์ˆ˜์ˆ  ๋™์˜์ƒ์˜ ํŠน์ˆ˜์„ฑ์„ ๊ณ ๋ คํ•˜์—ฌ ์ˆ˜์ˆ ๋„๊ตฌ์˜ ์›€์ง์ž„์„ ํš๋“ํ•˜๊ธฐ์œ„ํ•ด ์ˆ˜์ˆ ๋„๊ตฌ ๋ ์œ„์น˜์™€ ๋กœ๋ด‡ ํŒ”-์ธ๋””์ผ€์ดํ„ฐ (Arm-Indicator) ์ธ์‹ ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ๊ฐœ๋ฐœํ•˜์˜€๋‹ค. ์ œ์•ˆํ•˜๋Š” ์•Œ๊ณ ๋ฆฌ์ฆ˜์˜ ์„ฑ๋Šฅ์€ ์˜ˆ์ธกํ•œ ์ˆ˜์ˆ ๋„๊ตฌ ๋ ์œ„์น˜์™€ ์ •๋‹ต ์œ„์น˜ ๊ฐ„์˜ ํ‰๊ท  ์ œ๊ณฑ๊ทผ ์˜ค์ฐจ, ๊ณก์„  ์•„๋ž˜ ๋ฉด์ , ํ”ผ์–ด์Šจ ์ƒ๊ด€๋ถ„์„์œผ๋กœ ํ‰๊ฐ€ํ•˜์˜€๋‹ค. ๋งˆ์ง€๋ง‰์œผ๋กœ, ์ˆ˜์ˆ ๋„๊ตฌ์˜ ์›€์ง์ž„์œผ๋กœ๋ถ€ํ„ฐ ์›€์ง์ž„ ์ง€ํ‘œ๋ฅผ ๊ณ„์‚ฐํ•˜๊ณ  ์ด๋ฅผ ๋ฐ”ํƒ•์œผ๋กœ ๊ธฐ๊ณ„ํ•™์Šต ๊ธฐ๋ฐ˜์˜ ๋กœ๋ด‡์ˆ˜์ˆ  ์ˆ™๋ จ๋„ ํ‰๊ฐ€ ๋ชจ๋ธ์„ ๊ฐœ๋ฐœํ•˜์˜€๋‹ค. ๊ฐœ๋ฐœํ•œ ํ‰๊ฐ€ ๋ชจ๋ธ์€ ๊ธฐ์กด์˜ Objective Structured Assessment of Technical Skill (OSATS), Global Evaluative Assessment of Robotic Surgery (GEARS) ํ‰๊ฐ€ ๋ฐฉ๋ฒ•๊ณผ ์œ ์‚ฌํ•œ ์„ฑ๋Šฅ์„ ๋ณด์ž„์„ ํ™•์ธํ•˜์˜€๋‹ค. 
๋ณธ ๋…ผ๋ฌธ์€ ์˜๋ฃŒ์ง„์˜ ์ž„์ƒ์ˆ ๊ธฐ ๋Šฅ๋ ฅ์„ ํ–ฅ์ƒ์‹œํ‚ค๊ธฐ ์œ„ํ•˜์—ฌ ๋Œ€์žฅ ์šฉ์ข… ์˜์ƒ๊ณผ ๋กœ๋ด‡์ˆ˜์ˆ  ๋™์˜์ƒ์— ๋”ฅ๋Ÿฌ๋‹ ๊ธฐ์ˆ ์„ ์ ์šฉํ•˜๊ณ  ๊ทธ ์œ ํšจ์„ฑ์„ ํ™•์ธํ•˜์˜€์œผ๋ฉฐ, ํ–ฅํ›„์— ์ œ์•ˆํ•˜๋Š” ๋ฐฉ๋ฒ•์ด ์ž„์ƒ์—์„œ ์‚ฌ์šฉ๋˜๊ณ  ์žˆ๋Š” ์ง„๋‹จ ๋ฐ ํ‰๊ฐ€ ๋ฐฉ๋ฒ•์˜ ๋Œ€์•ˆ์ด ๋  ๊ฒƒ์œผ๋กœ ๊ธฐ๋Œ€ํ•œ๋‹ค.Chapter 1 General Introduction 1 1.1 Deep Learning for Medical Image Analysis 1 1.2 Deep Learning for Colonoscipic Diagnosis 2 1.3 Deep Learning for Robotic Surgical Skill Assessment 3 1.4 Thesis Objectives 5 Chapter 2 Optical Diagnosis of Colorectal Polyps using Deep Learning with Visual Explanations 7 2.1 Introduction 7 2.1.1 Background 7 2.1.2 Needs 8 2.1.3 Related Work 9 2.2 Methods 11 2.2.1 Study Design 11 2.2.2 Dataset 14 2.2.3 Preprocessing 17 2.2.4 Convolutional Neural Networks (CNN) 21 2.2.4.1 Standard CNN 21 2.2.4.2 Search for CNN Architecture 22 2.2.4.3 Searched CNN Training 23 2.2.4.4 Visual Explanation 24 2.2.5 Evaluation of CNN and Endoscopist Performances 25 2.3 Experiments and Results 27 2.3.1 CNN Performance 27 2.3.2 Results of Visual Explanation 31 2.3.3 Endoscopist with CNN Performance 33 2.4 Discussion 45 2.4.1 Research Significance 45 2.4.2 Limitations 47 2.5 Conclusion 49 Chapter 3 Surgical Skill Assessment during Robotic Surgery by Deep Learning-based Surgical Instrument Tracking 50 3.1 Introduction 50 3.1.1 Background 50 3.1.2 Needs 51 3.1.3 Related Work 52 3.2 Methods 56 3.2.1 Study Design 56 3.2.2 Dataset 59 3.2.3 Instance Segmentation Framework 63 3.2.4 Tracking Framework 66 3.2.4.1 Tracker 66 3.2.4.2 Re-identification 68 3.2.5 Surgical Instrument Tip Detection 69 3.2.6 Arm-Indicator Recognition 71 3.2.7 Surgical Skill Prediction Model 71 3.3 Experiments and Results 78 3.3.1 Performance of Instance Segmentation Framework 78 3.3.2 Performance of Tracking Framework 82 3.3.3 Evaluation of Surgical Instruments Trajectory 83 3.3.4 Evaluation of Surgical Skill Prediction Model 86 3.4 Discussion 90 3.4.1 Research 
Significance 90 3.4.2 Limitations 92 3.5 Conclusion 96 Chapter 4 Summary and Future Works 97 4.1 Thesis Summary 97 4.2 Limitations and Future Works 98 Bibliography 100 Abstract in Korean 116 Acknowledgement 119Docto
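Motion metrics of the kind computed from tracked instrument-tip trajectories can be sketched as finite differences over the tip positions. The specific metrics (path length, mean speed, mean squared jerk) and the sampling interval are illustrative assumptions, not the thesis's exact feature set.

```python
import numpy as np

def motion_metrics(tip_xy, dt=1.0 / 30.0):
    """tip_xy: (N, 2) tracked instrument-tip positions, one row per frame.
    dt: frame interval in seconds (30 fps assumed here).
    Returns a dictionary of motion metrics of the kind fed to a
    machine learning-based skill model."""
    step = np.diff(tip_xy, axis=0)            # per-frame displacement
    v = step / dt                             # velocity
    a = np.diff(v, axis=0) / dt               # acceleration
    j = np.diff(a, axis=0) / dt               # jerk (smoothness proxy)
    return {
        "path_length": np.sum(np.linalg.norm(step, axis=1)),
        "mean_speed": np.mean(np.linalg.norm(v, axis=1)),
        "mean_sq_jerk": np.mean(np.sum(j ** 2, axis=1)),
    }
```

On a perfectly straight, constant-speed trajectory the jerk term is zero; jerky, hesitant motion inflates it, which is why such metrics correlate with ratings like OSATS and GEARS.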

    One-shot domain adaptation in video-based assessment of surgical skills

    Deep Learning (DL) has achieved automatic and objective assessment of surgical skills. However, DL models are data-hungry and restricted to their training domain. This prevents them from transitioning to new tasks where data are limited. Hence, domain adaptation is crucial for implementing DL in real life. Here, we propose a meta-learning model, A-VBANet, that can deliver domain-agnostic surgical skill classification via one-shot learning. We develop the A-VBANet on five laparoscopic and robotic surgical simulators. Additionally, we test it on operating room (OR) videos of laparoscopic cholecystectomy. Our model successfully adapts with accuracies up to 99.5% in one-shot and 99.9% in few-shot settings for simulated tasks and 89.7% for laparoscopic cholecystectomy. For the first time, we provide a domain-agnostic procedure for video-based assessment of surgical skills. A significant implication of this approach is that it allows the use of data from surgical simulators to assess performance in the operating room.
    Comment: 12 pages (+9 pages of Supplementary Materials), 4 figures (+2 Supplementary Figures), 2 tables (+5 Supplementary Tables)
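One-shot adaptation can be illustrated with a generic prototypical-network-style step: build one prototype embedding per skill class from a single labeled clip of the new domain, then classify query clips by nearest prototype. This is a common meta-learning scheme sketched for illustration, not necessarily A-VBANet's exact mechanism; the embeddings are assumed to come from a pretrained video encoder.

```python
import numpy as np

def adapt_one_shot(support_embeddings, support_labels):
    """Build one prototype per skill class from the (single) labeled
    support example(s) of the new domain.
    support_embeddings: (K, D) array; support_labels: (K,) class ids."""
    classes = np.unique(support_labels)
    return {c: support_embeddings[support_labels == c].mean(axis=0) for c in classes}

def classify(query_embedding, prototypes):
    """Assign the query clip to the nearest prototype (Euclidean distance)."""
    return min(prototypes, key=lambda c: np.linalg.norm(query_embedding - prototypes[c]))
```

Because adaptation only requires computing prototypes, no gradient updates are needed in the new domain, which is what makes one labeled example per class sufficient.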
    • โ€ฆ
    corecore