Evaluating surgical skills from kinematic data using convolutional neural networks
The need for automatic surgical skill assessment is increasing, especially
because manual feedback from senior surgeons observing junior surgeons is
subjective and time-consuming. Automating surgical skill evaluation is
therefore an important step towards improving surgical practice. In this paper,
we designed a Convolutional Neural Network (CNN) to evaluate surgeon skill by
extracting patterns from the motions surgeons perform in robotic surgery. The
proposed method is validated on the JIGSAWS dataset and achieves very
competitive results, with 100% accuracy on the suturing and needle-passing
tasks. While we leveraged the CNN's efficiency, we also mitigated its
black-box effect using class activation maps. This feature allows our method
to automatically highlight which parts of the surgical task influenced the
skill prediction, and can be used to explain the classification and to provide
personalized feedback to the trainee.
Comment: Accepted at MICCAI 2018
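The class activation map (CAM) idea referenced above has a simple form for 1D kinematic data: each time step's contribution to a class score is the filter outputs at that step weighted by the class's output-layer weights. The sketch below is a minimal numpy illustration of that computation, not the authors' implementation; all shapes and names are illustrative assumptions.

```python
import numpy as np

def class_activation_map(feature_maps, class_weights):
    """Compute a 1D class activation map over time.

    feature_maps:  (T, K) array -- outputs of the K filters of the last
                   conv layer, one value per time step (in GAP-style
                   architectures these feed a global average pooling layer).
    class_weights: (K,) array -- output-layer weights connecting each
                   pooled filter to the predicted class.
    Returns a (T,) array: each time step's contribution to the class
    score, normalised to [0, 1].
    """
    cam = feature_maps @ class_weights      # weighted sum over filters
    cam -= cam.min()                        # shift so the minimum is 0
    if cam.max() > 0:
        cam /= cam.max()                    # scale to [0, 1]
    return cam

# Toy example: 8 time steps, 3 filters (random, for illustration only).
rng = np.random.default_rng(0)
fmap = rng.random((8, 3))
w = np.array([0.5, -0.2, 1.0])
cam = class_activation_map(fmap, w)
print(cam.shape)   # (8,)
```

High CAM values mark the portions of the trial that drove the skill prediction, which is what enables the per-trainee feedback described in the abstract.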
SATR-DL: Improving Surgical Skill Assessment and Task Recognition in Robot-assisted Surgery with Deep Neural Networks
Purpose: This paper focuses on automated analysis of surgical motion
profiles for objective skill assessment and task recognition in robot-assisted
surgery. Existing techniques rely heavily on conventional statistical measures
or shallow models based on hand-engineered features and gesture segmentation.
Such approaches require significant expert knowledge, are prone to errors,
and are less efficient in online adaptive training systems. Methods: In this
work, we present an efficient analytic framework with a parallel deep learning
architecture, SATR-DL, to assess trainee expertise and recognize surgical
training activity. Through end-to-end learning, abstract information about
spatial representations and temporal dynamics is obtained jointly and directly
from raw motion sequences. Results: By leveraging a shared high-level
representation, the resulting model successfully recognizes trainee skill and
the surgical tasks of suturing, needle-passing, and knot-tying. We also
explore ensemble classification at the trial level, where SATR-DL surpasses
the state of the art with accuracies of 0.960 and 1.000 in skill assessment
and task recognition, respectively. Conclusion: This study highlights the
potential of SATR-DL to improve efficient data-driven assessment in
intelligent robotic surgery.
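The trial-level ensemble mentioned in the Results can be as simple as aggregating the model's per-window predictions into one label per trial by majority vote. The following is a hedged sketch of that aggregation step only (the abstract does not specify the exact scheme; the function name and voting rule are assumptions):

```python
import numpy as np

def trial_level_vote(window_preds):
    """Aggregate per-window class predictions into a single trial-level
    label by majority vote -- one simple ensemble-at-trial-level scheme."""
    vals, counts = np.unique(window_preds, return_counts=True)
    return vals[counts.argmax()]

# Five windows from one trial, three of which predict class 1.
print(trial_level_vote([0, 1, 1, 2, 1]))   # 1
```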
Deep learning for time series classification: a review
Time Series Classification (TSC) is an important and challenging problem in
data mining. With the increasing availability of time series data, hundreds of
TSC algorithms have been proposed. Among these methods, only a few have
considered Deep Neural Networks (DNNs) for this task. This is surprising, as
deep learning has seen very successful applications in recent years: DNNs have
revolutionized the field of computer vision, especially with the advent of
novel deeper architectures such as residual and convolutional neural networks.
Beyond images, sequential data such as text and audio can also be processed
with DNNs to reach state-of-the-art performance in document classification and
speech recognition. In this article, we study the current state of the art in
deep learning for TSC by presenting an empirical study of the most recent DNN
architectures for TSC. We give an overview of the most successful deep
learning applications in various time series domains under a unified taxonomy
of DNNs for TSC. We also provide an open-source deep learning framework to the
TSC community, in which we implemented each of the compared approaches and
evaluated them on a univariate TSC benchmark (the UCR/UEA archive) and 12
multivariate time series datasets. By training 8,730 deep learning models on
97 time series datasets, we propose the most exhaustive study of DNNs for TSC
to date.
Comment: Accepted at Data Mining and Knowledge Discovery
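A representative architecture family in such benchmarks is the fully convolutional network: convolutions over time, global average pooling, then a linear classifier. The sketch below is a minimal numpy forward pass for a univariate series, intended only to make the pipeline concrete; it is not any specific benchmarked model, and the single-layer shape, ReLU choice, and names are assumptions.

```python
import numpy as np

def conv1d(x, kernels):
    """'Same'-padded 1D convolution of a univariate series with K kernels
    of odd width W, followed by ReLU.  x: (T,), kernels: (K, W).
    Returns (T, K) feature maps."""
    K, W = kernels.shape
    pad = W // 2
    xp = np.pad(x, (pad, pad))
    T = len(x)
    out = np.empty((T, K))
    for k in range(K):
        for t in range(T):
            out[t, k] = xp[t:t + W] @ kernels[k]
    return np.maximum(out, 0.0)

def fcn_forward(x, kernels, class_weights, bias):
    """Minimal fully convolutional classifier:
    conv -> ReLU -> global average pooling -> linear -> argmax."""
    fmap = conv1d(x, kernels)               # (T, K)
    pooled = fmap.mean(axis=0)              # global average pooling, (K,)
    scores = class_weights @ pooled + bias  # (C,) class scores
    return scores.argmax()

# Toy series with a bump in the middle, one moving-sum filter, two classes.
x = np.array([0., 0., 1., 1., 0., 0.])
kernels = np.array([[1., 1., 1.]])
class_weights = np.array([[1.0], [-1.0]])
bias = np.zeros(2)
pred = fcn_forward(x, kernels, class_weights, bias)
print(pred)   # 0
```

Global average pooling is what makes such networks length-agnostic and is also what makes class-activation-style explanations straightforward.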
Deep learning for time series classification
Time series analysis is a field of data science concerned with analyzing
sequences of numerical values ordered in time. Time series are particularly
interesting because they allow us to visualize and understand the evolution of
a process over time. Their analysis can reveal trends, relationships, and
similarities across the data. Numerous fields contain data in the form of time
series: health care (electrocardiograms, blood sugar, etc.), activity
recognition, remote sensing, finance (stock market prices), industry
(sensors), etc. Time series classification consists of constructing
algorithms dedicated to automatically labeling time series data. The
sequential nature of time series requires algorithms able to harness this
temporal property, making existing off-the-shelf machine learning models
designed for traditional tabular data suboptimal for the task. In this
context, deep learning has emerged in recent years as one of the most
effective methods for tackling supervised classification, particularly in the
field of computer vision. The main objective of this thesis was to study and
develop deep neural networks specifically constructed for the classification
of time series data. We carried out the first large-scale experimental study
comparing the existing deep methods and positioning them against other,
non-deep-learning state-of-the-art methods. Subsequently, we made numerous
contributions in this area, notably in the contexts of transfer learning, data
augmentation, ensembling, and adversarial attacks. Finally, we proposed a
novel architecture, based on Google's well-known Inception network, which
ranks among the most efficient to date.
Comment: PhD thesis
The Pros and Cons: Rank-aware Temporal Attention for Skill Determination in Long Videos
We present a new model to determine relative skill from long videos through
learnable temporal attention modules. Skill determination is formulated as a
ranking problem, making it suitable for common and generic tasks. However, for
long videos, parts of the video are irrelevant for assessing skill, and the
skill exhibited may vary throughout a video. We therefore propose a method
that assesses the relative overall level of skill in a long video by attending
to its skill-relevant parts. Our approach trains temporal attention modules,
learned with only video-level supervision, using a novel rank-aware loss
function. In addition to attending to task-relevant video parts, our proposed
loss jointly trains two attention modules to separately attend to video parts
indicative of higher (pros) and lower (cons) skill. We evaluate our approach
on the EPIC-Skills dataset and additionally annotate a larger dataset of
YouTube videos for skill determination on five previously unexplored tasks.
Our method outperforms previous approaches and classic softmax attention on
both datasets by over 4% pairwise accuracy, and by as much as 12% on
individual tasks. We also demonstrate our model's ability to attend to
rank-aware parts of the video.
Comment: Accepted at CVPR 2019
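The two ingredients described above, temporal attention pooling over video segments and a ranking objective over video pairs, can be sketched in a few lines. This is a simplified numpy illustration under assumed names and a plain margin ranking loss, not the paper's rank-aware loss or its pros/cons module pair:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def attended_score(segments, attn_w, score_w):
    """Temporal attention pooling: score each segment, softmax the scores
    into attention weights, pool the features, then score the video.
    segments: (N, D) per-segment features; attn_w, score_w: (D,)."""
    a = softmax(segments @ attn_w)   # attention over the N segments
    video_feat = a @ segments        # (D,) attention-weighted pooling
    return video_feat @ score_w      # scalar skill score

def margin_ranking_loss(s_high, s_low, margin=1.0):
    """Hinge loss pushing the higher-skill video's score above the
    lower-skill video's score by at least `margin`."""
    return max(0.0, margin - (s_high - s_low))

# Toy pair: the 'high' video has a segment aligned with the scoring weights.
high = np.array([[1., 0.], [0., 1.]])
low = np.array([[0., 1.], [0., 1.]])
attn_w = np.array([1., 0.])
score_w = np.array([1., 0.])
loss = margin_ranking_loss(attended_score(high, attn_w, score_w),
                           attended_score(low, attn_w, score_w))
```

Because supervision is only the pairwise ranking, the attention weights are free to concentrate on whichever segments best explain the skill ordering, which is the mechanism the paper exploits with its separate pros and cons modules.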
Objective and automated assessment of surgical technical skills with IoT systems: A systematic literature review
The assessment of the surgical technical skills to be acquired by novice surgeons has traditionally been done by an expert surgeon and is therefore of a subjective nature. Nevertheless, recent advances in IoT, the possibility of incorporating sensors into objects and environments in order to collect large amounts of data, and progress in machine learning are facilitating a more objective and automated assessment of surgical technical skills. This paper presents a systematic literature review of papers published after 2013 discussing the objective and automated assessment of surgical technical skills. 101 out of an initial list of 537 papers were analyzed to identify: 1) the sensors used; 2) the data collected by these sensors and the relationship between these data, surgical technical skills, and surgeons' levels of expertise; 3) the statistical methods and algorithms used to process these data; and 4) the feedback provided based on the outputs of these statistical methods and algorithms.
In particular: 1) mechanical and electromagnetic sensors are widely used for tool tracking, while inertial measurement units are widely used for body tracking; 2) path length, number of sub-movements, smoothness, fixation, saccade, and total time are the main indicators obtained from raw data, and serve to assess surgical technical skills such as economy, efficiency, hand tremor, or mind control, and to distinguish between two or three levels of expertise (novice/intermediate/advanced surgeons); 3) SVMs (Support Vector Machines) and neural networks are the preferred statistical methods and algorithms for processing the collected data, while new opportunities are opening up to combine various algorithms and use deep learning; and 4) feedback is provided by matching performance indicators to a lexicon of words and visualizations, although there is considerable room for research on feedback and visualizations, taking, for example, ideas from learning analytics.
This work was supported in part by the FEDER/Ministerio de Ciencia, Innovación y Universidades; Agencia Estatal de Investigación, through the Smartlet Project under Grant TIN2017-85179-C3-1-R, and in part by the Madrid Regional Government through the e-Madrid-CM Project under Grant S2018/TCS-4307, a project co-funded by the European Structural Funds (FSE and FEDER). Partial support has also been received from the European Commission through Erasmus+ Capacity Building in the Field of Higher Education projects, more specifically through projects LALA (586120-EPP-1-2017-1-ES-EPPKA2-CBHE-JP), InnovaT (598758-EPP-1-2018-1-AT-EPPKA2-CBHE-JP), and PROF-XXI (609767-EPP-1-2019-1-ES-EPPKA2-CBHE-JP).
A computational framework for unsupervised analysis of everyday human activities
In order to make computers proactive and assistive, we must enable them to perceive, learn, and predict what is happening in their surroundings. This presents us with the challenge of formalizing computational models of everyday human activities. For a majority of environments, the structure of the in situ activities is generally not known a priori. This thesis therefore investigates knowledge representations and manipulation techniques that can facilitate learning of such everyday human activities in a minimally supervised manner.
A key step towards this end is finding appropriate representations for human activities. We posit that if we choose to describe activities as finite sequences of an appropriate set of events, then the global structure of these activities can be uniquely encoded using their local event subsequences. With this perspective at hand, we particularly investigate representations that characterize activities in terms of their fixed- and variable-length event subsequences. We comparatively analyze these representations in terms of their representational scope, feature cardinality, and noise sensitivity.
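The fixed-length event-subsequence representation described above amounts to counting an activity's contiguous n-grams of events. The following is a minimal sketch of that featurization (function name and event labels are illustrative, not from the thesis):

```python
from collections import Counter

def ngram_histogram(events, n):
    """Represent an activity (a finite event sequence) by the counts of
    its length-n contiguous event subsequences (n-grams)."""
    return Counter(tuple(events[i:i + n])
                   for i in range(len(events) - n + 1))

# Toy activity: a sequence of 6 events yields 5 bigrams.
a = ["enter", "sit", "type", "type", "stand", "leave"]
h = ngram_histogram(a, 2)
print(h[("type", "type")])   # 1
```

Two activities can then be compared by the similarity of their histograms, which is the kind of local-to-global encoding the paragraph above argues for; larger n widens the representational scope at the cost of feature cardinality and noise sensitivity.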
Exploiting such representations, we propose a computational framework to discover the various activity-classes taking place in an environment. We model these activity-classes as maximally similar activity-cliques in a completely connected graph of activities, and describe how to discover them efficiently. Moreover, we propose methods for finding concise characterizations of these discovered activity-classes, both from a holistic as well as a by-parts perspective. Using such characterizations, we present an incremental method to classify
a new activity instance to one of the discovered activity-classes, and to automatically detect if it is anomalous with respect to the general characteristics of its membership class. Our results show the efficacy of our framework in a variety of everyday environments.
Ph.D. Committee Chair: Aaron Bobick; Committee Member: Charles Isbell; Committee Member: David Hogg; Committee Member: Irfan Essa; Committee Member: James Rehg
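The discovery step, finding groups of maximally similar activities in a fully connected similarity graph, can be given a rough shape as follows. This is a toy sketch only: cosine similarity, a hard threshold, and a greedy clique expansion stand in for the thesis's actual clique-discovery procedure, and every name here is an assumption.

```python
import numpy as np

def cosine_sim(u, v):
    return (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

def similarity_graph(features, threshold):
    """Adjacency matrix linking activities whose feature histograms are
    more similar than `threshold`."""
    n = len(features)
    adj = np.zeros((n, n), dtype=bool)
    for i in range(n):
        for j in range(i + 1, n):
            if cosine_sim(features[i], features[j]) >= threshold:
                adj[i, j] = adj[j, i] = True
    return adj

def greedy_clique(adj, seed):
    """Greedily grow a clique from `seed`: admit a node only if it is
    connected to every current member -- a rough stand-in for the
    maximally-similar activity-clique discovery described above."""
    clique = [seed]
    for v in range(len(adj)):
        if v not in clique and all(adj[v, u] for u in clique):
            clique.append(v)
    return clique

# Three mutually similar activities plus one outlier.
features = np.array([[1.0, 0.0], [0.9, 0.1], [1.0, 0.05], [0.0, 1.0]])
adj = similarity_graph(features, threshold=0.95)
print(greedy_clique(adj, 0))   # [0, 1, 2]
```

Each recovered clique plays the role of a discovered activity-class; a new instance can then be assigned to the class whose members it is most similar to, or flagged as anomalous when no class fits.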