Evaluating surgical skills from kinematic data using convolutional neural networks
The need for automatic surgical skill assessment is increasing, especially
because manual feedback from senior surgeons observing junior surgeons is
subjective and time-consuming. Automating surgical skill evaluation is
therefore an important step towards improving surgical practice. In this paper,
we designed a Convolutional Neural Network (CNN) to evaluate surgeon skill by
extracting patterns from the motions surgeons perform in robotic surgery. The
proposed method is validated on the JIGSAWS dataset and achieves very
competitive results, with 100% accuracy on the suturing and needle-passing
tasks. While we leveraged the CNN's efficiency, we also mitigated its
black-box effect using class activation maps. This feature allows our method
to automatically highlight which parts of the surgical task influenced the
skill prediction, and can be used to explain the classification and to provide
personalized feedback to the trainee.
Comment: Accepted at MICCAI 2018
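The class activation map (CAM) idea referenced above has a simple form for 1D kinematic data: each time step's contribution to a class score is the filter outputs at that step weighted by the class's output-layer weights. The sketch below is a minimal numpy illustration of that computation, not the authors' implementation; all shapes and names are illustrative assumptions.

```python
import numpy as np

def class_activation_map(feature_maps, class_weights):
    """Compute a 1D class activation map over time.

    feature_maps:  (T, K) array -- outputs of the K filters of the last
                   conv layer, one value per time step (in GAP-style
                   architectures these feed a global average pooling layer).
    class_weights: (K,) array -- output-layer weights connecting each
                   pooled filter to the predicted class.
    Returns a (T,) array: each time step's contribution to the class
    score, normalised to [0, 1].
    """
    cam = feature_maps @ class_weights      # weighted sum over filters
    cam -= cam.min()                        # shift so the minimum is 0
    if cam.max() > 0:
        cam /= cam.max()                    # scale to [0, 1]
    return cam

# Toy example: 8 time steps, 3 filters (random, for illustration only).
rng = np.random.default_rng(0)
fmap = rng.random((8, 3))
w = np.array([0.5, -0.2, 1.0])
cam = class_activation_map(fmap, w)
print(cam.shape)   # (8,)
```

High CAM values mark the portions of the trial that drove the skill prediction, which is what enables the per-trainee feedback described in the abstract.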
SATR-DL: Improving Surgical Skill Assessment and Task Recognition in Robot-assisted Surgery with Deep Neural Networks
Purpose: This paper focuses on automated analysis of surgical motion
profiles for objective skill assessment and task recognition in robot-assisted
surgery. Existing techniques rely heavily on conventional statistical measures
or shallow models based on hand-engineered features and gesture segmentation.
Such approaches require significant expert knowledge, are prone to errors,
and are less efficient in online adaptive training systems. Methods: In this
work, we present an efficient analytic framework with a parallel deep learning
architecture, SATR-DL, to assess trainee expertise and recognize surgical
training activity. Through end-to-end learning, abstract information about
spatial representations and temporal dynamics is obtained jointly and directly
from raw motion sequences. Results: By leveraging a shared high-level
representation, the resulting model successfully recognizes trainee skill and
the surgical tasks of suturing, needle-passing, and knot-tying. We also
explore ensemble classification at the trial level, where SATR-DL surpasses
the state of the art with accuracies of 0.960 and 1.000 in skill assessment
and task recognition, respectively. Conclusion: This study highlights the
potential of SATR-DL to improve efficient data-driven assessment in
intelligent robotic surgery.
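The trial-level ensemble mentioned in the Results can be as simple as aggregating the model's per-window predictions into one label per trial by majority vote. The following is a hedged sketch of that aggregation step only (the abstract does not specify the exact scheme; the function name and voting rule are assumptions):

```python
import numpy as np

def trial_level_vote(window_preds):
    """Aggregate per-window class predictions into a single trial-level
    label by majority vote -- one simple ensemble-at-trial-level scheme."""
    vals, counts = np.unique(window_preds, return_counts=True)
    return vals[counts.argmax()]

# Five windows from one trial, three of which predict class 1.
print(trial_level_vote([0, 1, 1, 2, 1]))   # 1
```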
Deep learning for time series classification: a review
Time Series Classification (TSC) is an important and challenging problem in
data mining. With the increasing availability of time series data, hundreds of
TSC algorithms have been proposed. Among these methods, only a few have
considered Deep Neural Networks (DNNs) for this task. This is surprising, as
deep learning has seen very successful applications in recent years: DNNs have
revolutionized the field of computer vision, especially with the advent of
novel deeper architectures such as residual and convolutional neural networks.
Beyond images, sequential data such as text and audio can also be processed
with DNNs to reach state-of-the-art performance in document classification and
speech recognition. In this article, we study the current state of the art in
deep learning for TSC by presenting an empirical study of the most recent DNN
architectures for TSC. We give an overview of the most successful deep
learning applications in various time series domains under a unified taxonomy
of DNNs for TSC. We also provide an open-source deep learning framework to the
TSC community, in which we implemented each of the compared approaches and
evaluated them on a univariate TSC benchmark (the UCR/UEA archive) and 12
multivariate time series datasets. By training 8,730 deep learning models on
97 time series datasets, we propose the most exhaustive study of DNNs for TSC
to date.
Comment: Accepted at Data Mining and Knowledge Discovery
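A representative architecture family in such benchmarks is the fully convolutional network: convolutions over time, global average pooling, then a linear classifier. The sketch below is a minimal numpy forward pass for a univariate series, intended only to make the pipeline concrete; it is not any specific benchmarked model, and the single-layer shape, ReLU choice, and names are assumptions.

```python
import numpy as np

def conv1d(x, kernels):
    """'Same'-padded 1D convolution of a univariate series with K kernels
    of odd width W, followed by ReLU.  x: (T,), kernels: (K, W).
    Returns (T, K) feature maps."""
    K, W = kernels.shape
    pad = W // 2
    xp = np.pad(x, (pad, pad))
    T = len(x)
    out = np.empty((T, K))
    for k in range(K):
        for t in range(T):
            out[t, k] = xp[t:t + W] @ kernels[k]
    return np.maximum(out, 0.0)

def fcn_forward(x, kernels, class_weights, bias):
    """Minimal fully convolutional classifier:
    conv -> ReLU -> global average pooling -> linear -> argmax."""
    fmap = conv1d(x, kernels)               # (T, K)
    pooled = fmap.mean(axis=0)              # global average pooling, (K,)
    scores = class_weights @ pooled + bias  # (C,) class scores
    return scores.argmax()

# Toy series with a bump in the middle, one moving-sum filter, two classes.
x = np.array([0., 0., 1., 1., 0., 0.])
kernels = np.array([[1., 1., 1.]])
class_weights = np.array([[1.0], [-1.0]])
bias = np.zeros(2)
pred = fcn_forward(x, kernels, class_weights, bias)
print(pred)   # 0
```

Global average pooling is what makes such networks length-agnostic and is also what makes class-activation-style explanations straightforward.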
Deep learning for time series classification
Time series analysis is a field of data science concerned with analyzing
sequences of numerical values ordered in time. Time series are particularly
interesting because they allow us to visualize and understand the evolution of
a process over time. Their analysis can reveal trends, relationships, and
similarities across the data. Numerous fields contain data in the form of time
series: health care (electrocardiograms, blood sugar, etc.), activity
recognition, remote sensing, finance (stock market prices), industry
(sensors), etc. Time series classification consists of constructing
algorithms dedicated to automatically labeling time series data. The
sequential nature of time series requires algorithms able to harness this
temporal property, making existing off-the-shelf machine learning models
designed for traditional tabular data suboptimal for the task. In this
context, deep learning has emerged in recent years as one of the most
effective methods for tackling supervised classification, particularly in the
field of computer vision. The main objective of this thesis was to study and
develop deep neural networks specifically constructed for the classification
of time series data. We carried out the first large-scale experimental study
comparing the existing deep methods and positioning them against other,
non-deep-learning state-of-the-art methods. Subsequently, we made numerous
contributions in this area, notably in the contexts of transfer learning, data
augmentation, ensembling, and adversarial attacks. Finally, we proposed a
novel architecture, based on Google's well-known Inception network, which
ranks among the most efficient to date.
Comment: PhD thesis
The Pros and Cons: Rank-aware Temporal Attention for Skill Determination in Long Videos
We present a new model to determine relative skill from long videos through
learnable temporal attention modules. Skill determination is formulated as a
ranking problem, making it suitable for common and generic tasks. However, for
long videos, parts of the video are irrelevant for assessing skill, and the
skill exhibited may vary throughout a video. We therefore propose a method
that assesses the relative overall level of skill in a long video by attending
to its skill-relevant parts. Our approach trains temporal attention modules,
learned with only video-level supervision, using a novel rank-aware loss
function. In addition to attending to task-relevant video parts, our proposed
loss jointly trains two attention modules to separately attend to video parts
indicative of higher (pros) and lower (cons) skill. We evaluate our approach
on the EPIC-Skills dataset and additionally annotate a larger dataset of
YouTube videos for skill determination on five previously unexplored tasks.
Our method outperforms previous approaches and classic softmax attention on
both datasets by over 4% pairwise accuracy, and by as much as 12% on
individual tasks. We also demonstrate our model's ability to attend to
rank-aware parts of the video.
Comment: Accepted at CVPR 2019
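The two ingredients described above, temporal attention pooling over video segments and a ranking objective over video pairs, can be sketched in a few lines. This is a simplified numpy illustration under assumed names and a plain margin ranking loss, not the paper's rank-aware loss or its pros/cons module pair:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def attended_score(segments, attn_w, score_w):
    """Temporal attention pooling: score each segment, softmax the scores
    into attention weights, pool the features, then score the video.
    segments: (N, D) per-segment features; attn_w, score_w: (D,)."""
    a = softmax(segments @ attn_w)   # attention over the N segments
    video_feat = a @ segments        # (D,) attention-weighted pooling
    return video_feat @ score_w      # scalar skill score

def margin_ranking_loss(s_high, s_low, margin=1.0):
    """Hinge loss pushing the higher-skill video's score above the
    lower-skill video's score by at least `margin`."""
    return max(0.0, margin - (s_high - s_low))

# Toy pair: the 'high' video has a segment aligned with the scoring weights.
high = np.array([[1., 0.], [0., 1.]])
low = np.array([[0., 1.], [0., 1.]])
attn_w = np.array([1., 0.])
score_w = np.array([1., 0.])
loss = margin_ranking_loss(attended_score(high, attn_w, score_w),
                           attended_score(low, attn_w, score_w))
```

Because supervision is only the pairwise ranking, the attention weights are free to concentrate on whichever segments best explain the skill ordering, which is the mechanism the paper exploits with its separate pros and cons modules.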
Objective and automated assessment of surgical technical skills with IoT systems: A systematic literature review
The assessment of the surgical technical skills to be acquired by novice surgeons has traditionally been done by an expert surgeon and is therefore of a subjective nature. Nevertheless, recent advances in IoT, the possibility of incorporating sensors into objects and environments in order to collect large amounts of data, and progress in machine learning are facilitating a more objective and automated assessment of surgical technical skills. This paper presents a systematic literature review of papers published after 2013 discussing the objective and automated assessment of surgical technical skills. 101 out of an initial list of 537 papers were analyzed to identify: 1) the sensors used; 2) the data collected by these sensors and the relationship between these data, surgical technical skills, and surgeons' levels of expertise; 3) the statistical methods and algorithms used to process these data; and 4) the feedback provided based on the outputs of these statistical methods and algorithms.
In particular: 1) mechanical and electromagnetic sensors are widely used for tool tracking, while inertial measurement units are widely used for body tracking; 2) path length, number of sub-movements, smoothness, fixation, saccade, and total time are the main indicators obtained from raw data, and serve to assess surgical technical skills such as economy, efficiency, hand tremor, or mind control, and to distinguish between two or three levels of expertise (novice/intermediate/advanced surgeons); 3) SVMs (Support Vector Machines) and neural networks are the preferred statistical methods and algorithms for processing the collected data, while new opportunities are opening up to combine various algorithms and use deep learning; and 4) feedback is provided by matching performance indicators to a lexicon of words and visualizations, although there is considerable room for research on feedback and visualizations, taking, for example, ideas from learning analytics.
This work was supported in part by the FEDER/Ministerio de Ciencia, Innovación y Universidades; Agencia Estatal de Investigación, through the Smartlet Project under Grant TIN2017-85179-C3-1-R, and in part by the Madrid Regional Government through the e-Madrid-CM Project under Grant S2018/TCS-4307, a project co-funded by the European Structural Funds (FSE and FEDER). Partial support has also been received from the European Commission through Erasmus+ Capacity Building in the Field of Higher Education projects, more specifically through projects LALA (586120-EPP-1-2017-1-ES-EPPKA2-CBHE-JP), InnovaT (598758-EPP-1-2018-1-AT-EPPKA2-CBHE-JP), and PROF-XXI (609767-EPP-1-2019-1-ES-EPPKA2-CBHE-JP).
A computational framework for unsupervised analysis of everyday human activities
In order to make computers proactive and assistive, we must enable them to perceive, learn, and predict what is happening in their surroundings. This presents us with the challenge of formalizing computational models of everyday human activities. For a majority of environments, the structure of the in situ activities is generally not known a priori. This thesis therefore investigates knowledge representations and manipulation techniques that can facilitate learning of such everyday human activities in a minimally supervised manner.
A key step towards this end is finding appropriate representations for human activities. We posit that if we choose to describe activities as finite sequences of an appropriate set of events, then the global structure of these activities can be uniquely encoded using their local event subsequences. With this perspective at hand, we particularly investigate representations that characterize activities in terms of their fixed- and variable-length event subsequences. We comparatively analyze these representations in terms of their representational scope, feature cardinality, and noise sensitivity.
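The fixed-length event-subsequence representation described above amounts to counting an activity's contiguous n-grams of events. The following is a minimal sketch of that featurization (function name and event labels are illustrative, not from the thesis):

```python
from collections import Counter

def ngram_histogram(events, n):
    """Represent an activity (a finite event sequence) by the counts of
    its length-n contiguous event subsequences (n-grams)."""
    return Counter(tuple(events[i:i + n])
                   for i in range(len(events) - n + 1))

# Toy activity: a sequence of 6 events yields 5 bigrams.
a = ["enter", "sit", "type", "type", "stand", "leave"]
h = ngram_histogram(a, 2)
print(h[("type", "type")])   # 1
```

Two activities can then be compared by the similarity of their histograms, which is the kind of local-to-global encoding the paragraph above argues for; larger n widens the representational scope at the cost of feature cardinality and noise sensitivity.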
Exploiting such representations, we propose a computational framework to discover the various activity-classes taking place in an environment. We model these activity-classes as maximally similar activity-cliques in a completely connected graph of activities, and describe how to discover them efficiently. Moreover, we propose methods for finding concise characterizations of these discovered activity-classes, both from a holistic as well as a by-parts perspective. Using such characterizations, we present an incremental method to classify
a new activity instance to one of the discovered activity-classes, and to automatically detect if it is anomalous with respect to the general characteristics of its membership class. Our results show the efficacy of our framework in a variety of everyday environments.
Ph.D. Committee Chair: Aaron Bobick; Committee Member: Charles Isbell; Committee Member: David Hogg; Committee Member: Irfan Essa; Committee Member: James Rehg
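The discovery step, finding groups of maximally similar activities in a fully connected similarity graph, can be given a rough shape as follows. This is a toy sketch only: cosine similarity, a hard threshold, and a greedy clique expansion stand in for the thesis's actual clique-discovery procedure, and every name here is an assumption.

```python
import numpy as np

def cosine_sim(u, v):
    return (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

def similarity_graph(features, threshold):
    """Adjacency matrix linking activities whose feature histograms are
    more similar than `threshold`."""
    n = len(features)
    adj = np.zeros((n, n), dtype=bool)
    for i in range(n):
        for j in range(i + 1, n):
            if cosine_sim(features[i], features[j]) >= threshold:
                adj[i, j] = adj[j, i] = True
    return adj

def greedy_clique(adj, seed):
    """Greedily grow a clique from `seed`: admit a node only if it is
    connected to every current member -- a rough stand-in for the
    maximally-similar activity-clique discovery described above."""
    clique = [seed]
    for v in range(len(adj)):
        if v not in clique and all(adj[v, u] for u in clique):
            clique.append(v)
    return clique

# Three mutually similar activities plus one outlier.
features = np.array([[1.0, 0.0], [0.9, 0.1], [1.0, 0.05], [0.0, 1.0]])
adj = similarity_graph(features, threshold=0.95)
print(greedy_clique(adj, 0))   # [0, 1, 2]
```

Each recovered clique plays the role of a discovered activity-class; a new instance can then be assigned to the class whose members it is most similar to, or flagged as anomalous when no class fits.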