Who's Better? Who's Best? Pairwise Deep Ranking for Skill Determination
We present a method for assessing skill from video, applicable to a variety
of tasks, ranging from surgery to drawing and rolling pizza dough. We formulate
the problem as pairwise (who's better?) and overall (who's best?) ranking of
video collections, using supervised deep ranking. We propose a novel loss
function that learns discriminative features when a pair of videos exhibit
variance in skill, and learns shared features when a pair of videos exhibit
comparable skill levels. Results demonstrate our method is applicable across
tasks, with the percentage of correctly ordered pairs of videos ranging from
70% to 83% for four datasets. We demonstrate the robustness of our approach via
sensitivity analysis of its parameters. We see this work as an effort toward the
automated organization of how-to video collections and overall, generic skill
determination in video. Comment: CVPR 201
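The pairwise formulation above can be illustrated with a minimal sketch. It is not the authors' loss function: it assumes a generic hinge-style margin loss for pairs that differ in skill, a hypothetical similarity term for pairs of comparable skill, and the "percentage of correctly ordered pairs" metric computed from scalar skill scores. All function names and the margin value are illustrative assumptions.

```python
def ranking_loss(score_better, score_worse, margin=1.0):
    """Hinge loss: zero once the better video outscores the worse one by `margin`."""
    return max(0.0, margin - (score_better - score_worse))


def similarity_loss(score_a, score_b, margin=1.0):
    """Assumed term for comparable pairs: penalize score gaps larger than `margin`."""
    return max(0.0, abs(score_a - score_b) - margin)


def pairwise_accuracy(scores, ordered_pairs):
    """Fraction of (better, worse) pairs whose predicted scores agree with the label."""
    correct = sum(1 for better, worse in ordered_pairs
                  if scores[better] > scores[worse])
    return correct / len(ordered_pairs)


if __name__ == "__main__":
    scores = {"a": 3.0, "b": 2.0, "c": 1.0}  # hypothetical predicted skill scores
    pairs = [("a", "b"), ("b", "c"), ("c", "a")]  # (better, worse) annotations
    print(pairwise_accuracy(scores, pairs))
```

Sorting videos by their scalar score then gives the overall ("who's best?") ranking, while the pairwise accuracy corresponds to the evaluation measure quoted in the abstract.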
Deep learning for time series classification: a review
Time Series Classification (TSC) is an important and challenging problem in
data mining. With the increase of time series data availability, hundreds of
TSC algorithms have been proposed. Among these methods, only a few have
considered Deep Neural Networks (DNNs) to perform this task. This is surprising
as deep learning has seen very successful applications in recent years. DNNs
have indeed revolutionized the field of computer vision especially with the
advent of novel deeper architectures such as Residual and Convolutional Neural
Networks. Apart from images, sequential data such as text and audio can also be
processed with DNNs to reach state-of-the-art performance for document
classification and speech recognition. In this article, we study the current
state-of-the-art performance of deep learning algorithms for TSC by presenting
an empirical study of the most recent DNN architectures for TSC. We give an
overview of the most successful deep learning applications in various time
series domains under a unified taxonomy of DNNs for TSC. We also provide an
open source deep learning framework to the TSC community where we implemented
each of the compared approaches and evaluated them on a univariate TSC
benchmark (the UCR/UEA archive) and 12 multivariate time series datasets. By
training 8,730 deep learning models on 97 time series datasets, we propose the
most exhaustive study of DNNs for TSC to date. Comment: Accepted at Data Mining and Knowledge Discover
Adversarial Attacks on Deep Neural Networks for Time Series Classification
Time Series Classification (TSC) problems are encountered in many real life
data mining tasks ranging from medicine and security to human activity
recognition and food safety. With the recent success of deep neural networks in
various domains such as computer vision and natural language processing,
researchers started adopting these techniques for solving time series data
mining problems. However, to the best of our knowledge, no previous work has
considered the vulnerability of deep learning models to adversarial time series
examples, which could potentially make them unreliable in situations where the
decision taken by the classifier is crucial such as in medicine and security.
For computer vision problems, such attacks have been shown to be very easy to
perform by altering the image and adding an imperceptible amount of noise to
trick the network into wrongly classifying the input image. Following this line
of work, we propose to leverage existing adversarial attack mechanisms to add a
special noise to the input time series in order to decrease the network's
confidence when classifying instances at test time. Our results reveal that
current state-of-the-art deep learning time series classifiers are vulnerable
to adversarial attacks which can have major consequences in multiple domains
such as food safety and quality assurance. Comment: Accepted at IJCNN 201
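As a rough illustration of how an existing attack mechanism can be applied to a time series, the sketch below uses the fast gradient sign method (FGSM) against a toy logistic classifier. The abstract does not name the attacks used, so the choice of FGSM, the linear model, the weights, and the epsilon value are all assumptions made for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(series, weights, bias):
    """Probability of class 1 under a toy logistic model (a stand-in for a DNN)."""
    return sigmoid(sum(w * x for w, x in zip(weights, series)) + bias)


def sign(value):
    return (value > 0) - (value < 0)


def fgsm_perturb(series, label, weights, bias, epsilon):
    """Shift every time step by +/- epsilon in the direction that increases the loss.

    For logistic regression with cross-entropy loss, dLoss/dx_t = (p - label) * w_t.
    """
    p = predict_proba(series, weights, bias)
    grad = [(p - label) * w for w in weights]
    return [x + epsilon * sign(g) for x, g in zip(series, grad)]


if __name__ == "__main__":
    series = [0.5, 1.0, -0.2, 0.8]    # hypothetical input time series
    weights = [1.0, -0.5, 0.3, 0.8]   # hypothetical trained weights
    adversarial = fgsm_perturb(series, 1, weights, 0.0, epsilon=0.1)
    print(predict_proba(series, weights, 0.0),
          predict_proba(adversarial, weights, 0.0))
```

The perturbation is bounded by epsilon at every time step, yet the classifier's confidence in the true class drops, which is the effect the abstract describes.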
Surgical Phase Recognition of Short Video Shots Based on Temporal Modeling of Deep Features
Recognizing the phases of a laparoscopic surgery (LS) operation from its
video constitutes a fundamental step for efficient content representation,
indexing and retrieval in surgical video databases. In the literature, most
techniques focus on phase segmentation of the entire LS video using
hand-crafted visual features, instrument usage signals, and recently
convolutional neural networks (CNNs). In this paper we address the problem of
phase recognition of short video shots (10s) of the operation, without
utilizing information about the preceding/forthcoming video frames, their phase
labels or the instruments used. We investigate four state-of-the-art CNN
architectures (Alexnet, VGG19, GoogleNet, and ResNet101), for feature
extraction via transfer learning. Visual saliency was employed for selecting
the most informative region of the image as input to the CNN. Video shot
representation was based on two temporal pooling mechanisms. Most importantly,
we investigate the role of 'elapsed time' (from the beginning of the
operation), and we show that inclusion of this feature can increase performance
dramatically (from 69% to 75% mean accuracy). Finally, a long short-term memory
(LSTM) network was trained for video shot classification based on the fusion of
CNN features with 'elapsed time', increasing the accuracy to 86%. Our results
highlight the prominent role of visual saliency, long-range temporal recursion
and 'elapsed time' (a feature so far ignored), for surgical phase recognition. Comment: 6 pages, 4 figures, 6 table
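A shot-level representation of the kind described above might be assembled as in the following sketch. The abstract does not say which two temporal pooling mechanisms were used, so mean and max pooling are assumed here; the normalization of elapsed time by a fixed scale and the function name are likewise hypothetical.

```python
def pool_shot(frame_features, elapsed_seconds, time_scale=3600.0):
    """Build one vector per video shot from per-frame CNN features.

    frame_features: list of equal-length feature lists, one per frame.
    elapsed_seconds: time since the start of the operation.
    time_scale: assumed constant used to bring elapsed time to a small range.
    """
    n_frames = len(frame_features)
    n_dims = len(frame_features[0])
    mean_pool = [sum(f[d] for f in frame_features) / n_frames for d in range(n_dims)]
    max_pool = [max(f[d] for f in frame_features) for d in range(n_dims)]
    # Concatenate both pooled descriptors and append the elapsed-time feature.
    return mean_pool + max_pool + [elapsed_seconds / time_scale]


if __name__ == "__main__":
    frames = [[1.0, 2.0], [3.0, 0.0]]  # two frames, two feature dimensions
    print(pool_shot(frames, elapsed_seconds=1800))
```

Appending elapsed time as one extra dimension keeps the classifier unchanged while giving it a coarse cue about where in the operation the shot occurs.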
Deepr: A Convolutional Net for Medical Records
Feature engineering remains a major bottleneck when creating predictive
systems from electronic medical records. At present, an important missing
element is detecting predictive regular clinical motifs from irregular episodic
records. We present Deepr (short for Deep record), a new end-to-end deep
learning system that learns to extract features from medical records and
predicts future risk automatically. Deepr transforms a record into a sequence
of discrete elements separated by coded time gaps and hospital transfers. On
top of the sequence is a convolutional neural net that detects and combines
predictive local clinical motifs to stratify the risk. Deepr permits
transparent inspection and visualization of its inner working. We validate
Deepr on hospital data to predict unplanned readmission after discharge. Deepr
achieves superior accuracy compared to traditional techniques, detects
meaningful clinical motifs, and uncovers the underlying structure of the
disease and intervention space.
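The record-to-sequence step can be sketched as follows. This is an illustration under assumptions, not Deepr's actual encoding: the gap bins, token names, and input layout (a list of `(day, codes)` visits) are hypothetical, and hospital transfers are omitted.

```python
def gap_token(days):
    """Map the interval between two visits to a discrete token (assumed bins)."""
    if days <= 30:
        return "GAP_0-1m"
    if days <= 90:
        return "GAP_1-3m"
    if days <= 180:
        return "GAP_3-6m"
    return "GAP_6m+"


def record_to_sequence(visits):
    """Flatten irregular visits into one token sequence with coded time gaps.

    visits: list of (day, [codes]) tuples, where day is an integer offset.
    """
    tokens = []
    previous_day = None
    for day, codes in sorted(visits, key=lambda visit: visit[0]):
        if previous_day is not None:
            tokens.append(gap_token(day - previous_day))
        tokens.extend(codes)
        previous_day = day
    return tokens


if __name__ == "__main__":
    record = [(0, ["E11", "I10"]), (45, ["N18"]), (400, ["I50"])]  # made-up codes
    print(record_to_sequence(record))
```

A sequence in this form can be embedded and passed to a one-dimensional convolutional network, whose filters then respond to short runs of codes and gaps, which is the role the abstract assigns to local clinical motifs.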
A Novel Deep Learning Framework for Internal Gross Target Volume Definition from 4D Computed Tomography of Lung Cancer Patients
In this paper, we study the reliability of a novel deep learning framework for internal gross target volume (IGTV) delineation from four-dimensional computed tomography (4DCT), applied to patients with lung cancer treated by Stereotactic Body Radiation Therapy (SBRT). Seventy-seven patients who underwent SBRT followed by 4DCT scans were included in a retrospective study. The IGTV_DL was delineated using a novel deep machine learning algorithm with a linear exhaustive optimal combination framework. For comparison, three other IGTVs based on common methods were also delineated, and we compared the relative volume difference (RVI), matching index (MI) and encompassment index (EI) across these IGTVs. Multiple-parameter regression analysis was then used to assess tumor volume and motion range as clinical factors influencing the variation in MI. Experimental results demonstrated that the deep learning algorithm with the linear exhaustive optimal combination framework has a higher probability of achieving optimal MI than other widely used methods. For patients who, after simple breathing training, keep their respiratory frequency at 10 BMP, the four-phase combination of 0%, 30%, 50% and 90% can be considered a potential candidate for an optimal combination to synthesize the IGTV at all respiration amplitudes.