Classification and Retrieval of Digital Pathology Scans: A New Dataset
In this paper, we introduce a new dataset, Kimia Path24, for image
classification and retrieval in digital pathology. We use the whole scan images
of 24 different tissue textures to generate 1,325 test patches of size
1000×1000 (0.5mm×0.5mm). Training data can be generated according
to preferences of algorithm designer and can range from approximately 27,000 to
over 50,000 patches if the preset parameters are adopted. We propose a compound
patch-and-scan accuracy measurement that makes achieving high accuracies quite
challenging. In addition, we set the benchmarking line by applying LBP, a
dictionary approach, and convolutional neural nets (CNNs) and report their
results. The highest accuracy was 41.80% for CNN.
Comment: Accepted for presentation at Workshop for Computer Vision for
Microscopy Image Analysis (CVMI 2017) @ CVPR 2017, Honolulu, Hawaii
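The abstract names a compound patch-and-scan accuracy but does not spell out its formula. As a rough sketch only, one plausible reading is the product of an overall patch accuracy and a per-class (per-scan) accuracy averaged over classes. That reading, and the function name `compound_accuracy`, are assumptions made for illustration and are not taken from the text above.

```python
from collections import defaultdict


def compound_accuracy(true_labels, predicted_labels):
    """Return (patch_acc, scan_acc, total) for two parallel label lists.

    Assumed definitions (not stated in the abstract):
      patch_acc = correct patches / all patches
      scan_acc  = mean over classes of (correct in class / patches in class)
      total     = patch_acc * scan_acc
    """
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and of equal length")

    per_class_total = defaultdict(int)
    per_class_correct = defaultdict(int)
    for truth, guess in zip(true_labels, predicted_labels):
        per_class_total[truth] += 1
        if truth == guess:
            per_class_correct[truth] += 1

    # Overall fraction of correctly classified patches.
    patch_acc = sum(per_class_correct.values()) / len(true_labels)

    # Average of the per-class accuracies, so small classes count equally.
    scan_acc = sum(
        per_class_correct[c] / per_class_total[c] for c in per_class_total
    ) / len(per_class_total)

    return patch_acc, scan_acc, patch_acc * scan_acc


if __name__ == "__main__":
    # Toy labels, invented for this example.
    print(compound_accuracy([0, 0, 0, 1], [0, 0, 1, 1]))
```

Under this reading the product is never larger than either factor, which is consistent with the remark that high accuracies are hard to reach.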
Surgical Phase Recognition of Short Video Shots Based on Temporal Modeling of Deep Features
Recognizing the phases of a laparoscopic surgery (LS) operation from its
video constitutes a fundamental step for efficient content representation,
indexing and retrieval in surgical video databases. In the literature, most
techniques focus on phase segmentation of the entire LS video using
hand-crafted visual features, instrument usage signals, and, more recently,
convolutional neural networks (CNNs). In this paper we address the problem of
phase recognition of short video shots (10s) of the operation, without
utilizing information about the preceding/forthcoming video frames, their phase
labels or the instruments used. We investigate four state-of-the-art CNN
architectures (Alexnet, VGG19, GoogleNet, and ResNet101), for feature
extraction via transfer learning. Visual saliency was employed for selecting
the most informative region of the image as input to the CNN. Video shot
representation was based on two temporal pooling mechanisms. Most importantly,
we investigate the role of 'elapsed time' (from the beginning of the
operation), and we show that inclusion of this feature can increase performance
dramatically (69% vs. 75% mean accuracy). Finally, a long short-term memory
(LSTM) network was trained for video shot classification based on the fusion of
CNN features with 'elapsed time', increasing the accuracy to 86%. Our results
highlight the prominent role of visual saliency, long-range temporal recursion
and 'elapsed time' (a feature so far ignored), for surgical phase recognition.
Comment: 6 pages, 4 figures, 6 tables
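The shot representation described above (per-frame CNN features, temporal pooling, and fusion with 'elapsed time') can be sketched roughly as follows. The abstract does not say which two pooling mechanisms were used, so mean and max pooling are assumed here. The helper names `pool_shot` and `shot_descriptor`, and the normalisation of elapsed time by a hypothetical maximum duration, are likewise illustrative assumptions.

```python
def pool_shot(frame_features, mode="mean"):
    """Collapse a list of equal-length per-frame feature vectors into one vector."""
    if not frame_features:
        raise ValueError("a shot needs at least one frame")
    width = len(frame_features[0])
    if any(len(frame) != width for frame in frame_features):
        raise ValueError("all frames must have the same feature length")

    # Transpose so each tuple holds one feature dimension across all frames.
    columns = list(zip(*frame_features))
    if mode == "mean":
        return [sum(col) / len(col) for col in columns]
    if mode == "max":
        return [max(col) for col in columns]
    raise ValueError(f"unknown pooling mode: {mode}")


def shot_descriptor(frame_features, elapsed_seconds, mode="mean",
                    max_duration_seconds=7200.0):
    """Append a normalised elapsed-time value to the pooled features.

    max_duration_seconds is a hypothetical scaling constant, not a value
    taken from the abstract.
    """
    pooled = pool_shot(frame_features, mode)
    elapsed = min(max(elapsed_seconds / max_duration_seconds, 0.0), 1.0)
    return pooled + [elapsed]


if __name__ == "__main__":
    # Two frames with two features each, invented for this example.
    frames = [[1.0, 4.0], [3.0, 2.0]]
    print(shot_descriptor(frames, elapsed_seconds=3600.0, mode="max"))
```

A descriptor of this kind could be passed to any classifier. The LSTM stage mentioned in the abstract would instead consume a sequence of such vectors and is not sketched here.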