7,799 research outputs found
Efficient video indexing for monitoring disease activity and progression in the upper gastrointestinal tract
Endoscopy is a routine imaging technique used for both diagnosis and
minimally invasive surgical treatment. While the endoscopy video contains a
wealth of information, tools to capture this information for the purpose of
clinical reporting are rather poor. In date, endoscopists do not have any
access to tools that enable them to browse the video data in an efficient and
user friendly manner. Fast and reliable video retrieval methods could for
example, allow them to review data from previous exams and therefore improve
their ability to monitor disease progression. Deep learning provides new
avenues of compressing and indexing video in an extremely efficient manner. In
this study, we propose to use an autoencoder for efficient video compression
and fast retrieval of video images. To boost the accuracy of video image
retrieval and to address data variability like multi-modality and view-point
changes, we propose the integration of a Siamese network. We demonstrate that
our approach is competitive in retrieving images from 3 large scale videos of 3
different patients obtained against the query samples of their previous
diagnosis. Quantitative validation shows that the combined approach yield an
overall improvement of 5% and 8% over classical and variational autoencoders,
respectively.Comment: Accepted at IEEE International Symposium on Biomedical Imaging
(ISBI), 201
Unifying and Merging Well-trained Deep Neural Networks for Inference Stage
We propose a novel method to merge convolutional neural-nets for the
inference stage. Given two well-trained networks that may have different
architectures that handle different tasks, our method aligns the layers of the
original networks and merges them into a unified model by sharing the
representative codes of weights. The shared weights are further re-trained to
fine-tune the performance of the merged model. The proposed method effectively
produces a compact model that may run original tasks simultaneously on
resource-limited devices. As it preserves the general architectures and
leverages the co-used weights of well-trained networks, a substantial training
overhead can be reduced to shorten the system development time. Experimental
results demonstrate a satisfactory performance and validate the effectiveness
of the method.Comment: To appear in the 27th International Joint Conference on Artificial
Intelligence and the 23rd European Conference on Artificial Intelligence,
2018. (IJCAI-ECAI 2018
Strategies for Searching Video Content with Text Queries or Video Examples
The large number of user-generated videos uploaded on to the Internet
everyday has led to many commercial video search engines, which mainly rely on
text metadata for search. However, metadata is often lacking for user-generated
videos, thus these videos are unsearchable by current search engines.
Therefore, content-based video retrieval (CBVR) tackles this metadata-scarcity
problem by directly analyzing the visual and audio streams of each video. CBVR
encompasses multiple research topics, including low-level feature design,
feature fusion, semantic detector training and video search/reranking. We
present novel strategies in these topics to enhance CBVR in both accuracy and
speed under different query inputs, including pure textual queries and query by
video examples. Our proposed strategies have been incorporated into our
submission for the TRECVID 2014 Multimedia Event Detection evaluation, where
our system outperformed other submissions in both text queries and video
example queries, thus demonstrating the effectiveness of our proposed
approaches
EIE: Efficient Inference Engine on Compressed Deep Neural Network
State-of-the-art deep neural networks (DNNs) have hundreds of millions of
connections and are both computationally and memory intensive, making them
difficult to deploy on embedded systems with limited hardware resources and
power budgets. While custom hardware helps the computation, fetching weights
from DRAM is two orders of magnitude more expensive than ALU operations, and
dominates the required power.
Previously proposed 'Deep Compression' makes it possible to fit large DNNs
(AlexNet and VGGNet) fully in on-chip SRAM. This compression is achieved by
pruning the redundant connections and having multiple connections share the
same weight. We propose an energy efficient inference engine (EIE) that
performs inference on this compressed network model and accelerates the
resulting sparse matrix-vector multiplication with weight sharing. Going from
DRAM to SRAM gives EIE 120x energy saving; Exploiting sparsity saves 10x;
Weight sharing gives 8x; Skipping zero activations from ReLU saves another 3x.
Evaluated on nine DNN benchmarks, EIE is 189x and 13x faster when compared to
CPU and GPU implementations of the same DNN without compression. EIE has a
processing power of 102GOPS/s working directly on a compressed network,
corresponding to 3TOPS/s on an uncompressed network, and processes FC layers of
AlexNet at 1.88x10^4 frames/sec with a power dissipation of only 600mW. It is
24,000x and 3,400x more energy efficient than a CPU and GPU respectively.
Compared with DaDianNao, EIE has 2.9x, 19x and 3x better throughput, energy
efficiency and area efficiency.Comment: External Links: TheNextPlatform: http://goo.gl/f7qX0L ; O'Reilly:
https://goo.gl/Id1HNT ; Hacker News: https://goo.gl/KM72SV ; Embedded-vision:
http://goo.gl/joQNg8 ; Talk at NVIDIA GTC'16: http://goo.gl/6wJYvn ; Talk at
Embedded Vision Summit: https://goo.gl/7abFNe ; Talk at Stanford University:
https://goo.gl/6lwuer. Published as a conference paper in ISCA 201
Exploiting Deep Features for Remote Sensing Image Retrieval: A Systematic Investigation
Remote sensing (RS) image retrieval is of great significant for geological
information mining. Over the past two decades, a large amount of research on
this task has been carried out, which mainly focuses on the following three
core issues: feature extraction, similarity metric and relevance feedback. Due
to the complexity and multiformity of ground objects in high-resolution remote
sensing (HRRS) images, there is still room for improvement in the current
retrieval approaches. In this paper, we analyze the three core issues of RS
image retrieval and provide a comprehensive review on existing methods.
Furthermore, for the goal to advance the state-of-the-art in HRRS image
retrieval, we focus on the feature extraction issue and delve how to use
powerful deep representations to address this task. We conduct systematic
investigation on evaluating correlative factors that may affect the performance
of deep features. By optimizing each factor, we acquire remarkable retrieval
results on publicly available HRRS datasets. Finally, we explain the
experimental phenomenon in detail and draw conclusions according to our
analysis. Our work can serve as a guiding role for the research of
content-based RS image retrieval
- …