Evaluating Two-Stream CNN for Video Classification
Videos contain very rich semantic information. Traditional hand-crafted
features are known to be inadequate in analyzing complex video semantics.
Inspired by the huge success of the deep learning methods in analyzing image,
audio and text data, significant efforts are recently being devoted to the
design of deep nets for video analytics. Among the many practical needs,
classifying videos (or video clips) based on their major semantic categories
(e.g., "skiing") is useful in many applications. In this paper, we conduct an
in-depth study to investigate important implementation options that may affect
the performance of deep nets on video classification. Our evaluations are
conducted on top of a recent two-stream convolutional neural network (CNN)
pipeline, which uses both static frames and motion optical flows, and has
demonstrated competitive performance against the state-of-the-art methods. In
order to gain insights and to arrive at a practical guideline, many important
options are studied, including network architectures, model fusion, learning
parameters and the final prediction methods. Based on the evaluations, very
competitive results are attained on two popular video classification
benchmarks. We hope that the discussions and conclusions from this work can
help researchers in related fields to quickly set up a good basis for further
investigations along this very promising direction.
Comment: ACM ICMR'1
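The abstract names model fusion and the final prediction method as options under study but does not spell them out. The following is a minimal sketch of one common choice, assuming per-frame softmax scores are averaged into a video-level score for each stream and the two streams are then combined by a weighted average. The function names, the toy logits, and the weight `w_temporal` are illustrative assumptions, not values taken from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def video_scores(per_frame_logits):
    """Average per-frame class probabilities into one video-level score vector."""
    probs = [softmax(frame) for frame in per_frame_logits]
    n_frames = len(probs)
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / n_frames for c in range(n_classes)]

def fuse_streams(spatial, temporal, w_temporal=0.5):
    """Weighted average of the two streams (w_temporal is an assumed hyperparameter)."""
    return [(1.0 - w_temporal) * s + w_temporal * t
            for s, t in zip(spatial, temporal)]

# Toy logits for two sampled frames and three classes (made-up numbers).
spatial_logits = [[2.0, 1.0, 0.1], [1.5, 1.2, 0.3]]   # static-frame stream
temporal_logits = [[0.2, 2.5, 0.1], [0.1, 2.0, 0.4]]  # optical-flow stream

spatial = video_scores(spatial_logits)
temporal = video_scores(temporal_logits)
fused = fuse_streams(spatial, temporal, w_temporal=0.6)
predicted = max(range(len(fused)), key=fused.__getitem__)
```

In this toy case the spatial stream alone favors class 0, while the fused scores favor class 1. The fusion weight would normally be chosen on validation data.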
Brain Tumor Segmentation with Deep Neural Networks
In this paper, we present a fully automatic brain tumor segmentation method
based on Deep Neural Networks (DNNs). The proposed networks are tailored to
glioblastomas (both low and high grade) pictured in MR images. By their very
nature, these tumors can appear anywhere in the brain and have almost any kind
of shape, size, and contrast. These reasons motivate our exploration of a
machine learning solution that exploits a flexible, high capacity DNN while
being extremely efficient. Here, we give a description of different model
choices that we've found to be necessary for obtaining competitive performance.
We explore in particular different architectures based on Convolutional Neural
Networks (CNN), i.e. DNNs specifically adapted to image data.
We present a novel CNN architecture which differs from those traditionally
used in computer vision. Our CNN exploits both local features as well as more
global contextual features simultaneously. Also, different from most
traditional uses of CNNs, our networks use a final layer that is a
convolutional implementation of a fully connected layer, which allows a 40-fold
speedup. We also describe a 2-phase training procedure that allows us to
tackle difficulties related to the imbalance of tumor labels. Finally, we
explore a cascade architecture in which the output of a basic CNN is treated as
an additional source of information for a subsequent CNN. Results reported on
the 2013 BRATS test dataset reveal that our architecture improves over the
currently published state-of-the-art while being over 30 times faster.
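The "convolutional implementation of a fully connected layer" rests on a simple equivalence: applying the same fully connected layer to every k-by-k patch gives the same outputs as sliding a k-by-k kernel, built from the reshaped weights, over the whole map. The sketch below checks that equivalence on a toy single-channel feature map. The sizes, weights, and function names are illustrative assumptions, and it does not reproduce the paper's architecture or its reported speedup, which comes from sharing computation across overlapping patches in a full network.

```python
def fc_on_patch(patch, weights, bias):
    """Fully connected unit applied to one flattened k x k patch."""
    flat = [v for row in patch for v in row]
    return sum(w * v for w, v in zip(weights, flat)) + bias

def patchwise_outputs(fmap, k, weights, bias):
    """Extract every k x k patch and run the FC unit on each one separately."""
    h, w = len(fmap), len(fmap[0])
    return [[fc_on_patch([row[j:j + k] for row in fmap[i:i + k]], weights, bias)
             for j in range(w - k + 1)]
            for i in range(h - k + 1)]

def conv_outputs(fmap, kernel, bias):
    """'Valid' sliding-window cross-correlation with a k x k kernel."""
    k = len(kernel)
    h, w = len(fmap), len(fmap[0])
    out = []
    for i in range(h - k + 1):
        row = []
        for j in range(w - k + 1):
            acc = bias
            for di in range(k):
                for dj in range(k):
                    acc += kernel[di][dj] * fmap[i + di][j + dj]
            row.append(acc)
        out.append(row)
    return out

# Toy 4 x 4 feature map and an FC unit over 2 x 2 patches (made-up numbers).
fmap = [[1, 2, 0, 1],
        [0, 1, 3, 1],
        [2, 1, 0, 0],
        [1, 0, 1, 2]]
fc_weights = [0.5, -1.0, 0.25, 2.0]
kernel = [fc_weights[0:2], fc_weights[2:4]]  # same weights reshaped to 2 x 2

dense_map = patchwise_outputs(fmap, 2, fc_weights, bias=0.1)
conv_map = conv_outputs(fmap, kernel, bias=0.1)
```

Both calls produce the same 3-by-3 map of outputs up to floating-point rounding, so a whole image can be labeled in one pass instead of patch by patch.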
Joint Visual Denoising and Classification using Deep Learning
Visual restoration and recognition are traditionally addressed in pipeline
fashion, i.e. denoising followed by classification. Instead, observing
correlations between the two tasks (for example, a clearer image leads to
better categorization and vice versa), we propose a joint framework for visual
restoration and recognition for handwritten images, inspired by advances in
deep autoencoder and multi-modality learning. Our model is a 3-pathway deep
architecture with a hidden-layer representation which is shared by multi-inputs
and outputs, and each branch can be composed of a multi-layer deep model. Thus,
visual restoration and classification can be unified using shared
representation via non-linear mapping, and model parameters can be learnt via
backpropagation. Using MNIST and USPS data corrupted with structured noise, the
proposed framework performs at least 20% better in classification than
separate pipelines and yields clearer recovered images. The noise model and
the reproducible source code are available at
https://github.com/ganggit/jointmodel.
Comment: 5 pages, 7 figures, ICIP 201
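The idea of a hidden representation shared between restoration and classification can be sketched as a forward pass in which one encoder feeds two output branches, with a single loss that sums a reconstruction term and a classification term. This is a minimal illustration under stated assumptions: the layer sizes, the fixed toy weights, the squared-error plus cross-entropy loss, and the weighting `alpha` are hypothetical choices, not the published model, and training by backpropagation is not shown.

```python
import math

def linear(x, W, b):
    """Affine map: one output per row of W."""
    return [sum(w * v for w, v in zip(row, x)) + bi for row, bi in zip(W, b)]

def sigmoid_vec(z):
    return [1.0 / (1.0 + math.exp(-v)) for v in z]

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def forward(noisy, params):
    """Noisy input -> shared hidden layer -> (reconstruction, class probabilities)."""
    hidden = sigmoid_vec(linear(noisy, params["W_enc"], params["b_enc"]))
    recon = sigmoid_vec(linear(hidden, params["W_dec"], params["b_dec"]))
    probs = softmax(linear(hidden, params["W_cls"], params["b_cls"]))
    return hidden, recon, probs

def joint_loss(recon, clean, probs, label, alpha=1.0):
    """Mean squared reconstruction error plus alpha times cross-entropy (assumed form)."""
    mse = sum((r - c) ** 2 for r, c in zip(recon, clean)) / len(clean)
    cross_entropy = -math.log(probs[label])
    return mse + alpha * cross_entropy

# Toy model: 4 "pixels", 3 shared hidden units, 2 classes (made-up weights).
params = {
    "W_enc": [[0.5, -0.2, 0.1, 0.3], [0.0, 0.4, -0.3, 0.2], [-0.1, 0.1, 0.6, -0.4]],
    "b_enc": [0.0, 0.1, -0.1],
    "W_dec": [[0.7, -0.1, 0.2], [0.1, 0.5, -0.3], [-0.2, 0.3, 0.6], [0.4, 0.0, -0.1]],
    "b_dec": [0.0, 0.0, 0.0, 0.0],
    "W_cls": [[0.8, -0.5, 0.2], [-0.3, 0.6, 0.1]],
    "b_cls": [0.0, 0.0],
}
clean = [1.0, 0.0, 0.0, 1.0]
noisy = [0.8, 0.3, 0.1, 0.7]  # clean image with hypothetical corruption

hidden, recon, probs = forward(noisy, params)
loss = joint_loss(recon, clean, probs, label=0, alpha=0.5)
```

Because both branches read the same `hidden` vector, gradients from the classification term and the reconstruction term would both update the encoder during training, which is the coupling the abstract describes.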