1,086 research outputs found
Deep Learning Algorithms with Applications to Video Analytics for A Smart City: A Survey
Deep learning has recently achieved very promising results in a wide range of
areas such as computer vision, speech recognition and natural language
processing. It aims to learn hierarchical representations of data by using deep
architecture models. In a smart city, a lot of data (e.g. videos captured from
many distributed sensors) need to be automatically processed and analyzed. In
this paper, we review deep learning algorithms applied to video analytics in a
smart city, organized by research topic: object detection, object tracking,
face recognition, image classification, and scene labeling.
Comment: 8 pages, 18 figures
Deep Learning in Information Security
Machine learning has a long tradition of helping to solve complex information
security problems that are difficult to solve manually. Machine learning
techniques learn models from data representations to solve a task. These data
representations are hand-crafted by domain experts. Deep Learning is a
sub-field of machine learning, which uses models that are composed of multiple
layers. Consequently, representations that are used to solve a task are learned
from the data instead of being manually designed.
In this survey, we study the use of DL techniques within the domain of
information security. We systematically reviewed 77 papers and present them
from a data-centric perspective. This data-centric perspective reflects one of
the most crucial advantages of DL techniques -- domain independence. If DL
methods succeed in solving problems on a data type in one domain, they will
most likely also succeed on similar data from another domain. Other advantages
of DL methods are unrivaled scalability and efficiency, both in the number of
examples that can be analyzed and in the dimensionality of the input data. DL
methods are generally capable of achieving high performance and generalizing
well.
However, information security is a domain with unique requirements and
challenges. Based on an analysis of the reviewed papers, we point out
shortcomings of DL methods with respect to those requirements and discuss
further research opportunities.
Identifying Synapses Using Deep and Wide Multiscale Recursive Networks
In this work, we propose a learning framework for identifying synapses using
a deep and wide multi-scale recursive (DAWMR) network, previously considered in
image segmentation applications. We apply this approach to electron microscopy
data from invertebrate fly brain tissue. By learning features directly from the
data, we are able to achieve considerable improvements over existing techniques
that rely on a small set of hand-designed features. We show that this system
can reduce the amount of manual annotation required, both in the acquisition of
training data and in the verification of inferred detections.
A Taxonomy of Deep Convolutional Neural Nets for Computer Vision
Traditional architectures for solving computer vision problems and the degree
of success they enjoyed have been heavily reliant on hand-crafted features.
However, of late, deep learning techniques have offered a compelling
alternative -- that of automatically learning problem-specific features. With
this new paradigm, every problem in computer vision is now being re-examined
from a deep learning perspective. Therefore, it has become important to
understand what kind of deep networks are suitable for a given problem.
Although general surveys of this fast-moving paradigm (i.e., deep networks)
exist, a survey specific to computer vision is missing. We specifically
consider one form of deep networks widely used in computer vision -
convolutional neural networks (CNNs). We start with "AlexNet" as our base CNN
and then examine the broad variations proposed over time to suit different
applications. We hope that our recipe-style survey will serve as a guide,
particularly for novice practitioners intending to use deep-learning techniques
for computer vision.
Comment: Published in Frontiers in Robotics and AI (http://goo.gl/6691Bm)
Learning Contextual Dependencies with Convolutional Hierarchical Recurrent Neural Networks
Existing deep convolutional neural networks (CNNs) have shown their great
success on image classification. CNNs mainly consist of convolutional and
pooling layers, both of which are performed on local image areas without
considering the dependencies among different image regions. However, such
dependencies are very important for generating explicit image representation.
In contrast, recurrent neural networks (RNNs) are well known for their ability
to encode contextual information in sequential data, and they require only a
limited number of network parameters. However, general RNNs can hardly be
applied directly to non-sequential data. We therefore propose hierarchical
RNNs (HRNNs). In HRNNs, each RNN layer models spatial dependencies among image
regions at the same scale but different locations, while cross-scale RNN
connections model scale dependencies among regions at the same location but
different scales. Specifically, we propose two recurrent neural network
models: 1) the hierarchical simple recurrent network (HSRN), which is fast and
has low computational cost; and 2) the hierarchical long short-term memory
recurrent network (HLSTM), which performs better than HSRN at the price of
higher computational cost.
In this manuscript, we integrate CNNs with HRNNs to develop end-to-end
convolutional hierarchical recurrent neural networks (C-HRNNs). C-HRNNs not
only exploit the representational power of CNNs, but also efficiently encode
spatial and scale dependencies among different image regions. On four of the
most challenging object/scene image classification benchmarks, our C-HRNNs
achieve state-of-the-art results on Places 205, SUN 397, and MIT Indoor, and
competitive results on ILSVRC 2012.
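The core HRNN idea above can be illustrated with a minimal forward-pass sketch: a simple recurrent network scans region features at a coarse scale, and its pooled hidden state conditions the RNN at a finer scale (the cross-scale connection). All shapes, the scan order, and the mean-pooled conditioning are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

def srn_scan(regions, Wx, Wh, h0=None):
    # Simple recurrent network over a sequence of region features:
    # h_t = tanh(Wx x_t + Wh h_{t-1}) carries spatial context forward.
    h = np.zeros(Wh.shape[0]) if h0 is None else h0
    states = []
    for x in regions:
        h = np.tanh(Wx @ x + Wh @ h)
        states.append(h)
    return np.stack(states)

def hierarchical_scan(coarse, fine, Wx_c, Wh_c, Wx_f, Wh_f, Ws):
    # Cross-scale connection (assumed form): the mean hidden state from
    # the coarse scale initializes the RNN at the finer scale.
    hc = srn_scan(coarse, Wx_c, Wh_c)
    h0 = np.tanh(Ws @ hc.mean(axis=0))
    return srn_scan(fine, Wx_f, Wh_f, h0=h0)
```

A real HSRN/HLSTM would use learned scan orders over a 2D grid of regions and LSTM cells; this sketch only shows how one scale's context can flow into another.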
Transfer of View-manifold Learning to Similarity Perception of Novel Objects
We develop a model of perceptual similarity judgment based on re-training a
deep convolutional neural network (DCNN) that learns to associate different
views of each 3D object, capturing the notion of object persistence and
continuity in our visual experience. The re-training process effectively
performs distance metric learning under the object persistence constraint,
modifying the view-manifold of object representations. It reduces the effective distance
between the representations of different views of the same object without
compromising the distance between those of the views of different objects,
resulting in the untangling of the view-manifolds between individual objects
within the same category and across categories. This untangling enables the
model to discriminate and recognize objects within the same category,
independent of viewpoints. We found that this ability is not limited to the
trained objects, but transfers to novel objects in both trained and untrained
categories, as well as to a variety of completely novel artificial synthetic
objects. This transfer of learning suggests that the modification of distance
metrics in view-manifolds is more general and abstract, likely operating at
the level of parts, and independent of the specific objects or categories
experienced during training. Interestingly, the resulting transformation of
feature representations in the deep network is found to match human perceptual
similarity judgment significantly better than AlexNet, suggesting that object
persistence could be an important constraint in the development of perceptual
similarity judgment in biological neural networks.
Comment: Accepted to ICLR201
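The distance metric learning under the object persistence constraint can be sketched with a contrastive loss on a linear embedding: different views of the same object are pulled together, views of different objects are pushed apart up to a margin. The linear map `W` stands in for the DCNN feature extractor, and the contrastive form is one standard choice, not necessarily the paper's exact loss:

```python
import numpy as np

def contrastive_loss(W, x1, x2, same, margin=1.0):
    # W is a linear embedding standing in for the re-trained DCNN.
    dist = np.linalg.norm(W @ x1 - W @ x2)
    if same:                               # same object, different views
        return 0.5 * dist ** 2             # pull the views together
    return 0.5 * max(0.0, margin - dist) ** 2  # push different objects apart

def grad_step(W, x1, x2, same, lr=0.1, margin=1.0):
    # One analytic gradient-descent step on the contrastive loss.
    v = x1 - x2
    d = W @ v
    dist = np.linalg.norm(d)
    if same:
        g = np.outer(d, v)
    elif 0.0 < dist < margin:
        g = -(margin - dist) / dist * np.outer(d, v)
    else:
        g = np.zeros_like(W)
    return W - lr * g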
Deep Trans-layer Unsupervised Networks for Representation Learning
Learning features from massive unlabelled data is a prevalent topic for
high-level tasks in many machine learning applications. The recent large
improvements on benchmark data sets, achieved by increasingly complex
unsupervised learning methods and deep learning models with many parameters,
usually require many tedious tricks and much expertise to tune. However, the
filters learned by these complex architectures are visually quite similar to
standard hand-crafted features. In this paper, unsupervised learning methods,
such as PCA or auto-encoder, are employed as the building block to learn filter
banks at each layer. The lower layer responses are transferred to the last
layer (trans-layer) to form a more complete representation retaining more
information. In addition, some beneficial methods such as local contrast
normalization and whitening are added to the proposed deep trans-layer networks
to further boost performance. The trans-layer representations are followed by
block histograms with binary encoder schema to learn translation and rotation
invariant representations, which are utilized to do high-level tasks such as
recognition and classification. Compared to traditional deep learning methods,
the implemented feature learning method has far fewer parameters and is
validated in several typical experiments: digit recognition on MNIST and the
MNIST variations, object recognition on the Caltech 101 dataset, and face
verification on the LFW dataset. The deep trans-layer unsupervised learning
achieves 99.45% accuracy on the MNIST dataset, 67.11% accuracy with 15 samples
per class and 75.98% accuracy with 30 samples per class on the Caltech 101
dataset, and 87.10% accuracy on the LFW dataset.
Comment: 21 pages, 3 figures
Nested Invariance Pooling and RBM Hashing for Image Instance Retrieval
The goal of this work is the computation of very compact binary hashes for
image instance retrieval. Our approach has two novel contributions. The first
one is Nested Invariance Pooling (NIP), a method inspired from i-theory, a
mathematical theory for computing group invariant transformations with
feed-forward neural networks. NIP is able to produce compact and
well-performing descriptors with visual representations extracted from
convolutional neural networks. We specifically incorporate scale, translation
and rotation invariances but the scheme can be extended to any arbitrary sets
of transformations. We also show that using moments of increasing order
throughout nesting is important. The NIP descriptors are then hashed to the
target code size (32-256 bits) with a Restricted Boltzmann Machine with a novel
batch-level regularization scheme specifically designed for the purpose of
hashing (RBMH). A thorough empirical evaluation against the state of the art
shows that the results obtained with both the NIP descriptors and the NIP+RBMH
hashes are consistently outstanding across a wide range of datasets.
Comment: Image Instance Retrieval, CNN, Invariant Representation, Hashing,
Unsupervised Learning, Regularization. arXiv admin note: text overlap with
arXiv:1601.0209
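The nesting of moment pooling over transformation groups can be sketched as follows. Given descriptors extracted from scaled, rotated, and translated copies of an image, each group is pooled away in turn with a moment of increasing order. The particular group ordering and the assignment of orders (mean, second moment, max) are illustrative assumptions, not the paper's exact scheme:

```python
import numpy as np

def moment_pool(descs, order, axis):
    # Pool along one transformation axis with a statistical moment:
    # order 1 = mean of magnitudes, order 2 = root mean square,
    # np.inf = max (the limit of increasing order).
    if order == np.inf:
        return descs.max(axis=axis)
    return np.mean(np.abs(descs) ** order, axis=axis) ** (1.0 / order)

def nip_descriptor(grid):
    # grid: (n_scales, n_rotations, n_translations, dim) array of CNN
    # descriptors from transformed copies of the image.
    pooled = moment_pool(grid, 1, axis=2)        # pool out translations
    pooled = moment_pool(pooled, 2, axis=1)      # then rotations
    pooled = moment_pool(pooled, np.inf, axis=0) # then scales
    return pooled / (np.linalg.norm(pooled) + 1e-12)  # L2-normalize
```

Because each pooling stage is invariant to reordering along its axis, the final descriptor is unchanged when, e.g., the translated copies are presented in a different order.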
PCANet: A Simple Deep Learning Baseline for Image Classification?
In this work, we propose a very simple deep learning network for image
classification which comprises only the very basic data processing components:
cascaded principal component analysis (PCA), binary hashing, and block-wise
histograms. In the proposed architecture, PCA is employed to learn multistage
filter banks, followed by simple binary hashing and block histograms for
indexing and pooling. This architecture is thus named a PCA network (PCANet)
and can be designed and learned extremely easily and efficiently. For
comparison and better understanding, we also introduce and study two simple
variations of the PCANet, namely RandNet and LDANet. They share the same
topology as PCANet, but their cascaded filters are either selected randomly or
learned by LDA. We have tested these basic networks extensively on many
benchmark visual datasets for different tasks: LFW for face verification;
MultiPIE, Extended Yale B, AR, and FERET for face recognition; and MNIST for
handwritten digit recognition. Surprisingly, for all tasks, this seemingly
naive PCANet model is on par with state-of-the-art features, whether prefixed,
highly hand-crafted, or carefully learned (by DNNs). Even more surprisingly, it
sets new records for many classification tasks on the Extended Yale B, AR, and
FERET datasets and on MNIST variations. Additional experiments on other public
datasets also demonstrate the potential of PCANet as a simple but highly
competitive baseline for texture classification and object recognition.
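The three PCANet components named above (PCA filter banks, binary hashing, block histograms) can be sketched for a single stage as follows. The patch size, filter count, and block size are illustrative choices, not the paper's settings, and a full PCANet would cascade two such stages:

```python
import numpy as np

def pca_filters(images, patch=5, n_filters=4):
    # Learn a filter bank: collect all overlapping patches, remove each
    # patch's mean, and take the top principal components as filters.
    patches = []
    for img in images:
        for i in range(img.shape[0] - patch + 1):
            for j in range(img.shape[1] - patch + 1):
                p = img[i:i + patch, j:j + patch].ravel()
                patches.append(p - p.mean())
    X = np.stack(patches)                          # (num_patches, patch*patch)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:n_filters].reshape(n_filters, patch, patch)

def convolve_valid(img, f):
    # Plain 'valid' 2D cross-correlation with one filter.
    k = f.shape[0]
    out = np.zeros((img.shape[0] - k + 1, img.shape[1] - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + k, j:j + k] * f)
    return out

def pcanet_features(img, filters, block=7):
    # Binary hashing: threshold each filter response at zero and pack
    # the bits into one integer code per pixel.
    maps = [convolve_valid(img, f) > 0 for f in filters]
    codes = np.zeros(maps[0].shape, dtype=int)
    for bit, m in enumerate(maps):
        codes += (1 << bit) * m
    # Block-wise histograms of the codes form the final descriptor.
    n_bins = 1 << len(filters)
    hists = []
    for i in range(0, codes.shape[0] - block + 1, block):
        for j in range(0, codes.shape[1] - block + 1, block):
            blk = codes[i:i + block, j:j + block]
            hists.append(np.bincount(blk.ravel(), minlength=n_bins))
    return np.concatenate(hists)
```

The whole pipeline involves no backpropagation: the only "training" is one SVD per stage, which is why PCANet is so cheap to learn.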
Application of Deep Learning on Predicting Prognosis of Acute Myeloid Leukemia with Cytogenetics, Age, and Mutations
We explore how Deep Learning (DL) can be utilized to predict prognosis of
acute myeloid leukemia (AML). Out of TCGA (The Cancer Genome Atlas) database,
94 AML cases are used in this study. Input data include age, 10 common
cytogenetic results, and the 23 most common mutations; the output is the
prognosis (time from diagnosis to death, DTD). In our DL network, autoencoders
are stacked to form a hierarchical DL model in which raw data are compressed
and organized and high-level features are extracted. The network is written in
the R language and is designed to predict the prognosis of AML for a given case
(DTD of more than or less than 730 days). The DL network achieves an excellent accuracy of 83% in
predicting prognosis. As a proof-of-concept study, our preliminary results
demonstrate a practical application of DL in future practice of prognostic
prediction using next-gen sequencing (NGS) data.
Comment: 11 pages, 1 table, 1 figure. arXiv admin note: substantial text
overlap with arXiv:1801.0101
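The building block of such a stacked-autoencoder model, a single autoencoder layer that compresses raw clinical features into a code, can be sketched in a few lines. The paper's network is written in R; this is a minimal single-layer NumPy analogue with tied weights and illustrative hyperparameters, not the authors' implementation:

```python
import numpy as np

def train_autoencoder(X, hidden, lr=0.1, epochs=200, seed=0):
    # One tied-weight autoencoder layer: code H = sigmoid(X W + b),
    # reconstruction R = H W^T + c, trained by gradient descent on
    # the mean squared reconstruction error.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = 0.1 * rng.standard_normal((d, hidden))
    b, c = np.zeros(hidden), np.zeros(d)
    for _ in range(epochs):
        H = 1.0 / (1.0 + np.exp(-(X @ W + b)))   # encode
        E = H @ W.T + c - X                      # reconstruction error
        dH = (E @ W) * H * (1.0 - H)             # backprop through sigmoid
        W -= lr / n * (X.T @ dH + E.T @ H)       # tied-weight gradient
        b -= lr / n * dH.sum(axis=0)
        c -= lr / n * E.sum(axis=0)
    return W, b, c

def reconstruct(X, W, b, c):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ W.T + c
```

Stacking means training a second such layer on the codes `H` of the first; the top-level code then feeds a small classifier that predicts DTD above or below 730 days.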