4,051 research outputs found
Gated networks: an inventory
Gated networks are networks that contain gating connections, in which the
outputs of at least two neurons are multiplied. Initially, gated networks were
used to learn relationships between two input sources, such as pixels from two
images. More recently, they have been applied to learning activity recognition
or multi-modal representations. The aims of this paper are threefold: 1) to
explain the basic computations in gated networks to the non-expert, while
adopting a standpoint that insists on their symmetric nature. 2) to serve as a
quick reference guide to the recent literature, by providing an inventory of
applications of these networks, as well as recent extensions to the basic
architecture. 3) to suggest future research directions and applications.Comment: Unpublished manuscript, 17 page
GLAD: Global-Local-Alignment Descriptor for Pedestrian Retrieval
The huge variance of human pose and the misalignment of detected human images
significantly increase the difficulty of person Re-Identification (Re-ID).
Moreover, efficient Re-ID systems are required to cope with the massive visual
data being produced by video surveillance systems. Targeting to solve these
problems, this work proposes a Global-Local-Alignment Descriptor (GLAD) and an
efficient indexing and retrieval framework, respectively. GLAD explicitly
leverages the local and global cues in human body to generate a discriminative
and robust representation. It consists of part extraction and descriptor
learning modules, where several part regions are first detected and then deep
neural networks are designed for representation learning on both the local and
global regions. A hierarchical indexing and retrieval framework is designed to
eliminate the huge redundancy in the gallery set, and accelerate the online
Re-ID procedure. Extensive experimental results show GLAD achieves competitive
accuracy compared to the state-of-the-art methods. Our retrieval framework
significantly accelerates the online Re-ID procedure without loss of accuracy.
Therefore, this work has potential to work better on person Re-ID tasks in real
scenarios.Comment: Accepted by ACM MM2017, 9 pages, 5 figure
Towards a Spectrum of Graph Convolutional Networks
We present our ongoing work on understanding the limitations of graph
convolutional networks (GCNs) as well as our work on generalizations of graph
convolutions for representing more complex node attribute dependencies. Based
on an analysis of GCNs with the help of the corresponding computation graphs,
we propose a generalization of existing GCNs where the aggregation operations
are (a) determined by structural properties of the local neighborhood graphs
and (b) not restricted to weighted averages. We show that the proposed approach
is strictly more expressive while requiring only a modest increase in the
number of parameters and computations. We also show that the proposed
generalization is identical to standard convolutional layers when applied to
regular grid graphs
A Comprehensive Survey on Graph Neural Networks
Deep learning has revolutionized many machine learning tasks in recent years,
ranging from image classification and video processing to speech recognition
and natural language understanding. The data in these tasks are typically
represented in the Euclidean space. However, there is an increasing number of
applications where data are generated from non-Euclidean domains and are
represented as graphs with complex relationships and interdependency between
objects. The complexity of graph data has imposed significant challenges on
existing machine learning algorithms. Recently, many studies on extending deep
learning approaches for graph data have emerged. In this survey, we provide a
comprehensive overview of graph neural networks (GNNs) in data mining and
machine learning fields. We propose a new taxonomy to divide the
state-of-the-art graph neural networks into four categories, namely recurrent
graph neural networks, convolutional graph neural networks, graph autoencoders,
and spatial-temporal graph neural networks. We further discuss the applications
of graph neural networks across various domains and summarize the open source
codes, benchmark data sets, and model evaluation of graph neural networks.
Finally, we propose potential research directions in this rapidly growing
field.Comment: Minor revision (updated tables and references
Deep EHR: A Survey of Recent Advances in Deep Learning Techniques for Electronic Health Record (EHR) Analysis
The past decade has seen an explosion in the amount of digital information
stored in electronic health records (EHR). While primarily designed for
archiving patient clinical information and administrative healthcare tasks,
many researchers have found secondary use of these records for various clinical
informatics tasks. Over the same period, the machine learning community has
seen widespread advances in deep learning techniques, which also have been
successfully applied to the vast amount of EHR data. In this paper, we review
these deep EHR systems, examining architectures, technical aspects, and
clinical applications. We also identify shortcomings of current techniques and
discuss avenues of future research for EHR-based deep learning.Comment: Accepted for publication with Journal of Biomedical and Health
Informatics: http://ieeexplore.ieee.org/abstract/document/8086133
Sound Event Detection with Sequentially Labelled Data Based on Connectionist Temporal Classification and Unsupervised Clustering
Sound event detection (SED) methods typically rely on either strongly
labelled data or weakly labelled data. As an alternative, sequentially labelled
data (SLD) was proposed. In SLD, the events and the order of events in audio
clips are known, without knowing the occurrence time of events. This paper
proposes a connectionist temporal classification (CTC) based SED system that
uses SLD instead of strongly labelled data, with a novel unsupervised
clustering stage. Experiments on 41 classes of sound events show that the
proposed two-stage method trained on SLD achieves performance comparable to the
previous state-of-the-art SED system trained on strongly labelled data, and is
far better than another state-of-the-art SED system trained on weakly labelled
data, which indicates the effectiveness of the proposed two-stage method
trained on SLD without any onset/offset time of sound events
Deep Learning on Graphs: A Survey
Deep learning has been shown to be successful in a number of domains, ranging
from acoustics, images, to natural language processing. However, applying deep
learning to the ubiquitous graph data is non-trivial because of the unique
characteristics of graphs. Recently, substantial research efforts have been
devoted to applying deep learning methods to graphs, resulting in beneficial
advances in graph analysis techniques. In this survey, we comprehensively
review the different types of deep learning methods on graphs. We divide the
existing methods into five categories based on their model architectures and
training strategies: graph recurrent neural networks, graph convolutional
networks, graph autoencoders, graph reinforcement learning, and graph
adversarial methods. We then provide a comprehensive overview of these methods
in a systematic manner mainly by following their development history. We also
analyze the differences and compositions of different methods. Finally, we
briefly outline the applications in which they have been used and discuss
potential future research directions.Comment: Accepted by Transactions on Knowledge and Data Engineering. 24 pages,
11 figure
Speaker Diarization using Deep Recurrent Convolutional Neural Networks for Speaker Embeddings
In this paper we propose a new method of speaker diarization that employs a
deep learning architecture to learn speaker embeddings. In contrast to the
traditional approaches that build their speaker embeddings using manually
hand-crafted spectral features, we propose to train for this purpose a
recurrent convolutional neural network applied directly on magnitude
spectrograms. To compare our approach with the state of the art, we collect and
release for the public an additional dataset of over 6 hours of fully annotated
broadcast material. The results of our evaluation on the new dataset and three
other benchmark datasets show that our proposed method significantly
outperforms the competitors and reduces diarization error rate by a large
margin of over 30% with respect to the baseline
A Survey on Methods and Theories of Quantized Neural Networks
Deep neural networks are the state-of-the-art methods for many real-world
tasks, such as computer vision, natural language processing and speech
recognition. For all its popularity, deep neural networks are also criticized
for consuming a lot of memory and draining battery life of devices during
training and inference. This makes it hard to deploy these models on mobile or
embedded devices which have tight resource constraints. Quantization is
recognized as one of the most effective approaches to satisfy the extreme
memory requirements that deep neural network models demand. Instead of adopting
32-bit floating point format to represent weights, quantized representations
store weights using more compact formats such as integers or even binary
numbers. Despite a possible degradation in predictive performance, quantization
provides a potential solution to greatly reduce the model size and the energy
consumption. In this survey, we give a thorough review of different aspects of
quantized neural networks. Current challenges and trends of quantized neural
networks are also discussed.Comment: 17 pages, 8 figure
Modified Diversity of Class Probability Estimation Co-training for Hyperspectral Image Classification
Due to the limited amount and imbalanced classes of labeled training data,
the conventional supervised learning can not ensure the discrimination of the
learned feature for hyperspectral image (HSI) classification. In this paper, we
propose a modified diversity of class probability estimation (MDCPE) with two
deep neural networks to learn spectral-spatial feature for HSI classification.
In co-training phase, recurrent neural network (RNN) and convolutional neural
network (CNN) are utilized as two learners to extract features from labeled and
unlabeled data. Based on the extracted features, MDCPE selects most credible
samples to update initial labeled data by combining k-means clustering with the
traditional diversity of class probability estimation (DCPE) co-training. In
this way, MDCPE can keep new labeled data class-balanced and extract
discriminative features for both the minority and majority classes. During
testing process, classification results are acquired by co-decision of the two
learners. Experimental results demonstrate that the proposed semi-supervised
co-training method can make full use of unlabeled information to enhance
generality of the learners and achieve favorable accuracies on all three widely
used data sets: Salinas, Pavia University and Pavia Center.Comment: 13 pages, 10 figures and 8 table
- …