How convolutional neural network see the world - A survey of convolutional neural network visualization methods
Convolutional Neural Networks (CNNs) have achieved impressive performance on many computer vision related tasks, such as object detection, image recognition, and image retrieval. These achievements benefit from the CNNs' outstanding capability to learn input features through deep layers of neuron structures and an iterative training process. However, these learned features are hard to identify and interpret from a human vision perspective, causing a lack of understanding of the CNNs' internal working mechanisms. To improve CNN interpretability, CNN visualization is widely used as a qualitative analysis method, translating internal features into visually perceptible patterns. Many CNN visualization works have been proposed in the literature to interpret CNNs from the perspectives of network structure, operation, and semantic concept. In this paper, we provide a comprehensive survey of several representative CNN visualization methods, including Activation Maximization, Network Inversion, Deconvolutional Neural Networks (DeconvNet), and Network Dissection based visualization. These methods are presented in terms of motivations, algorithms, and experimental results. Based on these visualization methods, we also discuss their practical applications to demonstrate the significance of CNN interpretability in areas of network design, optimization, security enhancement, etc.
Comment: 32 pages, 21 figures. Mathematical Foundations of Computing
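The Activation Maximization method this survey covers can be sketched in a few lines: start from a neutral input and ascend the gradient of a chosen unit's activation until the input strongly excites that unit. The toy linear neuron, step size, and decay term below are illustrative assumptions, not the survey's implementation; in practice the gradient is backpropagated through a trained CNN.

```python
# Minimal sketch of Activation Maximization on a toy linear "neuron".
# All names here are illustrative; real use targets a trained CNN unit
# and ascends an input image by backpropagated gradients.

def neuron(x, w):
    """Toy neuron: activation is the dot product of input and weights."""
    return sum(xi * wi for xi, wi in zip(x, w))

def activation_maximization(w, steps=100, lr=0.1, decay=0.01):
    """Gradient ascent on the input x to maximize neuron(x, w).

    For a linear unit the gradient w.r.t. x is simply w; the L2 decay
    term keeps the synthesized input bounded, as regularizers do in
    practice.
    """
    x = [0.0] * len(w)
    for _ in range(steps):
        grad = w  # d(activation)/dx for a linear unit
        x = [xi + lr * (gi - decay * xi) for xi, gi in zip(x, grad)]
    return x

w = [1.0, -2.0, 0.5]
x_star = activation_maximization(w)
# The optimized input aligns with the sign pattern of the weights.
assert x_star[0] > 0 and x_star[1] < 0 and x_star[2] > 0
```

The synthesized input is the "visually perceptible pattern": it shows what the unit responds to most strongly.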
Visual Analytics in Deep Learning: An Interrogative Survey for the Next Frontiers
Deep learning has recently seen rapid development and received significant
attention due to its state-of-the-art performance on previously-thought hard
problems. However, because of the internal complexity and nonlinear structure
of deep neural networks, the underlying decision making processes for why these
models are achieving such performance are challenging and sometimes mystifying
to interpret. As deep learning spreads across domains, it is of paramount
importance that we equip users of deep learning with tools for understanding
when a model works correctly, when it fails, and ultimately how to improve its
performance. Standardized toolkits for building neural networks have helped
democratize deep learning; visual analytics systems have now been developed to
support model explanation, interpretation, debugging, and improvement. We
present a survey of the role of visual analytics in deep learning research,
which highlights its short yet impactful history and thoroughly summarizes the
state-of-the-art using a human-centered interrogative framework, focusing on
the Five W's and How (Why, Who, What, How, When, and Where). We conclude by
highlighting research directions and open research problems. This survey helps
researchers and practitioners in both visual analytics and deep learning to
quickly learn key aspects of this young and rapidly growing body of research,
whose impact spans a diverse range of domains.
Comment: Under review for IEEE Transactions on Visualization and Computer Graphics (TVCG)
Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting
Spatiotemporal forecasting has various applications in domains such as neuroscience, climate, and transportation. Traffic forecasting is a canonical example of such a learning task. The task is challenging due to (1) complex spatial dependency on
road networks, (2) non-linear temporal dynamics with changing road conditions
and (3) inherent difficulty of long-term forecasting. To address these
challenges, we propose to model the traffic flow as a diffusion process on a
directed graph and introduce Diffusion Convolutional Recurrent Neural Network
(DCRNN), a deep learning framework for traffic forecasting that incorporates
both spatial and temporal dependency in the traffic flow. Specifically, DCRNN
captures the spatial dependency using bidirectional random walks on the graph,
and the temporal dependency using the encoder-decoder architecture with
scheduled sampling. We evaluate the framework on two real-world large scale
road network traffic datasets and observe consistent improvements of 12%-15% over state-of-the-art baselines.
Comment: Published as a conference paper at ICLR 2018
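The diffusion-convolution idea underlying DCRNN, filtering a graph signal with a truncated series of forward and reverse random-walk matrix powers, can be sketched as follows. The tiny graph, the weights `theta_f`/`theta_b`, and the truncation `K` are illustrative assumptions; the learned recurrent encoder-decoder machinery is omitted.

```python
# Sketch of the bidirectional diffusion step DCRNN builds on: a graph
# signal x is filtered with powers of the forward and reverse
# random-walk matrices. The weights would normally be learned.

def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def random_walk_matrix(w):
    """Row-normalize an adjacency/weight matrix: D^{-1} W."""
    return [[wij / max(sum(row), 1e-12) for wij in row] for row in w]

def diffusion_conv(w, x, theta_f, theta_b, K=2):
    """sum_k theta_f[k] (D_O^{-1} W)^k x + theta_b[k] (D_I^{-1} W^T)^k x."""
    wt = [list(col) for col in zip(*w)]  # reverse direction: W^T
    p_f, p_b = random_walk_matrix(w), random_walk_matrix(wt)
    out = [0.0] * len(x)
    xf, xb = x[:], x[:]
    for k in range(K):
        out = [o + theta_f[k] * f + theta_b[k] * b
               for o, f, b in zip(out, xf, xb)]
        xf, xb = matvec(p_f, xf), matvec(p_b, xb)
    return out

# Tiny 3-node directed road graph: 0 -> 1 -> 2
W = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
y = diffusion_conv(W, [1.0, 2.0, 3.0], theta_f=[0.5, 0.5], theta_b=[0.5, 0.5])
```

Using both `W` and `W^T` is what makes the diffusion bidirectional: each node's filtered value mixes in both upstream and downstream traffic.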
Functionality-Oriented Convolutional Filter Pruning
The sophisticated structure of Convolutional Neural Network (CNN) allows for
outstanding performance, but at the cost of intensive computation. As significant redundancies are inevitably present in such a structure, many works have been proposed to prune the convolutional filters for computation cost reduction. Although extremely effective, most works are based only on quantitative characteristics of the convolutional filters, and largely overlook the qualitative interpretation of each filter's specific functionality.
In this work, we interpreted the functionality and redundancy of the
convolutional filters from different perspectives, and proposed a
functionality-oriented filter pruning method. With extensive experiment
results, we proved the convolutional filters' qualitative significance
regardless of magnitude, demonstrated significant neural network redundancy due
to repetitive filter functions, and analyzed filter functionality defects under an inappropriate retraining process. Such an interpretable pruning approach not only offers outstanding computation cost optimization over previous filter pruning methods, but also makes the filter pruning process itself interpretable.
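One way to picture the "repetitive filter functions" redundancy described above, sketched with made-up numbers (this is not the paper's actual criterion): compare flattened filters by cosine similarity and drop near-duplicates, regardless of their magnitude.

```python
# Illustrative sketch of pruning by repeated functionality rather than
# magnitude: flatten each filter, measure pairwise cosine similarity,
# and drop near-duplicates. The threshold and filters are made up.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def prune_duplicates(filters, threshold=0.95):
    """Keep a filter only if no already-kept filter is nearly parallel
    to it; a small-magnitude copy of a kept filter is still redundant."""
    kept = []
    for f in filters:
        if all(abs(cosine(f, k)) < threshold for k in kept):
            kept.append(f)
    return kept

filters = [
    [1.0, 0.0, -1.0],   # an "edge"-like filter
    [0.1, 0.0, -0.1],   # same function, tiny magnitude: redundant
    [0.0, 1.0, 0.0],    # a different function: kept
]
assert len(prune_duplicates(filters)) == 2
```

Note how magnitude-based pruning would remove the second filter for being small, but for the wrong reason: here it is dropped because it duplicates the first filter's function.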
A Brief Survey and an Application of Semantic Image Segmentation for Autonomous Driving
Deep learning is a fast-growing machine learning approach for perceiving and understanding large amounts of data. In this paper, general information is given about the deep learning approach, which has attracted much attention in the field of machine learning in recent years, and an application of semantic image segmentation is carried out to support the autonomous driving of autonomous vehicles. The application is implemented with Fully Convolutional Network (FCN) architectures obtained by modifying Convolutional Neural Network (CNN) architectures based on deep learning. The experimental studies utilize four different FCN architectures: FCN-AlexNet, FCN-8s, FCN-16s, and FCN-32s. For the experimental studies, the FCNs are first trained separately, and the validation accuracies of the trained network models on the dataset are compared. In addition, image segmentation inferences are visualized to show how precisely the FCN architectures can segment objects.
Comment: A chapter for Springer Book: Handbook of Deep Learning Applications, 2018, [Pijush Samui, Editor] (to be published)
Representation Learning on Graphs: Methods and Applications
Machine learning on graphs is an important and ubiquitous task with
applications ranging from drug design to friendship recommendation in social
networks. The primary challenge in this domain is finding a way to represent,
or encode, graph structure so that it can be easily exploited by machine
learning models. Traditionally, machine learning approaches relied on
user-defined heuristics to extract features encoding structural information
about a graph (e.g., degree statistics or kernel functions). However, recent
years have seen a surge in approaches that automatically learn to encode graph
structure into low-dimensional embeddings, using techniques based on deep
learning and nonlinear dimensionality reduction. Here we provide a conceptual
review of key advancements in this area of representation learning on graphs,
including matrix factorization-based methods, random-walk based algorithms, and
graph neural networks. We review methods to embed individual nodes as well as
approaches to embed entire (sub)graphs. In doing so, we develop a unified
framework to describe these recent approaches, and we highlight a number of
important applications and directions for future work.
Comment: Published in the IEEE Data Engineering Bulletin, September 2017; version with minor corrections
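The random-walk based algorithms the review covers share a common first step: sample node "sentences" from the graph and feed them to a skip-gram style model. Only the walk sampling is sketched here; the graph, walk length, and seed are illustrative.

```python
# DeepWalk-style random-walk sampling: each walk is a sequence of
# nodes that follows edges of the graph, later treated like a sentence
# for learning low-dimensional node embeddings.
import random

def random_walk(adj, start, length, rng):
    walk = [start]
    for _ in range(length - 1):
        neighbors = adj[walk[-1]]
        if not neighbors:
            break  # dead end: the walk stops early
        walk.append(rng.choice(neighbors))
    return walk

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: []}
rng = random.Random(0)
walks = [random_walk(adj, n, 5, rng) for n in adj for _ in range(2)]
# Every consecutive pair in every walk is an edge of the graph.
for w in walks:
    assert all(b in adj[a] for a, b in zip(w, w[1:]))
```

Nodes that co-occur frequently in such walks end up with similar embeddings, which is how structural information is encoded without hand-designed features.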
An Attention-Gated Convolutional Neural Network for Sentence Classification
The classification of sentences is very challenging, since sentences contain only limited contextual information. In this paper, we propose an Attention-Gated Convolutional Neural Network (AGCNN) for sentence classification, which generates attention weights from feature context windows of different sizes using specialized convolution encoders. It makes
full use of limited contextual information to extract and enhance the influence
of important features in predicting the sentence's category. Experimental
results demonstrated that our model can achieve up to 3.1% higher accuracy than
standard CNN models, and gain competitive results over the baselines on four
out of the six tasks. In addition, we designed an activation function, namely the Natural Logarithm rescaled Rectified Linear Unit (NLReLU). Experiments showed that NLReLU can outperform ReLU and is comparable to other well-known activation functions on AGCNN.
Comment: Accepted for publication in the Intelligent Data Analysis journal; 19 pages, 4 figures and 5 tables
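Going by its name, NLReLU rescales the ReLU output through the natural logarithm. A minimal sketch, assuming the form ln(β·max(0, x) + 1) with a scale hyperparameter β (this form is an assumption inferred from the name, not quoted from the paper):

```python
# Assumed NLReLU form: ln(beta * max(0, x) + 1). Negative inputs map
# to 0 exactly like ReLU; positive inputs are compressed
# logarithmically, which damps very large activations.
import math

def relu(x):
    return max(0.0, x)

def nlrelu(x, beta=1.0):
    return math.log(beta * relu(x) + 1.0)

assert nlrelu(-2.0) == 0.0                       # matches ReLU on negatives
assert abs(nlrelu(math.e - 1.0) - 1.0) < 1e-12   # ln(e) = 1
assert nlrelu(100.0) < relu(100.0)               # compresses large values
```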
Wind Speed Prediction and Visualization Using Long Short-Term Memory Networks (LSTM)
Climate change is one of the most pressing issues of this century, and emissions from electric power generation are a crucial factor driving the concern. Renewable energy sources are widespread and available globally; however, one of the major challenges is to understand their characteristics in a more informative way. This paper addresses wind speed prediction, which simplifies wind farm planning and feasibility studies. Twelve artificial intelligence algorithms were used to predict wind speed from collected meteorological parameters, and the model performances were compared to determine the prediction accuracy. The results show that a deep learning approach, long short-term memory (LSTM), outperforms the other models with the highest accuracy of 97.8%.
Comment: 10th International Conference on Information Science and Technology (ICIST 2020)
Modeling Latent Attention Within Neural Networks
Deep neural networks are able to solve tasks across a variety of domains and
modalities of data. Despite many empirical successes, we lack the ability to
clearly understand and interpret the learned internal mechanisms that
contribute to such effective behaviors or, more critically, failure modes. In
this work, we present a general method for visualizing an arbitrary neural
network's inner mechanisms and their power and limitations. Our dataset-centric
method produces visualizations of how a trained network attends to components
of its inputs. The computed "attention masks" support improved interpretability
by highlighting which input attributes are critical in determining output. We
demonstrate the effectiveness of our framework on a variety of deep neural
network architectures in domains from computer vision, natural language
processing, and reinforcement learning. The primary contribution of our
approach is an interpretable visualization of attention that provides unique
insights into the network's underlying decision-making process, irrespective of the data modality.
Machine Learning on Graphs: A Model and Comprehensive Taxonomy
There has been a surge of recent interest in learning representations for
graph-structured data. Graph representation learning methods have generally
fallen into three main categories, based on the availability of labeled data.
The first, network embedding (such as shallow graph embedding or graph
auto-encoders), focuses on learning unsupervised representations of relational
structure. The second, graph regularized neural networks, leverages graphs to
augment neural network losses with a regularization objective for
semi-supervised learning. The third, graph neural networks, aims to learn
differentiable functions over discrete topologies with arbitrary structure.
However, despite the popularity of these areas there has been surprisingly
little work on unifying the three paradigms. Here, we aim to bridge the gap
between graph neural networks, network embedding and graph regularization
models. We propose a comprehensive taxonomy of representation learning methods
for graph-structured data, aiming to unify several disparate bodies of work.
Specifically, we propose a Graph Encoder Decoder Model (GRAPHEDM), which
generalizes popular algorithms for semi-supervised learning on graphs (e.g.
GraphSage, Graph Convolutional Networks, Graph Attention Networks), and
unsupervised learning of graph representations (e.g. DeepWalk, node2vec, etc.) into a single consistent approach. To illustrate the generality of this approach, we fit over thirty existing methods into this framework. We believe that this unifying view both provides a solid foundation for understanding the intuition behind these methods, and enables future research in the area.
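The encoder-decoder abstraction behind such a unifying view can be sketched minimally: an encoder maps nodes to embeddings, a decoder scores node pairs, and a loss ties the scores to graph structure. The shallow lookup encoder, inner-product decoder, and margin loss below are illustrative choices, not GRAPHEDM's exact formulation.

```python
# Unifying encoder-decoder sketch for graph representation learning.
# Swapping the encoder (lookup vs. GNN) and the loss (reconstruction
# vs. semi-supervised) recovers different method families.

def encoder(node, Z):
    """Shallow encoder: an embedding lookup, as in DeepWalk/node2vec."""
    return Z[node]

def decoder(zu, zv):
    """Inner-product decoder: a high score suggests a likely edge."""
    return sum(a * b for a, b in zip(zu, zv))

def reconstruction_loss(edges, non_edges, Z, margin=1.0):
    """Push edge scores above non-edge scores by a margin."""
    loss = 0.0
    for (u, v), (p, q) in zip(edges, non_edges):
        s_pos = decoder(encoder(u, Z), encoder(v, Z))
        s_neg = decoder(encoder(p, Z), encoder(q, Z))
        loss += max(0.0, margin - s_pos + s_neg)
    return loss

Z = {0: [1.0, 0.0], 1: [0.9, 0.1], 2: [0.0, 1.0]}
# 0-1 is an edge, 0-2 is not; these embeddings already separate them,
# so the loss is small.
assert reconstruction_loss([(0, 1)], [(0, 2)], Z) < 1.0
```

A graph neural network would replace the lookup `encoder` with message passing over neighbors, while the decoder and loss interfaces stay the same, which is the sense in which one framework covers many methods.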