6,840 research outputs found
ImageGCN: Multi-Relational Image Graph Convolutional Networks for Disease Identification with Chest X-rays
Image representation is a fundamental task in computer vision. However, most
of the existing approaches for image representation ignore the relations
between images and consider each input image independently. Intuitively,
relations between images can help to understand the images and maintain model
consistency over related images. In this paper, we consider modeling the
image-level relations to generate more informative image representations, and
propose ImageGCN, an end-to-end graph convolutional network framework for
multi-relational image modeling. We also apply ImageGCN to chest X-ray (CXR)
images where rich relational information is available for disease
identification. Unlike previous image representation models, ImageGCN learns
the representation of an image using both its original pixel features and the
features of related images. Besides learning informative representations for
images, ImageGCN can also be used for object detection in a weakly supervised
manner. Experimental results on the ChestX-ray14 dataset demonstrate that
ImageGCN outperforms respective baselines in both disease identification and
localization tasks and achieves comparable, and often better, results than
state-of-the-art methods.
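The core idea above, that an image's representation should mix its own pixel features with those of related images, can be sketched as one graph-convolution step. This is a minimal illustration, not the paper's code; the adjacency, feature shapes, and weight matrix are invented for the example.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution step: H' = ReLU(D^-1/2 (A+I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])          # self-loops keep each image's own features
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))  # symmetric degree normalization
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

# Three images; images 0 and 1 share a relation edge (e.g. same patient).
A = np.array([[0., 1., 0.],
              [1., 0., 0.],
              [0., 0., 0.]])
H = np.random.rand(3, 8)   # per-image pixel-derived features
W = np.random.rand(8, 4)   # learnable projection
H_next = gcn_layer(A, H, W)
print(H_next.shape)        # (3, 4)
```

Stacking such layers lets information from related images flow into each representation, which is the multi-relational modeling the abstract describes (with one propagation per relation type in the full model).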
Physically-interpretable classification of biological network dynamics for complex collective motions
Understanding biological network dynamics is a fundamental issue in various
scientific and engineering fields. Network theory is capable of revealing the
relationship between elements and their propagation; however, for complex
collective motions, the network properties often change transiently and in
complex ways. A fundamental question addressed here is how to classify
collective motion networks based on physically interpretable dynamical
properties. Here we apply a data-driven spectral analysis called graph dynamic
mode decomposition, which obtains the dynamical properties for collective
motion classification. Using a ballgame as an example, we classified the
strategic collective motions in different global behaviours and discovered
that, in addition to the physical properties, the contextual node information
was critical for classification. Furthermore, we discovered the label-specific
stronger spectra in the relationship among the nearest agents, providing
physical and semantic interpretations. Our approach contributes to the
understanding of principles of biological complex network dynamics from the
perspective of nonlinear dynamical systems.
Comment: 42 pages with 7 figures and 3 tables. The latest version is published
in Scientific Reports, 202
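Dynamic mode decomposition, the spectral analysis this abstract builds on, extracts eigenvalues and modes of the best-fit linear operator between consecutive snapshots of node signals. The sketch below is a generic exact-DMD implementation on a toy rotation system, not the paper's graph-aware variant; names and the toy data are assumptions.

```python
import numpy as np

def dmd(X, Y, rank):
    """Fit Y ≈ A X and return the DMD eigenvalues and modes of A."""
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank]
    A_tilde = U.conj().T @ Y @ Vh.conj().T @ np.diag(1.0 / s)  # reduced operator
    eigvals, W = np.linalg.eig(A_tilde)
    modes = Y @ Vh.conj().T @ np.diag(1.0 / s) @ W             # exact DMD modes
    return eigvals, modes

# Toy dynamics: a pure rotation, so the true eigenvalues have magnitude 1.
theta = 0.3
A_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
snaps = [np.array([1.0, 0.0])]
for _ in range(20):
    snaps.append(A_true @ snaps[-1])
S = np.array(snaps).T
X, Y = S[:, :-1], S[:, 1:]
eigvals, _ = dmd(X, Y, rank=2)
print(np.abs(eigvals))  # ≈ [1., 1.]: unit-magnitude spectrum of a rotation
```

The magnitudes and phases of such eigenvalues are the "physically interpretable dynamical properties" (growth/decay and oscillation frequency) used as classification features.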
Motif-based Convolutional Neural Network on Graphs
This paper introduces a generalization of Convolutional Neural Networks
(CNNs) to graphs with irregular linkage structures, especially heterogeneous
graphs with typed nodes and schemas. We propose a novel spatial convolution
operation to model the key properties of local connectivity and translation
invariance, using high-order connection patterns or motifs. We develop a novel
deep architecture Motif-CNN that employs an attention model to combine the
features extracted from multiple patterns, thus effectively capturing
high-order structural and feature information. Our experiments on
semi-supervised node classification on real-world social networks and multiple
representative heterogeneous graph datasets indicate significant gains of 6-21%
over existing graph CNNs and other state-of-the-art techniques.
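The "high-order connection patterns" can be made concrete with the standard triangle-motif adjacency, where an edge is weighted by how many triangles it closes, and an attention-style mix over pattern channels. This is an illustrative sketch with invented scores and toy features, not the Motif-CNN implementation.

```python
import numpy as np

def triangle_motif_adjacency(A):
    """M[i, j] counts triangles through edge (i, j): M = A ∘ (A @ A)."""
    return A * (A @ A)

def combine_channels(H_list, scores):
    """Softmax-attention mix of per-motif feature channels (scores assumed learned)."""
    w = np.exp(scores - np.max(scores))
    w /= w.sum()
    return sum(w_c * H_c for w_c, H_c in zip(w, H_list))

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
M = triangle_motif_adjacency(A)
X = np.eye(4)                    # toy one-hot node features
H_edge, H_tri = A @ X, M @ X     # aggregation under the edge motif vs. the triangle motif
H = combine_channels([H_edge, H_tri], scores=np.array([0.0, 0.0]))
print(M[0, 1])  # edge (0, 1) closes exactly one triangle (with node 2) -> 1.0
```

With equal scores the two channels are averaged; learning the scores lets the model weight whichever pattern is informative for each task, which is the role of the attention model in the abstract.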
A literature survey of matrix methods for data science
Efficient numerical linear algebra is a core ingredient in many applications
across almost all scientific and industrial disciplines. With this survey we
want to illustrate that numerical linear algebra has played and is playing a
crucial role in enabling and improving data science computations with many new
developments being fueled by the availability of data and computing resources.
We highlight the role of various factorizations and the power of changing the
representation of the data, and discuss topics such as randomized algorithms,
functions of matrices, and high-dimensional problems. We briefly touch upon the
role of techniques from numerical linear algebra used within deep learning.
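One of the randomized algorithms such surveys cover is the randomized SVD: sketch the range of a matrix with a random test matrix, then decompose the small projected problem. A minimal version (oversampling parameter and test matrix chosen here for illustration):

```python
import numpy as np

def randomized_svd(A, k, oversample=10, seed=0):
    """Halko-style randomized SVD: sample the range of A, then decompose the projection."""
    rng = np.random.default_rng(seed)
    Omega = rng.standard_normal((A.shape[1], k + oversample))
    Q, _ = np.linalg.qr(A @ Omega)     # orthonormal basis for the sampled range
    U_small, s, Vh = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ U_small)[:, :k], s[:k], Vh[:k]

# A matrix of exact rank 5 is recovered essentially to machine precision.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 100))
U, s, Vh = randomized_svd(A, k=5)
err = np.linalg.norm(A - U @ np.diag(s) @ Vh) / np.linalg.norm(A)
print(err < 1e-8)  # True
```

The appeal for data science is cost: only matrix-vector products with A are needed, so the method scales to matrices far too large for a dense SVD.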
Modeling polypharmacy side effects with graph convolutional networks
The use of drug combinations, termed polypharmacy, is common to treat
patients with complex diseases and co-existing conditions. However, a major
consequence of polypharmacy is a much higher risk of adverse side effects for
the patient. Polypharmacy side effects emerge because of drug-drug
interactions, in which activity of one drug may change if taken with another
drug. The knowledge of drug interactions is limited because these complex
relationships are rare, and are usually not observed in relatively small
clinical testing. Discovering polypharmacy side effects thus remains an
important challenge with significant implications for patient mortality. Here,
we present Decagon, an approach for modeling polypharmacy side effects. The
approach constructs a multimodal graph of protein-protein interactions,
drug-protein target interactions, and the polypharmacy side effects, which are
represented as drug-drug interactions, where each side effect is an edge of a
different type. Decagon is developed specifically to handle such multimodal
graphs with a large number of edge types. Our approach develops a new graph
convolutional neural network for multirelational link prediction in multimodal
networks. Decagon predicts the exact side effect, if any, through which a given
drug combination manifests clinically. Decagon accurately predicts polypharmacy
side effects, outperforming baselines by up to 69%. We find that it
automatically learns representations of side effects indicative of
co-occurrence of polypharmacy in patients. Furthermore, Decagon performs
particularly well on side effects with a strong molecular basis, while on
predominantly non-molecular side effects it achieves good performance through
effective sharing of model parameters across edge types. Decagon creates
opportunities to use large pharmacogenomic and patient data to flag and
prioritize side effects for follow-up analysis.
Comment: Presented at ISMB 201
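The multirelational link prediction step can be pictured as scoring each drug pair once per edge type (side effect). Decagon's actual decoder is a tensor factorization; the DistMult-style scorer below is a simplified stand-in, with invented embedding sizes and random parameters, purely for illustration.

```python
import numpy as np

def edge_score(z_i, z_j, d_r):
    """Probability of an edge of type r between i and j: sigmoid(z_i^T diag(d_r) z_j)."""
    return 1.0 / (1.0 + np.exp(-np.sum(z_i * d_r * z_j)))

rng = np.random.default_rng(0)
dim, n_side_effects = 16, 3
z = rng.standard_normal((2, dim))               # drug embeddings (e.g. from a GCN encoder)
D = rng.standard_normal((n_side_effects, dim))  # one relation vector per side-effect type
scores = [edge_score(z[0], z[1], D[r]) for r in range(n_side_effects)]
print([0.0 < s < 1.0 for s in scores])  # all True: each score is a probability
```

Because each edge type has only a small relation-specific parameter vector while the embeddings are shared, rare side effects can still be predicted — the parameter sharing credited in the abstract.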
Exploring Hidden Dimensions in Parallelizing Convolutional Neural Networks
The past few years have witnessed growth in the computational requirements
for training deep convolutional neural networks. Current approaches parallelize
training onto multiple devices by applying a single parallelization strategy
(e.g., data or model parallelism) to all layers in a network. Although easy to
reason about, these approaches result in suboptimal runtime performance in
large-scale distributed training, since different layers in a network may
prefer different parallelization strategies. In this paper, we propose
layer-wise parallelism that allows each layer in a network to use an individual
parallelization strategy. We jointly optimize how each layer is parallelized by
solving a graph search problem. Our evaluation shows that layer-wise
parallelism outperforms state-of-the-art approaches by increasing training
throughput, reducing communication costs, and achieving better scalability to
multiple GPUs, all while maintaining the original network accuracy.
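For a chain-structured network, the joint optimization over per-layer strategies reduces to a shortest-path style dynamic program over (layer, strategy) states. The cost numbers below are made up and the formulation is a simplification of the paper's general graph search, shown only to convey the idea.

```python
def best_strategies(compute_cost, switch_cost):
    """compute_cost[l][s]: time of layer l under strategy s;
    switch_cost[p][s]: cost of re-sharding activations from strategy p to s."""
    n, S = len(compute_cost), len(compute_cost[0])
    best, back = list(compute_cost[0]), []
    for l in range(1, n):
        prev, cur, choices = best, [], []
        for s in range(S):
            cands = [prev[p] + switch_cost[p][s] for p in range(S)]
            p = min(range(S), key=cands.__getitem__)
            cur.append(cands[p] + compute_cost[l][s])
            choices.append(p)
        best = cur
        back.append(choices)
    s = min(range(S), key=best.__getitem__)
    total, plan = best[s], [s]
    for choices in reversed(back):   # backtrack the optimal strategy per layer
        s = choices[s]
        plan.append(s)
    plan.reverse()
    return plan, total

compute = [[1, 3], [5, 2], [1, 4]]  # strategy 0 = data parallel, 1 = model parallel
switch = [[0, 2], [2, 0]]
plan, total = best_strategies(compute, switch)
print(plan, total)  # [0, 0, 0] 7 — staying data-parallel beats paying the switch cost
```

The interesting trade-off is visible in the toy numbers: layer 1 alone prefers model parallelism, but the communication cost of switching twice outweighs the saving.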
Telugu OCR Framework using Deep Learning
In this paper, we address the task of Optical Character Recognition (OCR) for
the Telugu script. We present an end-to-end framework that segments the text
image, classifies the characters and extracts lines using a language model. The
segmentation is based on mathematical morphology. The classification module,
which is the most challenging task of the three, is a deep convolutional neural
network. The language is modelled as a third-order Markov chain at the glyph
level. Telugu script is a complex alphasyllabary and the language is
agglutinative, making the problem hard. In this paper we apply the latest
advances in neural networks to achieve state-of-the-art error rates. We also
review convolutional neural networks in great detail and expound the
statistical justification behind the many tricks needed to make Deep Learning
work.
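A third-order Markov chain at the glyph level means each glyph is predicted from the previous three. The toy "glyph" strings below are placeholders, not Telugu data, and the unsmoothed counting model is a minimal sketch of the idea, not the paper's language model.

```python
from collections import defaultdict

class ThirdOrderGlyphModel:
    """Unsmoothed third-order Markov chain: P(glyph | previous 3 glyphs)."""
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def train(self, seq):
        padded = ["<s>"] * 3 + list(seq)       # pad so early glyphs have a history
        for i in range(3, len(padded)):
            self.counts[tuple(padded[i - 3:i])][padded[i]] += 1

    def prob(self, history, glyph):
        ctx = self.counts[tuple(history)]
        total = sum(ctx.values())
        return ctx[glyph] / total if total else 0.0

lm = ThirdOrderGlyphModel()
lm.train("abcabcabd")
print(lm.prob(["c", "a", "b"], "d"))  # after "cab": "c" once, "d" once -> 0.5
```

In the OCR pipeline such a model rescores the classifier's candidate glyphs so that line extraction prefers sequences that are plausible in the language.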
Deep Learning and Quantum Entanglement: Fundamental Connections with Implications to Network Design
Deep convolutional networks have witnessed unprecedented success in various
machine learning applications. Formal understanding on what makes these
networks so successful is gradually unfolding, but for the most part there are
still significant mysteries to unravel. The inductive bias, which reflects
prior knowledge embedded in the network architecture, is one of them. In this
work, we establish a fundamental connection between the fields of quantum
physics and deep learning. We use this connection for asserting novel
theoretical observations regarding the role that the number of channels in each
layer of the convolutional network fulfills in the overall inductive bias.
Specifically, we show an equivalence between the function realized by a deep
convolutional arithmetic circuit (ConvAC) and a quantum many-body wave
function, which relies on their common underlying tensorial structure. This
facilitates the use of quantum entanglement measures as well-defined
quantifiers of a deep network's expressive ability to model intricate
correlation structures of its inputs. Most importantly, the construction of a
deep ConvAC in terms of a Tensor Network is made available. This description
enables us to carry a graph-theoretic analysis of a convolutional network, with
which we demonstrate a direct control over the inductive bias of the deep
network via its channel numbers, which are related to the min-cut in the
underlying graph. This result is relevant to any practitioner designing a
network for a specific task. We theoretically analyze ConvACs, and empirically
validate our findings on more common ConvNets which involve ReLU activations
and max pooling. Beyond the results described above, the description of a deep
convolutional network in well-defined graph-theoretic tools and the formal
connection to quantum entanglement, are two interdisciplinary bridges that are
brought forth by this work.
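The entanglement measures mentioned above reduce, computationally, to singular values of a matricized coefficient tensor: reshape the tensor with one group of indices as rows, and the entropy of the normalized squared singular values quantifies correlation across that partition. This generic illustration is an assumption-laden sketch of the measure itself, not of ConvACs.

```python
import numpy as np

def entanglement_entropy(T, left_axes):
    """Entropy of the squared singular spectrum of T matricized along `left_axes`."""
    axes = list(left_axes) + [a for a in range(T.ndim) if a not in left_axes]
    rows = int(np.prod([T.shape[a] for a in left_axes]))
    M = np.transpose(T, axes).reshape(rows, -1)
    s = np.linalg.svd(M, compute_uv=False)
    p = (s / np.linalg.norm(s)) ** 2   # normalized squared singular values
    p = p[p > 1e-12]
    return float(-(p * np.log(p)).sum())

# A product (rank-1) tensor has zero entanglement across any partition...
a, b = np.array([1.0, 2.0]), np.array([3.0, 4.0])
T = np.tensordot(a, b, axes=0)
print(round(entanglement_entropy(T, [0]), 4))   # 0.0

# ...while "maximally entangled" coefficients give the maximal entropy log 2.
T2 = np.eye(2) / np.sqrt(2)
print(round(entanglement_entropy(T2, [0]), 4))  # 0.6931
```

A function whose coefficient tensor has high entropy across a partition of the input cannot be represented without enough "bandwidth" (channels) across the corresponding cut, which is the link to min-cuts the abstract describes.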
Deep Learning Approach on Information Diffusion in Heterogeneous Networks
Many real-world knowledge-based networked systems with multi-type interacting
entities, such as human connections and biological evolutions, can be regarded
as heterogeneous networks. One of the main issues in such
networks is to predict information diffusion such as shape, growth and size of
social events and evolutions in the future. While there exist a variety of
works on this topic mainly using a threshold-based approach, they suffer from
the local viewpoint on the network and sensitivity to the threshold parameters.
In this paper, information diffusion is considered through a latent
representation learning of the heterogeneous networks to encode in a deep
learning model. To this end, we propose a novel meta-path representation
learning approach, Heterogeneous Deep Diffusion (HDD), to exploit meta-paths as
main entities in networks. First, the functional heterogeneous structures of
the network are learned as a continuous latent representation by traversing
meta-paths, providing a global end-to-end viewpoint. Then, the
well-known deep learning architectures are employed on our generated features
to predict diffusion processes in the network. The proposed approach enables us
to apply it on different information diffusion tasks such as topic diffusion
and cascade prediction. We demonstrate the proposed approach on benchmark
network datasets through the well-known evaluation measures. The experimental
results show that our approach outperforms the earlier state-of-the-art
methods.
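Traversing a meta-path means walking the heterogeneous graph while constraining each hop to the next node type in a schema such as author-paper-author. The toy graph, node types, and function below are invented for illustration; the abstract's method builds its latent representation on walks of this kind.

```python
import random

def metapath_walk(graph, node_types, start, metapath, rng):
    """Walk from `start`, each hop restricted to the next type in `metapath`."""
    walk = [start]
    for t in metapath[1:]:
        nbrs = [n for n in graph[walk[-1]] if node_types[n] == t]
        if not nbrs:
            break                   # dead end: no neighbor of the required type
        walk.append(rng.choice(nbrs))
    return walk

# Tiny heterogeneous graph: authors a1, a2 connected through paper p1.
graph = {"a1": ["p1"], "a2": ["p1"], "p1": ["a1", "a2"]}
node_types = {"a1": "A", "a2": "A", "p1": "P"}
rng = random.Random(0)
walk = metapath_walk(graph, node_types, "a1", ["A", "P", "A"], rng)
print(walk)  # every hop respects the A-P-A schema, e.g. ['a1', 'p1', 'a2']
```

Feeding many such type-constrained walks into a sequence encoder yields embeddings that respect the network's semantics rather than its raw topology alone.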
TensorFlow: A system for large-scale machine learning
TensorFlow is a machine learning system that operates at large scale and in
heterogeneous environments. TensorFlow uses dataflow graphs to represent
computation, shared state, and the operations that mutate that state. It maps
the nodes of a dataflow graph across many machines in a cluster, and within a
machine across multiple computational devices, including multicore CPUs,
general-purpose GPUs, and custom designed ASICs known as Tensor Processing
Units (TPUs). This architecture gives flexibility to the application developer:
whereas in previous "parameter server" designs the management of shared state
is built into the system, TensorFlow enables developers to experiment with
novel optimizations and training algorithms. TensorFlow supports a variety of
applications, with particularly strong support for training and inference on
deep neural networks. Several Google services use TensorFlow in production, we
have released it as an open-source project, and it has become widely used for
machine learning research. In this paper, we describe the TensorFlow dataflow
model in contrast to existing systems, and demonstrate the compelling
performance that TensorFlow achieves for several real-world applications.
Comment: 18 pages, 9 figures; v2 has a spelling correction in the metadat
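The dataflow model above can be illustrated independently of the TensorFlow API: computation is a graph whose nodes are operations and whose edges name the values flowing between them, and running the graph means evaluating only what the fetched outputs depend on. This toy evaluator in plain Python is a conceptual sketch, not TensorFlow code.

```python
def run(graph, fetches):
    """Evaluate the requested outputs of a dataflow graph of (op, input-names) nodes."""
    cache = {}
    def eval_node(name):
        if name not in cache:                 # each node computed at most once
            op, inputs = graph[name]
            cache[name] = op(*(eval_node(i) for i in inputs))
        return cache[name]
    return [eval_node(f) for f in fetches]

graph = {
    "x":   (lambda: 3.0, []),
    "y":   (lambda: 4.0, []),
    "mul": (lambda a, b: a * b, ["x", "y"]),    # mul = x * y
    "add": (lambda m, b: m + b, ["mul", "y"]),  # add = mul + y
}
print(run(graph, ["add"]))  # [16.0]
```

Because dependencies are explicit, a real system like TensorFlow can partition such a graph across devices and machines and run independent subgraphs in parallel, which is the flexibility the abstract emphasizes.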