6,840 research outputs found
ImageGCN: Multi-Relational Image Graph Convolutional Networks for Disease Identification with Chest X-rays
Image representation is a fundamental task in computer vision. However, most
of the existing approaches for image representation ignore the relations
between images and consider each input image independently. Intuitively,
relations between images can help to understand the images and maintain model
consistency over related images. In this paper, we consider modeling the
image-level relations to generate more informative image representations, and
propose ImageGCN, an end-to-end graph convolutional network framework for
multi-relational image modeling. We also apply ImageGCN to chest X-ray (CXR)
images where rich relational information is available for disease
identification. Unlike previous image representation models, ImageGCN learns
the representation of an image using both its original pixel features and the
features of related images. Besides learning informative representations for
images, ImageGCN can also be used for object detection in a weakly supervised
manner. Experimental results on the ChestX-ray14 dataset demonstrate that
ImageGCN outperforms respective baselines in both disease identification and
localization tasks and achieves comparable, and often better, results than
state-of-the-art methods.
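The core idea above, that an image's representation should mix its own pixel features with those of related images, can be sketched as one graph-convolution step. This is a minimal illustration, not the paper's code; the adjacency, feature shapes, and weight matrix are invented for the example.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution step: H' = ReLU(D^-1/2 (A+I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])          # self-loops keep each image's own features
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))  # symmetric degree normalization
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

# Three images; images 0 and 1 share a relation edge (e.g. same patient).
A = np.array([[0., 1., 0.],
              [1., 0., 0.],
              [0., 0., 0.]])
H = np.random.rand(3, 8)   # per-image pixel-derived features
W = np.random.rand(8, 4)   # learnable projection
H_next = gcn_layer(A, H, W)
print(H_next.shape)        # (3, 4)
```

Stacking such layers lets information from related images flow into each representation, which is the multi-relational modeling the abstract describes (with one propagation per relation type in the full model).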
Physically-interpretable classification of biological network dynamics for complex collective motions
Understanding biological network dynamics is a fundamental issue in various
scientific and engineering fields. Network theory is capable of revealing the
relationship between elements and their propagation; however, for complex
collective motions, the network properties often change transiently and in
complex ways. A fundamental question addressed here is how to classify
collective motion networks based on physically interpretable dynamical
properties. Here we apply a data-driven spectral analysis called graph dynamic
mode decomposition, which obtains the dynamical properties for collective
motion classification. Using a ballgame as an example, we classified the
strategic collective motions in different global behaviours and discovered
that, in addition to the physical properties, the contextual node information
was critical for classification. Furthermore, we discovered the label-specific
stronger spectra in the relationship among the nearest agents, providing
physical and semantic interpretations. Our approach contributes to the
understanding of principles of biological complex network dynamics from the
perspective of nonlinear dynamical systems.
Comment: 42 pages with 7 figures and 3 tables. The latest version is published
in Scientific Reports, 202
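Dynamic mode decomposition, the spectral analysis this abstract builds on, extracts eigenvalues and modes of the best-fit linear operator between consecutive snapshots of node signals. The sketch below is a generic exact-DMD implementation on a toy rotation system, not the paper's graph-aware variant; names and the toy data are assumptions.

```python
import numpy as np

def dmd(X, Y, rank):
    """Fit Y ≈ A X and return the DMD eigenvalues and modes of A."""
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank]
    A_tilde = U.conj().T @ Y @ Vh.conj().T @ np.diag(1.0 / s)  # reduced operator
    eigvals, W = np.linalg.eig(A_tilde)
    modes = Y @ Vh.conj().T @ np.diag(1.0 / s) @ W             # exact DMD modes
    return eigvals, modes

# Toy dynamics: a pure rotation, so the true eigenvalues have magnitude 1.
theta = 0.3
A_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
snaps = [np.array([1.0, 0.0])]
for _ in range(20):
    snaps.append(A_true @ snaps[-1])
S = np.array(snaps).T
X, Y = S[:, :-1], S[:, 1:]
eigvals, _ = dmd(X, Y, rank=2)
print(np.abs(eigvals))  # ≈ [1., 1.]: unit-magnitude spectrum of a rotation
```

The magnitudes and phases of such eigenvalues are the "physically interpretable dynamical properties" (growth/decay and oscillation frequency) used as classification features.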
Motif-based Convolutional Neural Network on Graphs
This paper introduces a generalization of Convolutional Neural Networks
(CNNs) to graphs with irregular linkage structures, especially heterogeneous
graphs with typed nodes and schemas. We propose a novel spatial convolution
operation to model the key properties of local connectivity and translation
invariance, using high-order connection patterns or motifs. We develop a novel
deep architecture Motif-CNN that employs an attention model to combine the
features extracted from multiple patterns, thus effectively capturing
high-order structural and feature information. Our experiments on
semi-supervised node classification on real-world social networks and multiple
representative heterogeneous graph datasets indicate significant gains of 6-21%
over existing graph CNNs and other state-of-the-art techniques.
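The "high-order connection patterns" can be made concrete with the standard triangle-motif adjacency, where an edge is weighted by how many triangles it closes, and an attention-style mix over pattern channels. This is an illustrative sketch with invented scores and toy features, not the Motif-CNN implementation.

```python
import numpy as np

def triangle_motif_adjacency(A):
    """M[i, j] counts triangles through edge (i, j): M = A ∘ (A @ A)."""
    return A * (A @ A)

def combine_channels(H_list, scores):
    """Softmax-attention mix of per-motif feature channels (scores assumed learned)."""
    w = np.exp(scores - np.max(scores))
    w /= w.sum()
    return sum(w_c * H_c for w_c, H_c in zip(w, H_list))

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
M = triangle_motif_adjacency(A)
X = np.eye(4)                    # toy one-hot node features
H_edge, H_tri = A @ X, M @ X     # aggregation under the edge motif vs. the triangle motif
H = combine_channels([H_edge, H_tri], scores=np.array([0.0, 0.0]))
print(M[0, 1])  # edge (0, 1) closes exactly one triangle (with node 2) -> 1.0
```

With equal scores the two channels are averaged; learning the scores lets the model weight whichever pattern is informative for each task, which is the role of the attention model in the abstract.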
A literature survey of matrix methods for data science
Efficient numerical linear algebra is a core ingredient in many applications
across almost all scientific and industrial disciplines. With this survey we
want to illustrate that numerical linear algebra has played and is playing a
crucial role in enabling and improving data science computations with many new
developments being fueled by the availability of data and computing resources.
We highlight the role of various factorizations and the power of changing the
representation of the data, and discuss topics such as randomized algorithms,
functions of matrices, and high-dimensional problems. We briefly touch upon the
role of techniques from numerical linear algebra used within deep learning.
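One of the randomized algorithms such surveys cover is the randomized SVD: sketch the range of a matrix with a random test matrix, then decompose the small projected problem. A minimal version (oversampling parameter and test matrix chosen here for illustration):

```python
import numpy as np

def randomized_svd(A, k, oversample=10, seed=0):
    """Halko-style randomized SVD: sample the range of A, then decompose the projection."""
    rng = np.random.default_rng(seed)
    Omega = rng.standard_normal((A.shape[1], k + oversample))
    Q, _ = np.linalg.qr(A @ Omega)     # orthonormal basis for the sampled range
    U_small, s, Vh = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ U_small)[:, :k], s[:k], Vh[:k]

# A matrix of exact rank 5 is recovered essentially to machine precision.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 100))
U, s, Vh = randomized_svd(A, k=5)
err = np.linalg.norm(A - U @ np.diag(s) @ Vh) / np.linalg.norm(A)
print(err < 1e-8)  # True
```

The appeal for data science is cost: only matrix-vector products with A are needed, so the method scales to matrices far too large for a dense SVD.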
Modeling polypharmacy side effects with graph convolutional networks
The use of drug combinations, termed polypharmacy, is common to treat
patients with complex diseases and co-existing conditions. However, a major
consequence of polypharmacy is a much higher risk of adverse side effects for
the patient. Polypharmacy side effects emerge because of drug-drug
interactions, in which activity of one drug may change if taken with another
drug. The knowledge of drug interactions is limited because these complex
relationships are rare, and are usually not observed in relatively small
clinical testing. Discovering polypharmacy side effects thus remains an
important challenge with significant implications for patient mortality. Here,
we present Decagon, an approach for modeling polypharmacy side effects. The
approach constructs a multimodal graph of protein-protein interactions,
drug-protein target interactions, and the polypharmacy side effects, which are
represented as drug-drug interactions, where each side effect is an edge of a
different type. Decagon is developed specifically to handle such multimodal
graphs with a large number of edge types. Our approach develops a new graph
convolutional neural network for multirelational link prediction in multimodal
networks. Decagon predicts the exact side effect, if any, through which a given
drug combination manifests clinically. Decagon accurately predicts polypharmacy
side effects, outperforming baselines by up to 69%. We find that it
automatically learns representations of side effects indicative of
co-occurrence of polypharmacy in patients. Furthermore, Decagon performs
particularly well on side effects with a strong molecular basis, while on
predominantly non-molecular side effects it achieves good performance through
effective sharing of model parameters across edge types. Decagon creates
opportunities to use large pharmacogenomic and patient data to flag and
prioritize side effects for follow-up analysis.
Comment: Presented at ISMB 201
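The multirelational link prediction step can be pictured as scoring each drug pair once per edge type (side effect). Decagon's actual decoder is a tensor factorization; the DistMult-style scorer below is a simplified stand-in, with invented embedding sizes and random parameters, purely for illustration.

```python
import numpy as np

def edge_score(z_i, z_j, d_r):
    """Probability of an edge of type r between i and j: sigmoid(z_i^T diag(d_r) z_j)."""
    return 1.0 / (1.0 + np.exp(-np.sum(z_i * d_r * z_j)))

rng = np.random.default_rng(0)
dim, n_side_effects = 16, 3
z = rng.standard_normal((2, dim))               # drug embeddings (e.g. from a GCN encoder)
D = rng.standard_normal((n_side_effects, dim))  # one relation vector per side-effect type
scores = [edge_score(z[0], z[1], D[r]) for r in range(n_side_effects)]
print([0.0 < s < 1.0 for s in scores])  # all True: each score is a probability
```

Because each edge type has only a small relation-specific parameter vector while the embeddings are shared, rare side effects can still be predicted — the parameter sharing credited in the abstract.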
Exploring Hidden Dimensions in Parallelizing Convolutional Neural Networks
The past few years have witnessed growth in the computational requirements
for training deep convolutional neural networks. Current approaches parallelize
training onto multiple devices by applying a single parallelization strategy
(e.g., data or model parallelism) to all layers in a network. Although easy to
reason about, these approaches result in suboptimal runtime performance in
large-scale distributed training, since different layers in a network may
prefer different parallelization strategies. In this paper, we propose
layer-wise parallelism that allows each layer in a network to use an individual
parallelization strategy. We jointly optimize how each layer is parallelized by
solving a graph search problem. Our evaluation shows that layer-wise
parallelism outperforms state-of-the-art approaches by increasing training
throughput, reducing communication costs, and achieving better scalability to
multiple GPUs, all while maintaining the original network accuracy.
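For a chain-structured network, the joint optimization over per-layer strategies reduces to a shortest-path style dynamic program over (layer, strategy) states. The cost numbers below are made up and the formulation is a simplification of the paper's general graph search, shown only to convey the idea.

```python
def best_strategies(compute_cost, switch_cost):
    """compute_cost[l][s]: time of layer l under strategy s;
    switch_cost[p][s]: cost of re-sharding activations from strategy p to s."""
    n, S = len(compute_cost), len(compute_cost[0])
    best, back = list(compute_cost[0]), []
    for l in range(1, n):
        prev, cur, choices = best, [], []
        for s in range(S):
            cands = [prev[p] + switch_cost[p][s] for p in range(S)]
            p = min(range(S), key=cands.__getitem__)
            cur.append(cands[p] + compute_cost[l][s])
            choices.append(p)
        best = cur
        back.append(choices)
    s = min(range(S), key=best.__getitem__)
    total, plan = best[s], [s]
    for choices in reversed(back):   # backtrack the optimal strategy per layer
        s = choices[s]
        plan.append(s)
    plan.reverse()
    return plan, total

compute = [[1, 3], [5, 2], [1, 4]]  # strategy 0 = data parallel, 1 = model parallel
switch = [[0, 2], [2, 0]]
plan, total = best_strategies(compute, switch)
print(plan, total)  # [0, 0, 0] 7 — staying data-parallel beats paying the switch cost
```

The interesting trade-off is visible in the toy numbers: layer 1 alone prefers model parallelism, but the communication cost of switching twice outweighs the saving.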
Telugu OCR Framework using Deep Learning
In this paper, we address the task of Optical Character Recognition (OCR) for
the Telugu script. We present an end-to-end framework that segments the text
image, classifies the characters and extracts lines using a language model. The
segmentation is based on mathematical morphology. The classification module,
which is the most challenging task of the three, is a deep convolutional neural
network. The language is modelled as a third-order Markov chain at the glyph
level. Telugu script is a complex alphasyllabary and the language is
agglutinative, making the problem hard. In this paper we apply the latest
advances in neural networks to achieve state-of-the-art error rates. We also
review convolutional neural networks in great detail and expound the
statistical justification behind the many tricks needed to make Deep Learning
work.
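A third-order Markov chain at the glyph level means each glyph is predicted from the previous three. The toy "glyph" strings below are placeholders, not Telugu data, and the unsmoothed counting model is a minimal sketch of the idea, not the paper's language model.

```python
from collections import defaultdict

class ThirdOrderGlyphModel:
    """Unsmoothed third-order Markov chain: P(glyph | previous 3 glyphs)."""
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def train(self, seq):
        padded = ["<s>"] * 3 + list(seq)       # pad so early glyphs have a history
        for i in range(3, len(padded)):
            self.counts[tuple(padded[i - 3:i])][padded[i]] += 1

    def prob(self, history, glyph):
        ctx = self.counts[tuple(history)]
        total = sum(ctx.values())
        return ctx[glyph] / total if total else 0.0

lm = ThirdOrderGlyphModel()
lm.train("abcabcabd")
print(lm.prob(["c", "a", "b"], "d"))  # after "cab": "c" once, "d" once -> 0.5
```

In the OCR pipeline such a model rescores the classifier's candidate glyphs so that line extraction prefers sequences that are plausible in the language.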
Deep Learning and Quantum Entanglement: Fundamental Connections with Implications to Network Design
Deep convolutional networks have witnessed unprecedented success in various
machine learning applications. Formal understanding on what makes these
networks so successful is gradually unfolding, but for the most part there are
still significant mysteries to unravel. The inductive bias, which reflects
prior knowledge embedded in the network architecture, is one of them. In this
work, we establish a fundamental connection between the fields of quantum
physics and deep learning. We use this connection for asserting novel
theoretical observations regarding the role that the number of channels in each
layer of the convolutional network fulfills in the overall inductive bias.
Specifically, we show an equivalence between the function realized by a deep
convolutional arithmetic circuit (ConvAC) and a quantum many-body wave
function, which relies on their common underlying tensorial structure. This
facilitates the use of quantum entanglement measures as well-defined
quantifiers of a deep network's expressive ability to model intricate
correlation structures of its inputs. Most importantly, the construction of a
deep ConvAC in terms of a Tensor Network is made available. This description
enables us to carry a graph-theoretic analysis of a convolutional network, with
which we demonstrate a direct control over the inductive bias of the deep
network via its channel numbers, which are related to the min-cut in the
underlying graph. This result is relevant to any practitioner designing a
network for a specific task. We theoretically analyze ConvACs, and empirically
validate our findings on more common ConvNets which involve ReLU activations
and max pooling. Beyond the results described above, the description of a deep
convolutional network in well-defined graph-theoretic tools and the formal
connection to quantum entanglement, are two interdisciplinary bridges that are
brought forth by this work.
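The entanglement measures mentioned above reduce, computationally, to singular values of a matricized coefficient tensor: reshape the tensor with one group of indices as rows, and the entropy of the normalized squared singular values quantifies correlation across that partition. This generic illustration is an assumption-laden sketch of the measure itself, not of ConvACs.

```python
import numpy as np

def entanglement_entropy(T, left_axes):
    """Entropy of the squared singular spectrum of T matricized along `left_axes`."""
    axes = list(left_axes) + [a for a in range(T.ndim) if a not in left_axes]
    rows = int(np.prod([T.shape[a] for a in left_axes]))
    M = np.transpose(T, axes).reshape(rows, -1)
    s = np.linalg.svd(M, compute_uv=False)
    p = (s / np.linalg.norm(s)) ** 2   # normalized squared singular values
    p = p[p > 1e-12]
    return float(-(p * np.log(p)).sum())

# A product (rank-1) tensor has zero entanglement across any partition...
a, b = np.array([1.0, 2.0]), np.array([3.0, 4.0])
T = np.tensordot(a, b, axes=0)
print(round(entanglement_entropy(T, [0]), 4))   # 0.0

# ...while "maximally entangled" coefficients give the maximal entropy log 2.
T2 = np.eye(2) / np.sqrt(2)
print(round(entanglement_entropy(T2, [0]), 4))  # 0.6931
```

A function whose coefficient tensor has high entropy across a partition of the input cannot be represented without enough "bandwidth" (channels) across the corresponding cut, which is the link to min-cuts the abstract describes.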
Deep Learning Approach on Information Diffusion in Heterogeneous Networks
Many real-world knowledge-based networked systems with multi-type interacting
entities, such as human connections and biological evolutions, can be regarded
as heterogeneous networks. One of the main issues in such
networks is to predict information diffusion such as shape, growth and size of
social events and evolutions in the future. While there exist a variety of
works on this topic mainly using a threshold-based approach, they suffer from
the local viewpoint on the network and sensitivity to the threshold parameters.
In this paper, information diffusion is considered through a latent
representation learning of the heterogeneous networks to encode in a deep
learning model. To this end, we propose a novel meta-path representation
learning approach, Heterogeneous Deep Diffusion (HDD), to exploit meta-paths as
main entities in networks. First, the functional heterogeneous structures of
the network are learned as a continuous latent representation by traversing
meta-paths, providing a global end-to-end viewpoint. Then, the
well-known deep learning architectures are employed on our generated features
to predict diffusion processes in the network. The proposed approach enables us
to apply it on different information diffusion tasks such as topic diffusion
and cascade prediction. We demonstrate the proposed approach on benchmark
network datasets through the well-known evaluation measures. The experimental
results show that our approach outperforms the earlier state-of-the-art
methods.
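Traversing a meta-path means walking the heterogeneous graph while constraining each hop to the next node type in a schema such as author-paper-author. The toy graph, node types, and function below are invented for illustration; the abstract's method builds its latent representation on walks of this kind.

```python
import random

def metapath_walk(graph, node_types, start, metapath, rng):
    """Walk from `start`, each hop restricted to the next type in `metapath`."""
    walk = [start]
    for t in metapath[1:]:
        nbrs = [n for n in graph[walk[-1]] if node_types[n] == t]
        if not nbrs:
            break                   # dead end: no neighbor of the required type
        walk.append(rng.choice(nbrs))
    return walk

# Tiny heterogeneous graph: authors a1, a2 connected through paper p1.
graph = {"a1": ["p1"], "a2": ["p1"], "p1": ["a1", "a2"]}
node_types = {"a1": "A", "a2": "A", "p1": "P"}
rng = random.Random(0)
walk = metapath_walk(graph, node_types, "a1", ["A", "P", "A"], rng)
print(walk)  # every hop respects the A-P-A schema, e.g. ['a1', 'p1', 'a2']
```

Feeding many such type-constrained walks into a sequence encoder yields embeddings that respect the network's semantics rather than its raw topology alone.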
TensorFlow: A system for large-scale machine learning
TensorFlow is a machine learning system that operates at large scale and in
heterogeneous environments. TensorFlow uses dataflow graphs to represent
computation, shared state, and the operations that mutate that state. It maps
the nodes of a dataflow graph across many machines in a cluster, and within a
machine across multiple computational devices, including multicore CPUs,
general-purpose GPUs, and custom designed ASICs known as Tensor Processing
Units (TPUs). This architecture gives flexibility to the application developer:
whereas in previous "parameter server" designs the management of shared state
is built into the system, TensorFlow enables developers to experiment with
novel optimizations and training algorithms. TensorFlow supports a variety of
applications, with particularly strong support for training and inference on
deep neural networks. Several Google services use TensorFlow in production, we
have released it as an open-source project, and it has become widely used for
machine learning research. In this paper, we describe the TensorFlow dataflow
model in contrast to existing systems, and demonstrate the compelling
performance that TensorFlow achieves for several real-world applications.
Comment: 18 pages, 9 figures; v2 has a spelling correction in the metadat
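The dataflow model above can be illustrated independently of the TensorFlow API: computation is a graph whose nodes are operations and whose edges name the values flowing between them, and running the graph means evaluating only what the fetched outputs depend on. This toy evaluator in plain Python is a conceptual sketch, not TensorFlow code.

```python
def run(graph, fetches):
    """Evaluate the requested outputs of a dataflow graph of (op, input-names) nodes."""
    cache = {}
    def eval_node(name):
        if name not in cache:                 # each node computed at most once
            op, inputs = graph[name]
            cache[name] = op(*(eval_node(i) for i in inputs))
        return cache[name]
    return [eval_node(f) for f in fetches]

graph = {
    "x":   (lambda: 3.0, []),
    "y":   (lambda: 4.0, []),
    "mul": (lambda a, b: a * b, ["x", "y"]),    # mul = x * y
    "add": (lambda m, b: m + b, ["mul", "y"]),  # add = mul + y
}
print(run(graph, ["add"]))  # [16.0]
```

Because dependencies are explicit, a real system like TensorFlow can partition such a graph across devices and machines and run independent subgraphs in parallel, which is the flexibility the abstract emphasizes.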