Graph Attention Auto-Encoders
Auto-encoders have emerged as a successful framework for unsupervised
learning. However, conventional auto-encoders are incapable of utilizing
explicit relations in structured data. To take advantage of relations in
graph-structured data, several graph auto-encoders have recently been proposed,
but they neglect to reconstruct either the graph structure or node attributes.
In this paper, we present the graph attention auto-encoder (GATE), a neural
network architecture for unsupervised representation learning on
graph-structured data. Our architecture is able to reconstruct graph-structured
inputs, including both node attributes and the graph structure, through stacked
encoder/decoder layers equipped with self-attention mechanisms. In the encoder,
by considering node attributes as initial node representations, each layer
generates new representations of nodes by attending over their neighbors'
representations. In the decoder, we attempt to reverse the encoding process to
reconstruct node attributes. Moreover, node representations are regularized to
reconstruct the graph structure. Our proposed architecture does not need to
know the graph structure upfront, and thus it can be applied to inductive
learning. Our experiments demonstrate competitive performance on several node
classification benchmark datasets for transductive and inductive tasks, even
exceeding the performance of supervised learning baselines in most cases.
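To make the encoder's per-layer computation concrete, here is a minimal NumPy sketch of one graph-attention encoder layer in the spirit described above: each node's new representation is an attention-weighted combination of its neighbors' transformed representations. The additive scoring form and the weight names are illustrative assumptions, not GATE's exact parameterization.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def graph_attention_layer(H, adj, W, a_self, a_neigh):
    """One encoder layer: attend over each node's neighbors (self-loops
    included in adj) and mix their transformed representations."""
    Z = H @ W                                   # project node features
    H_new = np.zeros_like(Z)
    for i in range(len(Z)):
        neigh = np.flatnonzero(adj[i])          # neighbor indices of node i
        logits = np.array([np.tanh(Z[i] @ a_self + Z[j] @ a_neigh)
                           for j in neigh])     # additive attention scores
        H_new[i] = softmax(logits) @ Z[neigh]   # weighted neighbor sum
    return np.maximum(H_new, 0.0)               # ReLU

# toy usage: 4 nodes, 3 input features, 2 hidden features
rng = np.random.default_rng(0)
adj = np.array([[1, 1, 0, 0],
                [1, 1, 1, 0],
                [0, 1, 1, 1],
                [0, 0, 1, 1]])                  # symmetric, with self-loops
H = rng.normal(size=(4, 3))
out = graph_attention_layer(H, adj, rng.normal(size=(3, 2)),
                            rng.normal(size=2), rng.normal(size=2))
```

A decoder layer would mirror this computation to reconstruct node attributes, and an inner product of node representations can be regularized to reconstruct the adjacency, per the description above.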
ImageGCN: Multi-Relational Image Graph Convolutional Networks for Disease Identification with Chest X-rays
Image representation is a fundamental task in computer vision. However, most
of the existing approaches for image representation ignore the relations
between images and consider each input image independently. Intuitively,
relations between images can help to understand the images and maintain model
consistency over related images. In this paper, we consider modeling the
image-level relations to generate more informative image representations, and
propose ImageGCN, an end-to-end graph convolutional network framework for
multi-relational image modeling. We also apply ImageGCN to chest X-ray (CXR)
images where rich relational information is available for disease
identification. Unlike previous image representation models, ImageGCN learns
the representation of an image using both its original pixel features and the
features of related images. Besides learning informative representations for
images, ImageGCN can also be used for object detection in a weakly supervised
manner. Experimental results on the ChestX-ray14 dataset demonstrate that
ImageGCN outperforms the respective baselines in both disease identification
and localization, and achieves comparable and often better results than
state-of-the-art methods.
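As a rough sketch of what a multi-relational graph convolution over images can look like (an assumption in the style of relational GCNs, not ImageGCN's published code): each image mixes its own projected features with features aggregated over each relation's row-normalized adjacency.

```python
import numpy as np

def multi_relational_gcn_layer(H, adjs, Ws, W_self):
    """Combine an image's own features with per-relation neighbor
    aggregates: H' = ReLU(H @ W_self + sum_r norm(A_r) @ H @ W_r)."""
    out = H @ W_self
    for A, W_r in zip(adjs, Ws):
        deg = A.sum(axis=1, keepdims=True)
        deg[deg == 0] = 1.0                     # avoid divide-by-zero
        out += (A / deg) @ H @ W_r              # relation-specific pass
    return np.maximum(out, 0.0)

# toy usage: 5 images, 8-dim features, 2 relation types
rng = np.random.default_rng(0)
H = rng.normal(size=(5, 8))
adjs = [rng.integers(0, 2, size=(5, 5)).astype(float) for _ in range(2)]
Ws = [rng.normal(size=(8, 4)) for _ in range(2)]
out = multi_relational_gcn_layer(H, adjs, Ws, rng.normal(size=(8, 4)))
```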
Table understanding in structured documents
Table detection and extraction have been studied in the context of
documents such as reports, where tables are clearly outlined and stand out
from the document structure visually. We study this topic in the considerably
more challenging domain of layout-heavy business documents, particularly
invoices. Invoices pose novel challenges: tables often lack outlines, whether
in the form of borders or surrounding text flow, and have ragged columns and
widely varying data content. We also show that a single model can extract
specific information from structurally different tables and table-like
structures. We present a comprehensive representation of a page using a graph
over word boxes, positional embeddings, and trainable textual features, and we
rephrase table detection as a text-box labeling problem. Using this
representation, we work with our newly presented dataset of pro forma
invoices, invoices, and debit note documents, and we propose multiple
baselines to solve this labeling problem. We then propose a novel neural
network model that achieves strong, practical results on the presented
dataset, and we analyze model performance and the effects of graph
convolutions and self-attention in detail.
Comment: 6 pages, 2 figures. In review for the ICDAR 2019 WML workshop.
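As an illustration of the page representation described above, the sketch below builds a k-nearest-neighbor graph over word boxes and derives simple positional features per box. It is one plausible reading of "a graph over word boxes", not the paper's exact construction.

```python
import numpy as np

def word_box_graph(boxes, k=4):
    """Connect each word box to its k spatially nearest boxes (by center
    distance) and emit simple positional features per box."""
    boxes = np.asarray(boxes, dtype=float)      # rows: (x0, y0, x1, y1)
    centers = np.stack([(boxes[:, 0] + boxes[:, 2]) / 2,
                        (boxes[:, 1] + boxes[:, 3]) / 2], axis=1)
    n = len(boxes)
    adj = np.zeros((n, n), dtype=int)
    for i in range(n):
        dists = np.linalg.norm(centers - centers[i], axis=1)
        for j in np.argsort(dists)[1:k + 1]:    # skip self at rank 0
            adj[i, j] = adj[j, i] = 1
    # positional features: top-left corner plus width and height
    pos = np.stack([boxes[:, 0], boxes[:, 1],
                    boxes[:, 2] - boxes[:, 0],
                    boxes[:, 3] - boxes[:, 1]], axis=1)
    return adj, pos

# toy usage: three word boxes on one line, one below
adj, pos = word_box_graph([(0, 0, 10, 5), (12, 0, 22, 5),
                           (24, 0, 30, 5), (0, 8, 10, 13)], k=2)
```

Table detection then reduces to per-node labeling over this graph, e.g. with the graph convolutions and self-attention the abstract analyzes.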
A Hierarchical Structured Self-Attentive Model for Extractive Document Summarization (HSSAS)
Recent advances in neural network architectures and training algorithms
have shown the effectiveness of representation learning. Neural network-based
models generate better representations than traditional ones and are able to
automatically learn distributed representations for sentences and documents.
To this end, we propose a novel model that addresses several issues not
adequately handled by previously proposed models, such as the memory problem
and incorporating knowledge of document structure. Our model uses a
hierarchical structured self-attention mechanism to create sentence and
document embeddings. This architecture mirrors the hierarchical structure of
the document and in turn enables us to obtain better feature representations.
The attention mechanism also provides an extra source of information to guide
the summary extraction. The model treats summarization as a classification
problem, computing each sentence's probability of membership in the summary.
Its predictions can be decomposed into contributions from several features,
such as information content, salience, novelty, and positional representation.
The proposed model was evaluated on two well-known datasets, CNN/Daily Mail
and DUC 2002, and the experimental results show that it outperforms the
current extractive state of the art by a considerable margin.
Comment: 8 pages, 4 figures, 2 tables. IEEE Access, 2018.
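To convey the hierarchy concretely, here is a minimal sketch of hierarchical self-attentive pooling, assuming a single attention vector per level and a bilinear sentence-versus-document score (the paper's structured mechanism is richer): word vectors are pooled into sentence embeddings, sentences into a document embedding, and each sentence is then scored for summary membership.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attentive_pool(X, w):
    """Self-attentive pooling: weight the rows of X by softmax(X @ w)."""
    return softmax(X @ w) @ X

def sentence_scores(sentences, w_word, w_sent, W_score):
    """Pool words -> sentence embeddings -> document embedding, then score
    each sentence's probability of belonging to the summary."""
    S = np.stack([attentive_pool(words, w_word) for words in sentences])
    d = attentive_pool(S, w_sent)               # document embedding
    logits = S @ W_score @ d                    # bilinear sentence-vs-doc
    return 1.0 / (1.0 + np.exp(-logits))        # sigmoid membership probs

# toy usage: two sentences with 3 and 2 word vectors of dimension 4
rng = np.random.default_rng(0)
sentences = [rng.normal(size=(3, 4)), rng.normal(size=(2, 4))]
probs = sentence_scores(sentences, rng.normal(size=4),
                        rng.normal(size=4), rng.normal(size=(4, 4)))
```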
GrAMME: Semi-Supervised Learning using Multi-layered Graph Attention Models
Modern data analysis pipelines are becoming increasingly complex due to the
presence of multi-view information sources. While graphs are effective in
modeling complex relationships, in many scenarios a single graph is rarely
sufficient to succinctly represent all interactions, and hence multi-layered
graphs have become popular. Though this leads to richer representations,
extending solutions from the single-graph case is not straightforward.
Consequently, there is a strong need for novel solutions to solve classical
problems, such as node classification, in the multi-layered case. In this
paper, we consider the problem of semi-supervised learning with multi-layered
graphs. Though deep network embeddings, e.g. DeepWalk, are widely adopted for
community discovery, we argue that feature learning with random node
attributes, using graph neural networks, can be more effective. To this end, we
propose to use attention models for effective feature learning, and develop two
novel architectures, GrAMME-SG and GrAMME-Fusion, that exploit the inter-layer
dependencies for building multi-layered graph embeddings. Using empirical
studies on several benchmark datasets, we evaluate the proposed approaches and
demonstrate significant performance improvements in comparison to
state-of-the-art network embedding strategies. The results also show that using
simple random features is an effective choice, even in cases where explicit
node attributes are not available.
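A hedged sketch of the fusion idea: given per-layer embeddings of one node (computed by any per-layer attention model), attend over the graph layers to fuse them, with random features standing in for missing node attributes as argued above. The names are illustrative; neither GrAMME variant is reproduced exactly.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def fuse_layer_embeddings(per_layer, w_fuse):
    """Attend over a node's embeddings from each graph layer and return
    the attention-weighted fusion."""
    Z = np.stack(per_layer)                     # (num_layers, d)
    return softmax(Z @ w_fuse) @ Z              # fused (d,) embedding

# toy usage: 3 graph layers; random features stand in for attributes
rng = np.random.default_rng(0)
per_layer = [rng.normal(size=8) for _ in range(3)]
fused = fuse_layer_embeddings(per_layer, rng.normal(size=8))
```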
Relational Deep Reinforcement Learning
We introduce an approach for deep reinforcement learning (RL) that improves
upon the efficiency, generalization capacity, and interpretability of
conventional approaches through structured perception and relational reasoning.
It uses self-attention to iteratively reason about the relations between
entities in a scene and to guide a model-free policy. Our results show that in
a novel navigation and planning task called Box-World, our agent finds
interpretable solutions that improve upon baselines in terms of sample
complexity, ability to generalize to more complex scenes than experienced
during training, and overall performance. In the StarCraft II Learning
Environment, our agent achieves state-of-the-art performance on six mini-games
-- surpassing human grandmaster performance on four. By considering
architectural inductive biases, our work opens new directions for overcoming
important, but stubborn, challenges in deep RL.
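The relational core can be pictured as iterated self-attention over entity vectors extracted from the scene. The sketch below shows one such round with scaled dot-product attention, an assumption consistent with the abstract; the agent also interleaves residual connections and MLPs, omitted here.

```python
import numpy as np

def softmax_rows(X):
    e = np.exp(X - X.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def entity_self_attention(E, Wq, Wk, Wv):
    """One relational reasoning step: every entity attends over every
    other entity with scaled dot-product attention."""
    Q, K, V = E @ Wq, E @ Wk, E @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])
    return softmax_rows(scores) @ V

# toy usage: 6 scene entities with 16-dim features
rng = np.random.default_rng(0)
E = rng.normal(size=(6, 16))
out = entity_self_attention(E, *(rng.normal(size=(16, 16)) for _ in range(3)))
```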
Relational inductive biases, deep learning, and graph networks
Artificial intelligence (AI) has undergone a renaissance recently, making
major progress in key domains such as vision, language, control, and
decision-making. This has been due, in part, to cheap data and cheap compute
resources, which have fit the natural strengths of deep learning. However, many
defining characteristics of human intelligence, which developed under much
different pressures, remain out of reach for current approaches. In particular,
generalizing beyond one's experiences--a hallmark of human intelligence from
infancy--remains a formidable challenge for modern AI.
The following is part position paper, part review, and part unification. We
argue that combinatorial generalization must be a top priority for AI to
achieve human-like abilities, and that structured representations and
computations are key to realizing this objective. Just as biology uses nature
and nurture cooperatively, we reject the false choice between
"hand-engineering" and "end-to-end" learning, and instead advocate for an
approach which benefits from their complementary strengths. We explore how
using relational inductive biases within deep learning architectures can
facilitate learning about entities, relations, and rules for composing them. We
present a new building block for the AI toolkit with a strong relational
inductive bias--the graph network--which generalizes and extends various
approaches for neural networks that operate on graphs, and provides a
straightforward interface for manipulating structured knowledge and producing
structured behaviors. We discuss how graph networks can support relational
reasoning and combinatorial generalization, laying the foundation for more
sophisticated, interpretable, and flexible patterns of reasoning. As a
companion to this paper, we have released an open-source software library for
building graph networks, with demonstrations of how to use them in practice.
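The graph network block itself is simple to state in code. Below is a minimal NumPy sketch of one full GN pass with sum aggregation, following the edge-update, node-update, global-update order the paper defines; the toy updaters are arbitrary fixed linear maps standing in for learned networks.

```python
import numpy as np

def gn_block(V, E, senders, receivers, u, phi_e, phi_v, phi_u):
    """One GN block: update edges from (edge, sender, receiver, global),
    aggregate incoming edges per node, update nodes, update the global."""
    E_new = np.stack([phi_e(E[k], V[senders[k]], V[receivers[k]], u)
                      for k in range(len(E))])
    V_new = np.stack([phi_v(E_new[receivers == i].sum(axis=0), V[i], u)
                      for i in range(len(V))])
    u_new = phi_u(E_new.sum(axis=0), V_new.sum(axis=0), u)
    return V_new, E_new, u_new

# toy updaters: concatenate inputs, project with fixed random weights
rng = np.random.default_rng(0)
W_e, W_v, W_u = (rng.normal(size=(8, 2)), rng.normal(size=(6, 2)),
                 rng.normal(size=(6, 2)))
phi_e = lambda e, vs, vr, g: np.concatenate([e, vs, vr, g]) @ W_e
phi_v = lambda e_agg, v, g: np.concatenate([e_agg, v, g]) @ W_v
phi_u = lambda e_agg, v_agg, g: np.concatenate([e_agg, v_agg, g]) @ W_u

V, E, u = rng.normal(size=(3, 2)), rng.normal(size=(2, 2)), rng.normal(size=2)
senders, receivers = np.array([0, 1]), np.array([1, 2])
V_new, E_new, u_new = gn_block(V, E, senders, receivers, u, phi_e, phi_v, phi_u)
```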
Leveraging Graph to Improve Abstractive Multi-Document Summarization
Graphs that capture relations between textual units have great benefits for
detecting salient information from multiple documents and generating overall
coherent summaries. In this paper, we develop a neural abstractive
multi-document summarization (MDS) model which can leverage well-known graph
representations of documents such as similarity graph and discourse graph, to
more effectively process multiple input documents and produce abstractive
summaries. Our model utilizes graphs to encode documents in order to capture
cross-document relations, which is crucial to summarizing long documents. Our
model can also take advantage of graphs to guide the summary generation
process, which is beneficial for generating coherent and concise summaries.
Furthermore, pre-trained language models can easily be combined with our
model, which further improves summarization performance significantly.
Empirical results on the WikiSum and MultiNews datasets show that the
proposed architecture brings substantial improvements over several strong
baselines.
Comment: Accepted by ACL 2020.
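One simple way an encoder can consume such a graph, sketched under assumptions (the paper's exact formulation may differ): bias the attention logits between textual units by their edge weights in the similarity or discourse graph, so related paragraphs attend to each other more strongly.

```python
import numpy as np

def softmax_rows(X):
    e = np.exp(X - X.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def graph_biased_attention(Q, K, V, G, lam=1.0):
    """Attention over textual units whose logits are shifted by the
    (log-)weights of a pairwise relation graph G."""
    scores = Q @ K.T / np.sqrt(K.shape[1]) + lam * np.log(G + 1e-9)
    return softmax_rows(scores) @ V

# toy usage: 4 paragraph vectors and a symmetric similarity graph
rng = np.random.default_rng(0)
Q = K = V = rng.normal(size=(4, 8))
G = np.abs(rng.normal(size=(4, 4)))
G = (G + G.T) / 2                               # symmetric edge weights
out = graph_biased_attention(Q, K, V, G)
```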
An Introductory Survey on Attention Mechanisms in NLP Problems
First derived from human intuition and later adapted to machine translation
for automatic token alignment, the attention mechanism, a simple method for
encoding sequence data based on an importance score assigned to each element,
has been widely applied to, and has attained significant improvements in,
various natural language processing tasks, including sentiment
classification, text summarization, question answering, and dependency
parsing. In this paper, we survey recent works and give an introductory
summary of the attention mechanism in different NLP problems, aiming to
provide our readers with basic knowledge of this widely used method, discuss
its different variants for different tasks, explore its association with
other techniques in machine learning, and examine methods for evaluating its
performance.
Comment: 9 pages.
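The basic recipe the survey organizes its variants around fits in a few lines; the sketch below uses dot-product scoring, one of several scoring functions such surveys discuss.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(query, keys, values):
    """Score each element against the query, normalize the scores into
    weights, and return the weighted sum of values."""
    weights = softmax(keys @ query)             # importance score per element
    return weights @ values, weights

# toy usage: a 5-element sequence with 4-dim keys and values
rng = np.random.default_rng(0)
keys, values = rng.normal(size=(5, 4)), rng.normal(size=(5, 4))
context, weights = attend(rng.normal(size=4), keys, values)
```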
Text Generation from Knowledge Graphs with Graph Transformers
Generating texts which express complex ideas spanning multiple sentences
requires a structured representation of their content (document plan), but
these representations are prohibitively expensive to manually produce. In this
work, we address the problem of generating coherent multi-sentence texts from
the output of an information extraction system, and in particular a knowledge
graph. Graphical knowledge representations are ubiquitous in computing, but
pose a significant challenge for text generation techniques due to their
non-hierarchical nature, collapsing of long-distance dependencies, and
structural variety. We introduce a novel graph transforming encoder which can
leverage the relational structure of such knowledge graphs without imposing
linearization or hierarchical constraints. Incorporated into an encoder-decoder
setup, we provide an end-to-end trainable system for graph-to-text generation
that we apply to the domain of scientific text. Automatic and human evaluations
show that our technique produces more informative texts which exhibit better
document structure than competitive encoder-decoder methods.
Comment: Accepted as a long paper at NAACL 2019.
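The encoder's key move can be sketched as self-attention restricted to graph neighbors: each vertex attends only over its adjacent vertices, so relational structure is used directly instead of linearizing the graph. This is a sketch of the mechanism, not the paper's exact architecture.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def neighbor_attention(H, adj, Wq, Wk, Wv):
    """Each vertex attends only over its graph neighbors (self-loops
    included in adj) with scaled dot-product attention."""
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    out = np.zeros_like(V)
    for i in range(len(H)):
        neigh = np.flatnonzero(adj[i])          # adjacent vertices of i
        alpha = softmax(Q[i] @ K[neigh].T / np.sqrt(K.shape[1]))
        out[i] = alpha @ V[neigh]
    return out

# toy usage: a 4-vertex knowledge graph with 8-dim vertex embeddings
rng = np.random.default_rng(0)
adj = np.array([[1, 1, 0, 1],
                [1, 1, 1, 0],
                [0, 1, 1, 0],
                [1, 0, 0, 1]])
H = rng.normal(size=(4, 8))
out = neighbor_attention(H, adj, *(rng.normal(size=(8, 8)) for _ in range(3)))
```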