Context-Aware Self-Attention Networks
Self-attention models have shown their flexibility in parallel computation and
their effectiveness in modeling both long- and short-term dependencies. However,
they calculate the dependencies between representations without considering
contextual information, which has proven useful for modeling dependencies
among neural representations in various natural language tasks. In this work,
we focus on improving self-attention networks by capturing the richness of
context. To maintain the simplicity and flexibility of self-attention
networks, we propose to contextualize the transformations of the query and key
layers, which are used to calculate the relevance between elements.
Specifically, we leverage the internal representations that embed both global
and deep contexts, thus avoiding reliance on external resources. Experimental
results on the WMT14 English-German and WMT17 Chinese-English translation tasks
demonstrate the effectiveness and universality of the proposed methods.
Furthermore, we conducted extensive analyses to quantify how the context
vectors participate in the self-attention model.
Comment: AAAI 201
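As a rough illustration of the idea, the query and key transformations can be blended with a global context vector before the usual scaled dot-product attention. The sketch below is a minimal NumPy version, assuming the context is simply the mean of the input representations; the fixed gate `lam` and all weight names are illustrative, not the paper's exact design.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def context_aware_attention(X, Wq, Wk, Wv, Wc_q, Wc_k, lam=0.5):
    """Self-attention whose query/key transformations are mixed with a
    global context vector (here: the mean of X). The gate `lam` and the
    weight names are illustrative stand-ins, not the paper's design."""
    c = X.mean(axis=0, keepdims=True)            # global context, shape (1, d)
    Q = (1 - lam) * (X @ Wq) + lam * (c @ Wc_q)  # contextualized queries
    K = (1 - lam) * (X @ Wk) + lam * (c @ Wc_k)  # contextualized keys
    V = X @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])      # scaled dot-product
    return softmax(scores) @ V                   # shape (n, d)
```

With `lam = 0` this reduces to standard self-attention, which makes the context term easy to ablate.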
Dual Attention Networks for Visual Reference Resolution in Visual Dialog
Visual dialog (VisDial) is a task that requires an AI agent to answer a
series of questions grounded in an image. Unlike in visual question answering
(VQA), the series of questions should capture a temporal context
from a dialog history and exploit visually grounded information. A problem
called visual reference resolution involves these challenges, requiring the
agent to resolve ambiguous references in a given question and find those
references in a given image. In this paper, we propose Dual Attention Networks
(DAN) for visual reference resolution. DAN consists of two kinds of attention
networks, REFER and FIND. Specifically, the REFER module learns latent
relationships between a given question and a dialog history by employing a
self-attention mechanism. The FIND module takes image features and reference-aware
representations (i.e., the output of the REFER module) as input, and performs
visual grounding via a bottom-up attention mechanism. We qualitatively and
quantitatively evaluate our model on the VisDial v1.0 and v0.9 datasets, showing
that DAN outperforms the previous state-of-the-art model by a significant
margin.
Comment: EMNLP 201
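The two-stage flow described above can be sketched in a few lines of NumPy: a REFER-like step attends from the question over dialog-history vectors, and a FIND-like step attends over image-region features with the resulting reference-aware representation. The dot-product scoring and residual fusion are simplifications, assumed for illustration rather than taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def refer(question, history):
    """Attend from the question vector over dialog-history vectors and
    fuse the attended context back in (residual fusion, illustrative)."""
    scores = history @ question / np.sqrt(len(question))  # (m,)
    alpha = softmax(scores)
    return question + alpha @ history  # reference-aware representation (d,)

def find(image_regions, ref_repr):
    """Ground the reference-aware representation on detected-region
    features (bottom-up-attention style pooling over regions)."""
    scores = image_regions @ ref_repr / np.sqrt(len(ref_repr))  # (k,)
    beta = softmax(scores)
    return beta @ image_regions  # attended visual feature (d,)
```

Because `beta` is a softmax distribution, the FIND output is a convex combination of region features, i.e., it always stays inside the span of the detected regions.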
SR-GNN: Spatial Relation-aware Graph Neural Network for Fine-Grained Image Categorization
Over the past few years, significant progress has been made in image
recognition based on deep convolutional neural networks (CNNs). This is mainly
due to the strong ability of such networks to mine discriminative object pose
and part information from texture and shape. This is often inadequate for
fine-grained visual classification (FGVC), which exhibits high intra-class
and low inter-class variance due to occlusions, deformation, illumination,
etc. Thus, an expressive feature representation describing global structural
information is key to characterizing an object/scene. To this end, we propose
a method that effectively captures subtle changes by aggregating context-aware
features from the most relevant image regions and their importance in
discriminating fine-grained categories, avoiding bounding-box and/or
distinguishable-part annotations. Our approach is inspired by recent
advances in self-attention and graph neural network (GNN) approaches: it
includes a simple yet effective relation-aware feature transformation and its
refinement using a context-aware attention mechanism to boost the
discriminability of the transformed features in an end-to-end learning process.
Our model is evaluated on eight benchmark datasets consisting of fine-grained
objects and human-object interactions. It outperforms state-of-the-art
approaches by a significant margin in recognition accuracy.
Comment: Accepted manuscript - IEEE Transactions on Image Processing
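A generic way to realize "relation-aware transformation plus attention refinement" over image regions is a graph-attention pass: transform each region feature, score each edge with a learned attention, and aggregate neighbors. The NumPy sketch below is a GAT-style stand-in under that reading, not SR-GNN's exact formulation; the weight shapes and the additive scoring vector `a` are assumptions.

```python
import numpy as np

def region_graph_attention(H, A, W, a):
    """One graph-attention pass over region features: transform node
    features with W, score each edge (i, j) with an additive attention
    vector a, mask non-edges, and aggregate neighbors. A generic
    GAT-style illustration of relation-aware feature refinement."""
    n = H.shape[0]
    Z = H @ W  # relation-aware transform, shape (n, h)
    scores = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            scores[i, j] = np.tanh(np.concatenate([Z[i], Z[j]]) @ a)
    scores = np.where(A > 0, scores, -1e9)  # attend only along graph edges
    e = np.exp(scores - scores.max(axis=1, keepdims=True))
    alpha = e / e.sum(axis=1, keepdims=True)  # per-node edge weights
    return alpha @ Z  # refined region features, shape (n, h)
```

In practice the region graph would include self-loops so every node keeps a share of its own feature.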
Bi-directional block self-attention for fast and memory-efficient sequence modeling
© Learning Representations, ICLR 2018 - Conference Track Proceedings. All rights reserved. Recurrent neural networks (RNNs), convolutional neural networks (CNNs) and self-attention networks (SANs) are commonly used to produce context-aware representations. RNNs can capture long-range dependency but are hard to parallelize and not time-efficient. CNNs focus on local dependency but do not perform well on some tasks. SANs can model both kinds of dependency via highly parallelizable computation, but their memory requirement grows rapidly with sequence length. In this paper, we propose a model, called "bi-directional block self-attention network (Bi-BloSAN)", for RNN/CNN-free sequence encoding. It requires as little memory as an RNN but has all the merits of a SAN. Bi-BloSAN splits the entire sequence into blocks, and applies an intra-block SAN to each block to model local context, then applies an inter-block SAN to the outputs of all blocks to capture long-range dependency. Thus, each SAN only needs to process a short sequence, and only a small amount of memory is required. Additionally, we use feature-level attention to handle the variation of contexts around the same word, and use forward/backward masks to encode temporal order information. On nine benchmark datasets for different NLP tasks, Bi-BloSAN achieves or improves upon state-of-the-art accuracy, and shows a better efficiency-memory trade-off than existing RNN/CNN/SAN models.
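The intra-block/inter-block decomposition is what yields the memory saving: with block size b on a length-n sequence, each attention map is at most b×b or (n/b)×(n/b) instead of n×n. A minimal NumPy sketch of that decomposition follows; the mean-pooled block summaries and the residual broadcast of global context are simplifications (the paper uses learned fusion and directional masks, omitted here).

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attn(X):
    """Plain scaled dot-product self-attention over X, shape (m, d)."""
    scores = X @ X.T / np.sqrt(X.shape[-1])
    return softmax(scores) @ X

def block_self_attention(X, block_size):
    """Intra-block SAN on each block, then inter-block SAN over one
    summary vector per block (mean-pooled here for simplicity). Each
    attention map is O(b^2) or O((n/b)^2) rather than O(n^2)."""
    n, _ = X.shape
    blocks = [X[i:i + block_size] for i in range(0, n, block_size)]
    local = [self_attn(b) for b in blocks]                  # local context
    summaries = np.stack([b.mean(axis=0) for b in local])   # one vector/block
    global_ctx = self_attn(summaries)                       # long-range context
    # broadcast each block's global context back onto its tokens
    return np.concatenate(
        [local[i] + global_ctx[i] for i in range(len(local))]
    )  # shape (n, d)
```

The last block may be shorter than `block_size`; the sketch handles that case, so sequence length need not be a multiple of the block size.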
Situation aware intrusion recovery policy in WSNs
Wireless Sensor Networks (WSNs) have been gaining tremendous research attention in the last few years, as they support a broad range of applications in the context of the Internet of Things. WSN-driven applications greatly depend on the sensors' observations to support decision-making and respond accordingly to reported critical events. In case of compromise, it is vital to recover compromised WSN services and continue to operate as expected. To achieve an effective restoration of compromised WSN services, sensors should be equipped with the logic to take recovery decisions and self-heal. Self-healing is challenging, as sensors should be aware of a variety of aspects in order to take effective decisions and maximize the recovery benefits. So far, situation awareness has not been actively investigated in an intrusion recovery context. This research work formulates situation-aware intrusion recovery policy design guidelines in order to drive the design of new intrusion recovery solutions that are operated by an adaptable policy. An adaptable intrusion recovery policy is presented, taking into consideration the proposed design guidelines. The evaluation results demonstrate that the proposed policy can address advanced attack strategies and aid the sensors in recovering the network's operation under different attack situations and intrusion recovery requirements.
Self-Attention Network for Text Representation Learning
University of Technology Sydney. Faculty of Engineering and Information Technology. This research studies the effectiveness and efficiency of self-attention mechanisms for text representation learning in the deep learning-based natural language processing literature. We focus on developing novel self-attention networks to capture the semantic and syntactic knowledge underlying natural language texts, thus benefiting a wide range of downstream natural language understanding tasks; we then improve the networks with external relational, structured, factoid, and commonsense information from knowledge graphs.
In the last decade, recurrent neural networks and convolutional neural networks have been widely used to produce context-aware representations for natural language text: the former can capture long-range dependency but are hard to parallelize and not time-efficient; the latter focus on local dependency but do not perform well on some tasks. Attention mechanisms, especially self-attention mechanisms, have recently attracted tremendous interest from both academia and industry due to their lightweight structures, parallelizable computation, and outstanding performance on a broad spectrum of natural language processing tasks. We first propose a novel attention mechanism in which the attention between elements from the input sequence(s) is directional and multi-dimensional (i.e., feature-wise). Compared to previous work, the proposed attention mechanism is able to capture subtle differences in context and thus alleviate the ambiguity or polysemy problem. Based solely on the proposed attention, we present a lightweight neural model, the directional self-attention network, to learn both token- and sentence-level context-aware representations with high efficiency and competitive performance. Furthermore, we improve the proposed network along several directions: first, we extend the self-attention to a hierarchical structure to capture local and global dependencies for memory efficiency; second, we introduce hard attention into the self-attention mechanism for the mutual benefit of soft and hard attention; third, we capture both pairwise and global dependencies via a novel compatibility function composed of dot-product and additive attention.
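"Directional and multi-dimensional" means two things concretely: each token attends only forward (or only backward) in the sequence, and the attention weight is a vector over features rather than a single scalar, so different feature dimensions can attend to different tokens. A minimal NumPy sketch under that reading follows; the additive `tanh` scoring is a simplified stand-in for the learned compatibility function, and the forward mask here lets each token attend to itself and earlier tokens.

```python
import numpy as np

def directional_featurewise_attention(X, direction="forward"):
    """Feature-wise (multi-dimensional) self-attention with a directional
    mask: forward = attend to self and earlier tokens, backward = self
    and later tokens. The attention weight for each token pair is a
    d-dimensional vector, so each feature has its own distribution."""
    n, d = X.shape
    # pairwise feature-wise scores, shape (n, n, d): additive stand-in
    scores = np.tanh(X[:, None, :] + X[None, :, :])
    idx = np.arange(n)
    if direction == "forward":
        mask = idx[None, :] <= idx[:, None]   # j <= i
    else:
        mask = idx[None, :] >= idx[:, None]   # j >= i
    scores = np.where(mask[:, :, None], scores, -1e9)
    # feature-wise softmax over the attended-token axis
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w = w / w.sum(axis=1, keepdims=True)
    return (w * X[None, :, :]).sum(axis=1)    # shape (n, d)
```

Note that under the forward mask the first token can only attend to itself, so its output equals its input, which gives a handy sanity check.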
This research then conducts extensive experiments on benchmark tasks to verify the effectiveness of the proposed self-attention networks from both quantitative and qualitative perspectives. The benchmark tasks, including natural language inference, sentiment analysis, and semantic role labeling, comprehensively assess a model's capability of capturing both the semantic and syntactic information underlying natural language texts. The empirical results show that the proposed models achieve state-of-the-art performance on a wide range of natural language understanding tasks, and are as fast and as memory-efficient as convolutional models.
Lastly, although self-attention networks, even those initialized from a pre-trained language model, learn powerful contextualized representations and achieve state-of-the-art performance, open questions remain about what these models have learned and what improvements can be made along several directions. One such direction is when downstream task performance depends on relational knowledge, the kind stored in knowledge graphs. Therefore, we explore combining self-attention networks with human-curated knowledge graphs, because such knowledge can improve a self-attention network either by conducting symbolic reasoning over knowledge graphs to derive targeted results or by embedding the relational information into neural networks to boost representation learning. We study several potential approaches in three knowledge graph-related scenarios in the natural language processing literature, i.e., knowledge-based question answering, knowledge base completion, and commonsense reasoning. Experiments conducted on knowledge graph-related benchmarks show the effectiveness of our proposed models.