444,124 research outputs found

    Context-Aware Self-Attention Networks

    Full text link
    Self-attention models have shown flexibility in parallel computation and effectiveness in modeling both long- and short-term dependencies. However, they calculate the dependencies between representations without considering contextual information, which has proven useful for modeling dependencies among neural representations in various natural language tasks. In this work, we focus on improving self-attention networks by capturing the richness of context. To maintain the simplicity and flexibility of self-attention networks, we propose to contextualize the transformations of the query and key layers, which are used to calculate the relevance between elements. Specifically, we leverage internal representations that embed both global and deep contexts, thus avoiding reliance on external resources. Experimental results on the WMT14 English-German and WMT17 Chinese-English translation tasks demonstrate the effectiveness and universality of the proposed methods. Furthermore, we conducted extensive analyses to quantify how the context vectors participate in the self-attention model. Comment: AAAI 201
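The idea of contextualizing the query/key transformations can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's exact parameterization: the global context is taken as the mean of the input representations, and the mixing weight `lam` and the extra projections `Uq`, `Uk` are assumptions for the sketch.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def context_aware_attention(X, Wq, Wk, Wv, Uq, Uk, lam=0.5):
    """Self-attention whose query/key transforms are mixed with a global
    context vector (here: the mean of the inputs). Illustrative only."""
    c = X.mean(axis=0, keepdims=True)        # global context, shape (1, d)
    Q = X @ Wq + lam * (c @ Uq)              # contextualized queries
    K = X @ Wk + lam * (c @ Uk)              # contextualized keys
    V = X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # scaled dot-product relevance
    return softmax(scores, axis=-1) @ V

rng = np.random.default_rng(0)
n, d = 5, 8
X = rng.standard_normal((n, d))
W = [rng.standard_normal((d, d)) * 0.1 for _ in range(5)]
out = context_aware_attention(X, *W)
print(out.shape)  # (5, 8)
```

Because the context term is shared across all positions, it shifts every query and key by the same amount, biasing the relevance scores toward globally salient directions without changing the attention's per-pair structure.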

    Dual Attention Networks for Visual Reference Resolution in Visual Dialog

    Full text link
    Visual dialog (VisDial) is a task which requires an AI agent to answer a series of questions grounded in an image. Unlike in visual question answering (VQA), the series of questions should capture a temporal context from a dialog history and exploit visually grounded information. A problem called visual reference resolution involves these challenges, requiring the agent to resolve ambiguous references in a given question and find those references in a given image. In this paper, we propose Dual Attention Networks (DAN) for visual reference resolution. DAN consists of two kinds of attention networks, REFER and FIND. Specifically, the REFER module learns latent relationships between a given question and a dialog history by employing a self-attention mechanism. The FIND module takes image features and reference-aware representations (i.e., the output of the REFER module) as input, and performs visual grounding via a bottom-up attention mechanism. We qualitatively and quantitatively evaluate our model on the VisDial v1.0 and v0.9 datasets, showing that DAN outperforms the previous state-of-the-art model by a significant margin. Comment: EMNLP 201
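The two-stage REFER/FIND pipeline can be sketched with single-head dot-product attention. This is a hedged simplification: the paper uses multi-head self-attention and learned projections, whereas here the history and region embeddings are plain vectors and the attention heads are collapsed to one.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def refer(question, history):
    """REFER-style step: attend from the question over dialog-history
    embeddings to build a reference-aware question representation."""
    scores = history @ question / np.sqrt(question.size)  # (turns,)
    ctx = softmax(scores) @ history                       # history context
    return question + ctx                                 # reference-aware query

def find(ref_q, regions):
    """FIND-style step: ground the reference-aware query on bottom-up
    image-region features and pool the attended regions."""
    scores = regions @ ref_q / np.sqrt(ref_q.size)
    return softmax(scores) @ regions                      # attended visual feature

rng = np.random.default_rng(2)
d = 16
q = rng.standard_normal(d)
hist = rng.standard_normal((4, d))   # 4 previous dialog turns
regs = rng.standard_normal((36, d))  # 36 bottom-up region features
v = find(refer(q, hist), regs)
print(v.shape)  # (16,)
```

The key design point carried over from the abstract is the ordering: references are first resolved against the dialog history, and only the resulting reference-aware representation is used for visual grounding.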

    SR-GNN: Spatial Relation-aware Graph Neural Network for Fine-Grained Image Categorization

    Full text link
    Over the past few years, significant progress has been made in deep convolutional neural network (CNN)-based image recognition. This is mainly due to the strong ability of such networks to mine discriminative object pose and part information from texture and shape. This, however, is often insufficient for fine-grained visual classification (FGVC), which exhibits high intra-class and low inter-class variance due to occlusion, deformation, illumination, etc. Thus, an expressive feature representation describing global structural information is key to characterizing an object/scene. To this end, we propose a method that effectively captures subtle changes by aggregating context-aware features from the most relevant image regions and their importance in discriminating fine-grained categories, while avoiding bounding-box and/or distinguishable-part annotations. Our approach is inspired by recent advances in self-attention and graph neural network (GNN) approaches, and includes a simple yet effective relation-aware feature transformation and its refinement using a context-aware attention mechanism to boost the discriminability of the transformed features in an end-to-end learning process. Our model is evaluated on eight benchmark datasets consisting of fine-grained objects and human-object interactions. It outperforms state-of-the-art approaches by a significant margin in recognition accuracy. Comment: Accepted manuscript - IEEE Transactions on Image Processing

    Bi-directional block self-attention for fast and memory-efficient sequence modeling

    Full text link
    © Learning Representations, ICLR 2018 - Conference Track Proceedings. All rights reserved. Recurrent neural networks (RNNs), convolutional neural networks (CNNs) and self-attention networks (SANs) are commonly used to produce context-aware representations. RNNs can capture long-range dependency but are hard to parallelize and not time-efficient. CNNs focus on local dependency but do not perform well on some tasks. SANs can model both kinds of dependency via highly parallelizable computation, but their memory requirement grows rapidly with sequence length. In this paper, we propose a model, called "bi-directional block self-attention network (Bi-BloSAN)", for RNN/CNN-free sequence encoding. It requires as little memory as an RNN but has all the merits of a SAN. Bi-BloSAN splits the entire sequence into blocks, applies an intra-block SAN to each block to model local context, then applies an inter-block SAN to the outputs of all blocks to capture long-range dependency. Thus, each SAN only needs to process a short sequence, and only a small amount of memory is required. Additionally, we use feature-level attention to handle the variation of contexts around the same word, and forward/backward masks to encode temporal order information. On nine benchmark datasets for different NLP tasks, Bi-BloSAN achieves or improves upon state-of-the-art accuracy, and shows a better efficiency-memory trade-off than existing RNN/CNN/SAN models
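The block-splitting scheme that gives Bi-BloSAN its memory savings can be sketched as follows. This is a minimal sketch under simplifying assumptions: blocks are summarized by mean pooling rather than the paper's source2token attention, only one direction is shown (no forward/backward masks), and the attention is plain dot-product.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attn(X):
    # plain scaled dot-product self-attention over the rows of X
    s = X @ X.T / np.sqrt(X.shape[-1])
    return softmax(s) @ X

def block_self_attention(X, block):
    """Two-level attention in the spirit of Bi-BloSAN: intra-block
    attention models local context; inter-block attention over block
    summaries captures long-range dependency."""
    n, d = X.shape
    pad = (-n) % block
    Xp = np.pad(X, ((0, pad), (0, 0)))                # pad to a block multiple
    blocks = Xp.reshape(-1, block, d)
    local = np.stack([self_attn(b) for b in blocks])  # intra-block SAN
    summaries = local.mean(axis=1)                    # one vector per block
    global_ctx = self_attn(summaries)                 # inter-block SAN
    out = local + global_ctx[:, None, :]              # fuse global into each block
    return out.reshape(-1, d)[:n]

X = np.random.default_rng(1).standard_normal((10, 4))
print(block_self_attention(X, block=4).shape)  # (10, 4)
```

The memory argument is visible in the shapes: instead of one n x n attention matrix, the model only materializes several block x block matrices plus one (n/block) x (n/block) matrix over the summaries.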

    Situation aware intrusion recovery policy in WSNs

    Get PDF
    Wireless Sensor Networks (WSNs) have been gaining tremendous research attention over the last few years as they support a broad range of applications in the context of the Internet of Things. WSN-driven applications greatly depend on the sensors' observations to support decision-making and respond accordingly to reported critical events. In case of compromise, it is vital to recover compromised WSN services and continue to operate as expected. To achieve an effective restoration of compromised WSN services, sensors should be equipped with the logic to take recovery decisions and self-heal. Self-healing is challenging, as sensors should be aware of a variety of aspects in order to take effective decisions and maximize the recovery benefits. So far, situation awareness has not been actively investigated in an intrusion recovery context. This research work formulates situation-aware intrusion recovery policy design guidelines in order to drive the design of new intrusion recovery solutions that are operated by an adaptable policy. An adaptable intrusion recovery policy is presented, taking into consideration the proposed design guidelines. The evaluation results demonstrate that the proposed policy can address advanced attack strategies and aid the sensors in recovering the network's operation under different attack situations and intrusion recovery requirements

    Self-Attention Network for Text Representation Learning

    Full text link
    University of Technology Sydney. Faculty of Engineering and Information Technology. This research studies the effectiveness and efficiency of self-attention mechanisms for text representation learning in the deep learning-based natural language processing literature. We focus on developing novel self-attention networks to capture the semantic and syntactic knowledge underlying natural language texts, thus benefiting a wide range of downstream natural language understanding tasks, and then on improving the networks with external relational, structured, factoid, and commonsense information from knowledge graphs. In the last decade, recurrent neural networks and convolutional neural networks have been widely used to produce context-aware representations of natural language text: the former can capture long-range dependency but is hard to parallelize and not time-efficient; the latter focuses on local dependency but does not perform well on some tasks. Attention mechanisms, especially self-attention mechanisms, have recently attracted tremendous interest from both academia and industry, due to their light-weight structures, parallelizable computation, and outstanding performance on a broad spectrum of natural language processing tasks. We first propose a novel attention mechanism in which the attention between elements from the input sequence(s) is directional and multi-dimensional (i.e., feature-wise). Compared to previous work, the proposed attention mechanism is able to capture subtle differences in context and thus alleviate the ambiguity or polysemy problem. Based solely on the proposed attention, we present a light-weight neural model, the directional self-attention network, to learn both token- and sentence-level context-aware representations with high efficiency and competitive performance.
Furthermore, we improve the proposed network along several directions: first, we extend the self-attention to a hierarchical structure that captures local and global dependencies for memory efficiency; second, we introduce hard attention to the self-attention mechanism for the mutual benefit of soft and hard attention; third, we capture both pairwise and global dependencies through a novel compatibility function composed of dot-product and additive attention. This research then conducts extensive experiments on benchmark tasks to verify the effectiveness of the proposed self-attention networks from both quantitative and qualitative perspectives. The benchmark tasks, including natural language inference, sentiment analysis, and semantic role labeling, comprehensively estimate the models' capability of capturing both the semantic and syntactic information underlying natural language texts. The empirical results show that the proposed models achieve state-of-the-art performance on a wide range of natural language understanding tasks, and are as fast and as memory-efficient as convolutional models. Lastly, although self-attention networks, even those initialized from a pre-trained language model, learn powerful contextualized representations and achieve state-of-the-art performance, open questions remain about what these models have learned and what improvements can be made along several directions. One such direction concerns downstream tasks whose performance depends on relational knowledge - the kind stored in knowledge graphs. Therefore, we explore combining self-attention networks with human-curated knowledge graphs, because such knowledge can improve a self-attention network either by conducting symbolic reasoning over knowledge graphs to derive targeted results or by embedding relational information into neural networks to boost representation learning.
We study several potential approaches in three knowledge graph-related scenarios in the natural language processing literature, i.e., knowledge-based question answering, knowledge base completion and commonsense reasoning. Experiments conducted on knowledge graph-related benchmarks show the effectiveness of our proposed models.
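The "directional and multi-dimensional (feature-wise)" attention described in this abstract can be sketched as follows. The additive tanh scoring and the exact masking convention here are illustrative assumptions, not the thesis's precise parameterization: each token pair gets a vector of scores (one per feature), and a forward or backward triangular mask encodes temporal order.

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def directional_multidim_attention(X, forward=True):
    """Directional, feature-wise self-attention sketch: scores have
    shape (n, n, d), so each feature dimension gets its own attention
    distribution; the triangular mask restricts which positions each
    token may attend to."""
    n, d = X.shape
    scores = np.tanh(X[:, None, :] + X[None, :, :])          # (n, n, d) feature-wise scores
    allow = np.tril(np.ones((n, n))) if forward else np.triu(np.ones((n, n)))
    scores = np.where(allow[:, :, None] > 0, scores, -1e9)   # directional mask
    weights = softmax(scores, axis=1)                        # normalize over attended positions
    return (weights * X[None, :, :]).sum(axis=1)             # (n, d)

X = np.random.default_rng(3).standard_normal((6, 5))
fwd = directional_multidim_attention(X, forward=True)
bwd = directional_multidim_attention(X, forward=False)
print(fwd.shape)  # (6, 5)
```

Note that with the forward mask the first token can only attend to itself, so its output equals its input; running both directions and fusing the results is what makes the encoding "bi-directional" without recurrence.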