91,890 research outputs found

    An Attentive Survey of Attention Models

    Attention Model has now become an important concept in neural networks that has been researched within diverse application domains. This survey provides a structured and comprehensive overview of the developments in modeling attention. In particular, we propose a taxonomy which groups existing techniques into coherent categories. We review salient neural architectures in which attention has been incorporated, and discuss applications in which modeling attention has shown a significant impact. We also describe how attention has been used to improve the interpretability of neural networks. Finally, we discuss some future research directions in attention. We hope this survey will provide a succinct introduction to attention models and guide practitioners while developing approaches for their applications. Comment: accepted to Transactions on Intelligent Systems and Technology (TIST); 33 pages

    A survey on Information Visualization in light of Vision and Cognitive sciences

    Information Visualization techniques are built on a context with many factors related to both vision and cognition, making it difficult to draw a clear picture of how data visually turns into comprehension. In the intent of promoting a better picture, here, we survey concepts on vision, cognition, and Information Visualization organized in a theorization named Visual Expression Process. Our theorization organizes the basis of visualization techniques with a reduced level of complexity; still, it is complete enough to foster discussions related to design and analytical tasks. Our work introduces the following contributions: (1) a Theoretical compilation of vision, cognition, and Information Visualization; (2) Discussions supported by vast literature; and (3) Reflections on visual-cognitive aspects concerning use and design. We expect our contributions will provide further clarification about how users and designers think about InfoVis, leveraging the potential of systems and techniques. Comment: 29 pages, Elsevier Journal preprint

    A Hierarchical Attention Model for Social Contextual Image Recommendation

    Image-based social networks are among the most popular social networking services in recent years. With a tremendous number of images uploaded every day, understanding users' preferences on user-generated images and making recommendations have become an urgent need. In fact, many hybrid models have been proposed to fuse various kinds of side information (e.g., image visual representation, social network) and user-item historical behavior for enhancing recommendation performance. However, due to the unique characteristics of the user-generated images in social image platforms, the previous studies failed to capture the complex aspects that influence users' preferences in a unified framework. Moreover, most of these hybrid models relied on predefined weights in combining different kinds of information, which usually resulted in sub-optimal recommendation performance. To this end, in this paper, we develop a hierarchical attention model for social contextual image recommendation. In addition to basic latent user interest modeling in the popular matrix factorization based recommendation, we identify three key aspects (i.e., upload history, social influence, and owner admiration) that affect each user's latent preferences, where each aspect summarizes a contextual factor from the complex relationships between users and images. After that, we design a hierarchical attention network that naturally mirrors the hierarchical relationship (the element level within each aspect, and the aspect level) of users' latent interests with the identified key aspects. Specifically, by taking embeddings from state-of-the-art deep learning models that are tailored for each kind of data, the hierarchical attention network could learn to attend differently to more or less content. Finally, extensive experimental results on real-world datasets clearly show the superiority of our proposed model.

    Text-based Question Answering from Information Retrieval and Deep Neural Network Perspectives: A Survey

    Text-based Question Answering (QA) is a challenging task which aims at finding short concrete answers for users' questions. This line of research has been widely studied with information retrieval techniques and has received increasing attention in recent years by considering deep neural network approaches. Deep learning approaches, which are the main focus of this paper, provide a powerful technique to learn multiple layers of representations and interaction between questions and texts. In this paper, we provide a comprehensive overview of different models proposed for the QA task, including both the traditional information retrieval perspective and the more recent deep neural network perspective. We also introduce well-known datasets for the task and present available results from the literature to compare different techniques.

    Improving Noise Robustness In Speaker Identification Using A Two-Stage Attention Model

    While the use of deep neural networks has significantly boosted speaker recognition performance, it is still challenging to separate speakers in poor acoustic environments. To improve the robustness of speaker recognition system performance in noise, a novel two-stage attention mechanism which can be used in existing architectures such as Time Delay Neural Networks (TDNNs) and Convolutional Neural Networks (CNNs) is proposed. Noise is known to often mask important information in both the time and frequency domains. The proposed mechanism allows the models to concentrate on reliable time/frequency components of the signal. The proposed approach is evaluated using the Voxceleb1 dataset, which aims at assessment of speaker recognition in real-world situations. In addition, three types of noise at different signal-to-noise ratios (SNRs) were added for this work. The proposed mechanism is compared with three strong baselines: X-vectors, Attentive X-vector, and Resnet-34. Results on both identification and verification tasks show that the two-stage attention mechanism consistently improves upon these for all noise conditions. Comment: Submitted to Interspeech202

    Attentive Recurrent Comparators

    Rapid learning requires flexible representations to quickly adapt to new evidence. We develop a novel class of models called Attentive Recurrent Comparators (ARCs) that form representations of objects by cycling through them and making observations. Using the representations extracted by ARCs, we develop a way of approximating a dynamic representation space and use it for one-shot learning. In the task of one-shot classification on the Omniglot dataset, we achieve state-of-the-art performance with an error rate of 1.5%. This represents the first super-human result achieved for this task with a generic model that uses only pixel information.

    "The Boating Store Had Its Best Sail Ever": Pronunciation-attentive Contextualized Pun Recognition

    Humor plays an important role in human languages and it is essential to model humor when building intelligent systems. Among different forms of humor, puns perform wordplay for humorous effects by employing words with double entendre and high phonetic similarity. However, identifying and modeling puns is challenging as puns usually involve implicit semantic or phonological tricks. In this paper, we propose Pronunciation-attentive Contextualized Pun Recognition (PCPR) to perceive human humor, detect if a sentence contains puns and locate them in the sentence. PCPR derives a contextualized representation for each word in a sentence by capturing the association between the surrounding context and its corresponding phonetic symbols. Extensive experiments are conducted on two benchmark datasets. Results demonstrate that the proposed approach significantly outperforms the state-of-the-art methods in pun detection and location tasks. In-depth analyses verify the effectiveness and robustness of PCPR. Comment: 10 pages, 4 figures, 7 tables, accepted by ACL 202

    Exploring the Use of Attention within an Neural Machine Translation Decoder States to Translate Idioms

    Idioms pose problems to almost all Machine Translation systems. This type of language is very frequent in day-to-day language use and cannot be simply ignored. The recent interest in memory-augmented models in the field of Language Modelling has helped systems achieve good results by bridging long-distance dependencies. In this paper we explore the use of such techniques in a Neural Machine Translation system to help in the translation of idiomatic language.

    Crowd-Robot Interaction: Crowd-aware Robot Navigation with Attention-based Deep Reinforcement Learning

    Mobility in an effective and socially-compliant manner is an essential yet challenging task for robots operating in crowded spaces. Recent works have shown the power of deep reinforcement learning techniques to learn socially cooperative policies. However, their cooperation ability deteriorates as the crowd grows since they typically relax the problem as a one-way Human-Robot interaction problem. In this work, we want to go beyond first-order Human-Robot interaction and more explicitly model Crowd-Robot Interaction (CRI). We propose to (i) rethink pairwise interactions with a self-attention mechanism, and (ii) jointly model Human-Robot as well as Human-Human interactions in the deep reinforcement learning framework. Our model captures the Human-Human interactions occurring in dense crowds that indirectly affect the robot's anticipation capability. Our proposed attentive pooling mechanism learns the collective importance of neighboring humans with respect to their future states. Various experiments demonstrate that our model can anticipate human dynamics and navigate in crowds with time efficiency, outperforming state-of-the-art methods. Comment: Accepted at ICRA2019. Copyright may be transferred without notice, after which this version may no longer be accessible.

    Point-of-Interest Recommendation: Exploiting Self-Attentive Autoencoders with Neighbor-Aware Influence

    The rapid growth of Location-based Social Networks (LBSNs) provides a great opportunity to satisfy the strong demand for personalized Point-of-Interest (POI) recommendation services. However, with the tremendous increase of users and POIs, POI recommender systems still face several challenging problems: (1) the difficulty of modeling non-linear user-POI interactions from implicit feedback; (2) the difficulty of incorporating context information such as POIs' geographical coordinates. To cope with these challenges, we propose a novel autoencoder-based model to learn the non-linear user-POI relations, namely SAE-NAD, which consists of a self-attentive encoder (SAE) and a neighbor-aware decoder (NAD). In particular, unlike previous works that treat users' checked-in POIs equally, our self-attentive encoder adaptively differentiates the user preference degrees in multiple aspects, by adopting a multi-dimensional attention mechanism. To incorporate the geographical context information, we propose a neighbor-aware decoder to make users' reachability higher on the similar and nearby neighbors of checked-in POIs, which is achieved by the inner product of POI embeddings together with the radial basis function (RBF) kernel. To evaluate the proposed model, we conduct extensive experiments on three real-world datasets with many state-of-the-art baseline methods and evaluation metrics. The experimental results demonstrate the effectiveness of our model. Comment: Accepted by the 27th ACM International Conference on Information and Knowledge Management (CIKM 2018)