
    Improved Neural Relation Detection for Knowledge Base Question Answering

    Relation detection is a core component of many NLP applications, including Knowledge Base Question Answering (KBQA). In this paper, we propose a hierarchical recurrent neural network enhanced by residual learning that detects KB relations given an input question. Our method uses deep residual bidirectional LSTMs to compare questions and relation names at different levels of abstraction. Additionally, we propose a simple KBQA system that integrates entity linking and our relation detector so that each enhances the other. Experimental results show that our approach not only achieves outstanding relation detection performance but, more importantly, helps our KBQA system achieve state-of-the-art accuracy on both single-relation (SimpleQuestions) and multi-relation (WebQSP) QA benchmarks. Comment: Accepted by ACL 2017 (updated for camera-ready version).
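    A minimal sketch of the residual BiLSTM comparison described above, assuming PyTorch; the layer sizes, the max-pooling readout, and the cosine-similarity scorer are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBiLSTMEncoder(nn.Module):
    """Two stacked BiLSTMs; a residual connection sums their hidden
    states, and max-pooling over time yields a fixed-size vector."""
    def __init__(self, emb_dim=100, hidden=128):
        super().__init__()
        self.lstm1 = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.lstm2 = nn.LSTM(2 * hidden, hidden, batch_first=True, bidirectional=True)

    def forward(self, x):                  # x: (batch, seq, emb_dim)
        h1, _ = self.lstm1(x)              # (batch, seq, 2*hidden)
        h2, _ = self.lstm2(h1)
        h = h1 + h2                        # residual combination of hierarchies
        return h.max(dim=1).values         # (batch, 2*hidden)

encoder = ResidualBiLSTMEncoder()
question = torch.randn(4, 12, 100)         # toy embedded question batch
relation = torch.randn(4, 5, 100)          # toy embedded relation names
score = F.cosine_similarity(encoder(question), encoder(relation))
print(score.shape)                         # torch.Size([4])
```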

    An Attention-Based Word-Level Interaction Model: Relation Detection for Knowledge Base Question Answering

    Relation detection plays a crucial role in Knowledge Base Question Answering (KBQA) because of the high variance of relation expression in questions. Traditional deep learning methods follow an encoding-comparing paradigm, where the question and the candidate relation are represented as vectors in order to compare their semantic similarity. The max- or average-pooling operation, which compresses the sequence of words into a fixed-dimensional vector, becomes an information bottleneck. In this paper, we propose to learn attention-based word-level interactions between questions and relations to alleviate this bottleneck. As in traditional models, the question and relation are first represented as sequences of vectors. Then, instead of merging each sequence into a single vector by pooling, soft alignments between words from the question and the relation are learned. The aligned words are subsequently compared with a convolutional neural network (CNN), and the comparison results are finally merged. By performing the comparison on low-level representations, the attention-based word-level interaction model (ABWIM) alleviates the information loss caused by merging the sequence into a fixed-dimensional vector before the comparison. Experimental results on relation detection for both the SimpleQuestions and WebQuestions datasets show that ABWIM achieves state-of-the-art accuracy, demonstrating its effectiveness. Comment: Paper submitted to Neurocomputing on 11.12.2017.
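    The encoding-comparing contrast above can be made concrete with a short sketch, assuming PyTorch; the dot-product attention, kernel size, and max-over-time merge are illustrative stand-ins for the paper's exact ABWIM layers.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WordLevelInteraction(nn.Module):
    """Soft-aligns each question word to the relation words, then compares
    the aligned pairs with a 1-D CNN instead of pooling each side first."""
    def __init__(self, dim=100, n_filters=64):
        super().__init__()
        self.cnn = nn.Conv1d(2 * dim, n_filters, kernel_size=3, padding=1)

    def forward(self, q, r):               # q: (B, Lq, d), r: (B, Lr, d)
        attn = torch.softmax(q @ r.transpose(1, 2), dim=-1)  # (B, Lq, Lr)
        aligned = attn @ r                 # relation words aligned to each q word
        pairs = torch.cat([q, aligned], dim=-1)   # word-level comparison input
        feats = F.relu(self.cnn(pairs.transpose(1, 2)))
        return feats.max(dim=-1).values    # merge comparison results last

model = WordLevelInteraction()
feats = model(torch.randn(2, 10, 100), torch.randn(2, 4, 100))
print(feats.shape)                         # torch.Size([2, 64])
```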

    Using Context Information to Enhance Simple Question Answering

    With the rapid development of knowledge bases (KBs), question answering (QA) based on KBs has become a hot research topic. In this paper, we propose two frameworks (a pipeline framework and an end-to-end framework) focused on answering single-relation factoid questions. In both frameworks, we study the effect of context information, such as an entity's notable type and out-degree, on the quality of QA. In the end-to-end framework, we combine char-level encoding and self-attention mechanisms, using weight sharing and multi-task strategies to enhance the accuracy of QA. Experimental results show that context information improves simple QA in both the pipeline and the end-to-end framework. In addition, we find that the end-to-end framework achieves accuracy competitive with state-of-the-art approaches while taking much less time. Comment: Under review at World Wide Web Journal.
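    As a rough illustration of folding such context features into candidate scoring, assuming PyTorch; the notable-type embedding size, the log-degree transform, and the linear combiner are invented for this sketch, not taken from the paper.

```python
import torch
import torch.nn as nn

class ContextAwareScorer(nn.Module):
    """Scores a candidate by combining a semantic match score with context
    features of the entity: a notable-type embedding and its out-degree."""
    def __init__(self, n_types=500, type_dim=16):
        super().__init__()
        self.type_emb = nn.Embedding(n_types, type_dim)
        self.out = nn.Linear(1 + type_dim + 1, 1)   # match + type + degree

    def forward(self, match_score, type_id, out_degree):
        feats = torch.cat([
            match_score.unsqueeze(-1),              # question-candidate match
            self.type_emb(type_id),                 # entity's notable type
            torch.log1p(out_degree).unsqueeze(-1),  # squash heavy-tailed degrees
        ], dim=-1)
        return self.out(feats).squeeze(-1)          # final candidate score

scorer = ContextAwareScorer()
score = scorer(torch.tensor([0.8]), torch.tensor([42]), torch.tensor([17.0]))
print(score.shape)                                  # torch.Size([1])
```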

    Machine Learning with World Knowledge: The Position and Survey

    Machine learning has become pervasive across domains, impacting a wide variety of applications such as knowledge discovery and data mining, natural language processing, information retrieval, computer vision, social and health informatics, and ubiquitous computing. Two essential problems of machine learning are how to generate features and how to acquire labels for machines to learn from. In particular, labeling large amounts of data for each domain-specific problem can be very time consuming and costly, and has become a key obstacle to making learning protocols realistic in applications. In this paper, we discuss how to use existing general-purpose world knowledge to enhance machine learning processes by enriching the features or reducing the labeling work. We start from a comparison of world knowledge with domain-specific knowledge, and then introduce three key problems in using world knowledge in learning processes: explicit and implicit feature representation, inference for knowledge linking and disambiguation, and learning with direct or indirect supervision. Finally, we discuss future directions for this research topic.

    A Restricted Visual Turing Test for Deep Scene and Event Understanding

    This paper presents a restricted visual Turing test (VTT) for story-line based deep understanding of long-term, multi-camera captured videos. Given a set of videos of a scene (such as a multi-room office, a garden, or a parking lot) and a sequence of story-line based queries, the task is to provide answers either simply in binary form, "true/false" (for a polar query), or as an accurate natural language description (for a non-polar query). Queries, polar or non-polar, consist of view-based queries, which can be answered from a particular camera view, and scene-centered queries, which involve joint inference across different cameras. The story lines are collected to cover spatial, temporal and causal understanding of the input videos. The data and queries distinguish our VTT from recently proposed visual question answering on images and from video captioning. A vision system is proposed to perform joint video and query parsing, integrating different vision modules, a knowledge base and a query engine. The system provides unified interfaces for the different modules so that individual modules can be reconfigured to test a new method. We provide a benchmark dataset and a toolkit for ontology-guided story-line query generation, comprising about 93.5 hours of video captured in four different locations and 3,426 queries split into 127 story lines. We also provide a baseline implementation and result analyses.
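    One hypothetical way to represent the polar/non-polar and view-based/scene-centered query distinctions as data; all field names here are invented for illustration, not taken from the released toolkit.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class StoryLineQuery:
    """One query within a story line (all field names hypothetical)."""
    text: str
    polar: bool                   # True -> expects a "true"/"false" answer
    scene_centered: bool          # True -> needs inference across cameras
    camera_view: Optional[str]    # set only for view-based queries
    answer: Optional[str] = None  # "true"/"false" or a description

q1 = StoryLineQuery("Is the red car still in the parking lot?",
                    polar=True, scene_centered=False, camera_view="cam3")
q2 = StoryLineQuery("Where did the person who left the office go?",
                    polar=False, scene_centered=True, camera_view=None)
```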

    Joint Video and Text Parsing for Understanding Events and Answering Queries

    We propose a framework for jointly parsing video and text to understand events and answer user queries. Our framework produces a parse graph that represents the compositional structures of spatial information (objects and scenes), temporal information (actions and events) and causal information (causalities between events and fluents) in the video and text. The knowledge representation of our framework is based on a spatial-temporal-causal And-Or graph (S/T/C-AOG), which jointly models possible hierarchical compositions of objects, scenes and events as well as their interactions and mutual contexts, and specifies the prior probability distribution over parse graphs. We present a probabilistic generative model for joint parsing that captures the relations between the input video/text, their corresponding parse graphs and the joint parse graph. Based on this probabilistic model, we propose a joint parsing system consisting of three modules: video parsing, text parsing and joint inference. Video parsing and text parsing produce two parse graphs from the input video and text, respectively. The joint inference module produces a joint parse graph by performing matching, deduction and revision on the video and text parse graphs. The proposed framework has the following objectives: first, we aim at deep semantic parsing of video and text that goes beyond traditional bag-of-words approaches; second, we perform parsing and reasoning across the spatial, temporal and causal dimensions based on the joint S/T/C-AOG representation; third, we show that deep joint parsing facilitates subsequent applications such as generating narrative text descriptions and answering queries in the forms of who, what, when, where and why. We empirically evaluated our system by comparing against ground truth and measuring the accuracy of query answering, and obtained satisfactory results.
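    A hypothetical data-structure sketch of an And-Or graph node along the lines described; the field names and the toy event are invented for illustration, and the paper's S/T/C-AOG is far richer.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class AOGNode:
    """Node in a toy And-Or graph: an And-node composes all of its
    children; an Or-node selects one alternative decomposition, with a
    prior probability per branch."""
    label: str
    kind: str                      # "and" | "or" | "terminal"
    dimension: str                 # "spatial" | "temporal" | "causal"
    children: List["AOGNode"] = field(default_factory=list)
    branch_priors: List[float] = field(default_factory=list)  # Or-nodes only

# Toy temporal event composed of a terminal action and a spatial choice.
event = AOGNode("person_enters_room", "and", "temporal", children=[
    AOGNode("door_opens", "terminal", "temporal"),
    AOGNode("person_location", "or", "spatial",
            children=[AOGNode("hallway", "terminal", "spatial"),
                      AOGNode("room", "terminal", "spatial")],
            branch_priors=[0.5, 0.5]),
])
```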

    An attention-based Bi-GRU-CapsNet model for hypernymy detection between compound entities

    Named entities are usually composable and extensible. Typical examples are names of symptoms and diseases in the medical domain. To distinguish these entities from general entities, we name them "compound entities". In this paper, we present an attention-based Bi-GRU-CapsNet model to detect hypernymy relationships between compound entities. Our model consists of several important components. To avoid the out-of-vocabulary problem, English words or Chinese characters in compound entities are fed into bidirectional gated recurrent units. An attention mechanism is designed to focus on the differences between the two compound entities. Since hypernymy between compound entities can arise in several different forms, a capsule network is finally employed to decide whether the hypernymy relationship exists. Experimental results demonstrate the effectiveness of the proposed model. Comment: 5 pages, 3 figures. Accepted as a short paper by the 2018 International Conference on Bioinformatics and Biomedicine.
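    A simplified sketch of the shared encoder and difference-focused attention, assuming PyTorch; the attention formulation and dimensions are illustrative guesses, and the capsule-network classifier head is omitted.

```python
import torch
import torch.nn as nn

class BiGRUDifferenceAttention(nn.Module):
    """Encodes two compound entities with a shared bidirectional GRU and
    attends to positions where their representations differ most.
    (The capsule-network head that makes the final decision is omitted.)"""
    def __init__(self, emb_dim=64, hidden=64):
        super().__init__()
        self.gru = nn.GRU(emb_dim, hidden, batch_first=True, bidirectional=True)

    def encode(self, x):
        h, _ = self.gru(x)                 # (B, L, 2*hidden)
        return h

    def forward(self, ent_a, ent_b):       # char/word embeddings per entity
        ha, hb = self.encode(ent_a), self.encode(ent_b)
        sim = ha @ hb.transpose(1, 2)      # cross-entity similarity (B, La, Lb)
        # low best-match similarity -> high attention weight on that position
        wa = torch.softmax(-sim.max(dim=2).values, dim=1)
        wb = torch.softmax(-sim.max(dim=1).values, dim=1)
        va = (wa.unsqueeze(-1) * ha).sum(dim=1)
        vb = (wb.unsqueeze(-1) * hb).sum(dim=1)
        return torch.cat([va, vb], dim=-1) # features for the classifier head

model = BiGRUDifferenceAttention()
feats = model(torch.randn(2, 8, 64), torch.randn(2, 6, 64))
print(feats.shape)                         # torch.Size([2, 256])
```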

    Answering Complex Questions by Joining Multi-Document Evidence with Quasi Knowledge Graphs

    Direct answering of questions that involve multiple entities and relations is a challenge for text-based QA. The problem is most pronounced when answers can be found only by joining evidence from multiple documents. Curated knowledge graphs (KGs) may yield good answers, but are limited by their inherent incompleteness and potential staleness. This paper presents QUEST, a method that answers complex questions directly from textual sources on the fly, by computing similarity joins over partial results from different documents. Our method is completely unsupervised, avoiding training-data bottlenecks and able to cope with rapidly evolving ad hoc topics and formulation styles in user questions. QUEST builds a noisy quasi KG with node and edge weights, consisting of dynamically retrieved entity names and relational phrases. It augments this graph with types and semantic alignments, and computes the best answers with an algorithm for Group Steiner Trees. We evaluate QUEST on benchmarks of complex questions and show that it substantially outperforms state-of-the-art baselines.
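    A toy sketch of the final answering step using the Steiner-tree approximation shipped with networkx; note that this solves the plain Steiner tree problem, whereas QUEST solves the Group Steiner Tree variant (one node per terminal group), and the example graph and weights are invented.

```python
import networkx as nx
from networkx.algorithms.approximation import steiner_tree

# Toy quasi KG: nodes are entity names and relational phrases pulled from
# documents; edge weights encode (inverse) confidence of co-occurrence.
G = nx.Graph()
G.add_weighted_edges_from([
    ("Albert Einstein", "worked at", 0.3),
    ("worked at", "Princeton", 0.4),
    ("Albert Einstein", "born in", 0.2),
    ("born in", "Ulm", 0.3),
    ("Princeton", "located in", 0.5),
    ("located in", "New Jersey", 0.4),
])

# Question terms matched to graph nodes act as terminals; the low-cost
# tree connecting them surfaces the candidate answer nodes.
terminals = ["Albert Einstein", "Princeton", "New Jersey"]
tree = steiner_tree(G, terminals, weight="weight")
print(sorted(tree.nodes))   # non-terminal tree nodes are answer candidates
```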

    Question Answering with Subgraph Embeddings

    This paper presents a system that learns to answer questions on a broad range of topics from a knowledge base using few hand-crafted features. Our model learns low-dimensional embeddings of words and knowledge base constituents; these representations are used to score natural language questions against candidate answers. Training our system with pairs of questions and structured representations of their answers, together with pairs of question paraphrases, yields competitive results on a recent benchmark of the literature.
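    A bare-bones sketch of this scoring scheme, assuming PyTorch; the vocabulary sizes, the bag-of-embeddings sums, and the dot-product scorer follow the abstract's description only loosely.

```python
import torch
import torch.nn as nn

class EmbeddingQAScorer(nn.Module):
    """Scores a question against a candidate answer by the dot product of
    their embeddings; the answer side sums the embeddings of the answer
    entity and the KB constituents in its surrounding subgraph."""
    def __init__(self, n_words=10000, n_kb=5000, dim=64):
        super().__init__()
        self.word_emb = nn.Embedding(n_words, dim)
        self.kb_emb = nn.Embedding(n_kb, dim)

    def forward(self, question_ids, answer_subgraph_ids):
        q = self.word_emb(question_ids).sum(dim=1)       # bag of question words
        a = self.kb_emb(answer_subgraph_ids).sum(dim=1)  # bag of KB constituents
        return (q * a).sum(dim=-1)                       # dot-product score

scorer = EmbeddingQAScorer()
s = scorer(torch.randint(0, 10000, (2, 7)), torch.randint(0, 5000, (2, 15)))
print(s.shape)   # torch.Size([2]): one score per question-candidate pair
```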

    Learning Visual Knowledge Memory Networks for Visual Question Answering

    Visual question answering (VQA) requires joint comprehension of images and natural language questions, where many questions cannot be directly or clearly answered from the visual content alone but require reasoning over structured human knowledge with confirmation from visual content. This paper proposes the visual knowledge memory network (VKMN) to address this issue, which seamlessly incorporates structured human knowledge and deep visual features into memory networks in an end-to-end learning framework. Compared to existing methods that leverage external knowledge to support VQA, this paper stresses two missing mechanisms. The first is a mechanism for integrating visual content with knowledge facts: VKMN handles this by embedding knowledge triples (subject, relation, target) and deep visual features jointly into visual knowledge features. The second is a mechanism for handling multiple knowledge facts expanded from question and answer pairs: VKMN stores the joint embeddings in a key-value pair structure in the memory networks so that multiple facts are easy to handle. Experiments show that the proposed method achieves promising results on both the VQA v1.0 and v2.0 benchmarks, and outperforms state-of-the-art methods on knowledge-reasoning related questions. Comment: Supplementary to CVPR 2018 version.
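    A minimal sketch of the key-value memory read at the heart of this design, assuming PyTorch; the shapes and softmax addressing are generic key-value memory mechanics, not VKMN's exact equations.

```python
import torch
import torch.nn.functional as F

def key_value_memory_read(query, keys, values):
    """One key-value memory read: address the stored facts with the query,
    then return a value vector weighted by the addressing distribution.
    Shapes: query (B, d); keys and values (B, n_facts, d)."""
    addressing = F.softmax(
        torch.bmm(keys, query.unsqueeze(-1)).squeeze(-1), dim=-1)  # (B, n_facts)
    return torch.bmm(addressing.unsqueeze(1), values).squeeze(1)   # (B, d)

# Toy example: 4 stored facts (e.g., embedded (subject, relation, target)
# triples fused with visual features) addressed by a question embedding.
q = torch.randn(2, 64)
k = torch.randn(2, 4, 64)
v = torch.randn(2, 4, 64)
print(key_value_memory_read(q, k, v).shape)    # torch.Size([2, 64])
```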