73,865 research outputs found

    Learning Multilingual Semantic Parsers for Question Answering over Linked Data. A comparison of neural and probabilistic graphical model architectures

    Get PDF
    Hakimov S. Learning Multilingual Semantic Parsers for Question Answering over Linked Data. A comparison of neural and probabilistic graphical model architectures. Bielefeld: Universität Bielefeld; 2019.The task of answering natural language questions over structured data has received wide interest in recent years. Structured data in the form of knowledge bases has been available for public usage with coverage on multiple domains. DBpedia and Freebase are such knowledge bases that include encyclopedic data about multiple domains. However, querying such knowledge bases requires an understanding of a query language and the underlying ontology, which requires domain expertise. Querying structured data via question answering systems that understand natural language has gained popularity to bridge the gap between the data and the end user. In order to understand a natural language question, a question answering system needs to map the question into query representation that can be evaluated given a knowledge base. An important aspect that we focus in this thesis is the multilinguality. While most research focused on building monolingual solutions, mainly English, this thesis focuses on building multilingual question answering systems. The main challenge for processing language input is interpreting the meaning of questions in multiple languages. In this thesis, we present three different semantic parsing approaches that learn models to map questions into meaning representations, into a query in particular, in a supervised fashion. Each approach differs in the way the model is learned, the features of the model, the way of representing the meaning and how the meaning of questions is composed. The first approach learns a joint probabilistic model for syntax and semantics simultaneously from the labeled data. The second method learns a factorized probabilistic graphical model that builds on a dependency parse of the input question and predicts the meaning representation that is converted into a query. The last approach presents a number of different neural architectures that tackle the task of question answering in end-to-end fashion. We evaluate each approach using publicly available datasets and compare them with state-of-the-art QA systems

    Visual Question Answering: A Survey of Methods and Datasets

    Full text link
    Visual Question Answering (VQA) is a challenging task that has received increasing attention from both the computer vision and the natural language processing communities. Given an image and a question in natural language, it requires reasoning over visual elements of the image and general knowledge to infer the correct answer. In the first part of this survey, we examine the state of the art by comparing modern approaches to the problem. We classify methods by their mechanism to connect the visual and textual modalities. In particular, we examine the common approach of combining convolutional and recurrent neural networks to map images and questions to a common feature space. We also discuss memory-augmented and modular architectures that interface with structured knowledge bases. In the second part of this survey, we review the datasets available for training and evaluating VQA systems. The various datatsets contain questions at different levels of complexity, which require different capabilities and types of reasoning. We examine in depth the question/answer pairs from the Visual Genome project, and evaluate the relevance of the structured annotations of images with scene graphs for VQA. Finally, we discuss promising future directions for the field, in particular the connection to structured knowledge bases and the use of natural language processing models.Comment: 25 page

    FVQA: Fact-based Visual Question Answering

    Full text link
    Visual Question Answering (VQA) has attracted a lot of attention in both Computer Vision and Natural Language Processing communities, not least because it offers insight into the relationships between two important sources of information. Current datasets, and the models built upon them, have focused on questions which are answerable by direct analysis of the question and image alone. The set of such questions that require no external information to answer is interesting, but very limited. It excludes questions which require common sense, or basic factual knowledge to answer, for example. Here we introduce FVQA, a VQA dataset which requires, and supports, much deeper reasoning. FVQA only contains questions which require external information to answer. We thus extend a conventional visual question answering dataset, which contains image-question-answerg triplets, through additional image-question-answer-supporting fact tuples. The supporting fact is represented as a structural triplet, such as . We evaluate several baseline models on the FVQA dataset, and describe a novel model which is capable of reasoning about an image on the basis of supporting facts.Comment: 16 page
    • …
    corecore