197 research outputs found

    Open-Vocabulary Semantic Parsing with both Distributional Statistics and Formal Knowledge

    Traditional semantic parsers map language onto compositional, executable queries in a fixed schema. This mapping allows them to effectively leverage the information contained in large, formal knowledge bases (KBs, e.g., Freebase) to answer questions, but it is also fundamentally limiting: these semantic parsers can only assign meaning to language that falls within the KB's manually produced schema. Recently proposed methods for open-vocabulary semantic parsing overcome this limitation by learning execution models for arbitrary language, essentially using a text corpus as a kind of knowledge base. However, all prior approaches to open-vocabulary semantic parsing replace a formal KB with textual information, making no use of the KB in their models. We show how to combine the disparate representations used by these two approaches, presenting for the first time a semantic parser that (1) produces compositional, executable representations of language, (2) can successfully leverage the information contained in both a formal KB and a large corpus, and (3) is not limited to the schema of the underlying KB. We demonstrate significantly improved performance over state-of-the-art baselines on an open-domain natural language question answering task.
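
    As a rough illustration of the idea (a minimal sketch, not the paper's model; the predicate names and scores below are hypothetical), a predicate can first be executed against the formal KB schema, falling back to a corpus-trained execution model for language outside that schema:

        # Tiny formal KB: (subject, predicate, object) triples in a fixed schema.
        KB = {
            ("palo_alto", "/location/containedby", "california"),
            ("california", "/location/containedby", "usa"),
        }

        # Corpus-derived scores for open-vocabulary predicates (hypothetical values).
        CORPUS_MODEL = {
            ("is_a_city_in", ("palo_alto", "california")): 0.92,
        }

        def denotation(predicate, pair):
            """Score an entity pair: exact lookup within the KB schema,
            otherwise fall back to the corpus-trained execution model."""
            if (pair[0], predicate, pair[1]) in KB:
                return 1.0
            return CORPUS_MODEL.get((predicate, pair), 0.0)

        print(denotation("/location/containedby", ("palo_alto", "california")))  # 1.0
        print(denotation("is_a_city_in", ("palo_alto", "california")))           # 0.92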

    Learning structured natural language representations for semantic parsing

    We introduce a neural semantic parser that converts natural language utterances to intermediate representations in the form of predicate-argument structures, which are induced with a transition system and subsequently mapped to target domains. The semantic parser is trained end-to-end using annotated logical forms or their denotations. We obtain competitive results on various datasets. The induced predicate-argument structures shed light on the types of representations useful for semantic parsing and how these differ from linguistically motivated ones.
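
    To make the transition-based induction concrete, here is a minimal sketch (the transition inventory is illustrative, not the paper's exact one) of a stack-based system that assembles a predicate-argument structure from an utterance:

        def parse(tokens, transitions):
            """Apply a transition sequence to build a nested predicate-argument form."""
            stack, buffer = [], list(tokens)
            for action in transitions:
                if action == "SHIFT":        # move the next token onto the stack
                    stack.append(buffer.pop(0))
                elif action == "REDUCE":     # top item becomes an argument of the one below
                    arg = stack.pop()
                    pred = stack.pop()
                    stack.append((pred, arg))
            return stack

        # "who directed Inception" -> nested structure, roughly directed(who, Inception)
        print(parse(["directed", "who", "Inception"],
                    ["SHIFT", "SHIFT", "REDUCE", "SHIFT", "REDUCE"]))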

    KQA Pro: A Large-Scale Dataset with Interpretable Programs and Accurate SPARQLs for Complex Question Answering over Knowledge Base

    Complex question answering over knowledge bases (Complex KBQA) is challenging because it requires various compositional reasoning capabilities, such as multi-hop inference, attribute comparison, and set operations. Existing benchmarks have shortcomings that limit the development of Complex KBQA: 1) they only provide QA pairs without explicit reasoning processes; 2) questions are either generated from templates, leading to poor diversity, or small in scale. To this end, we introduce KQA Pro, a large-scale dataset for Complex KBQA. We define a compositional and highly interpretable formal format, named Program, to represent the reasoning process of complex questions. We propose compositional strategies to generate questions, corresponding SPARQLs, and Programs with a small number of templates, and then paraphrase the generated questions into natural language questions (NLQs) by crowdsourcing, giving rise to around 120K diverse instances. SPARQL and Program depict two complementary solutions to answering complex questions, which can benefit a large spectrum of QA methods. Besides the QA task, KQA Pro can also be used for semantic parsing. As far as we know, it is currently the largest corpus of NLQ-to-SPARQL and NLQ-to-Program pairs. We conduct extensive experiments to evaluate whether machines can learn to answer our complex questions in different settings, that is, with only QA supervision or with intermediate SPARQL/Program supervision. We find that state-of-the-art KBQA methods trained on only QA pairs perform very poorly on our dataset, implying that our questions are more challenging than those in previous datasets. However, pretrained models trained on our NLQ-to-SPARQL and NLQ-to-Program annotations surprisingly achieve about 90% answering accuracy, which is close to human expert performance.
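
    To illustrate how SPARQL and Program give complementary views of one question, here is a minimal sketch (the program function names, SPARQL prefixes, and population figures below are illustrative, not KQA Pro's exact inventory):

        # Question: "Which has a larger population, France or Germany?"

        # View 1: a SPARQL query over the KB (schematic ":" prefix).
        SPARQL = """
        SELECT ?c WHERE {
          VALUES ?c { :France :Germany }
          ?c :population ?p .
        } ORDER BY DESC(?p) LIMIT 1
        """

        # View 2: a Program, i.e., an explicit step-by-step reasoning process.
        POPULATION = {"France": 68_000_000, "Germany": 84_000_000}  # toy figures

        def Find(name):                  # locate an entity in the KB
            return name

        def QueryAttr(entity, attr):     # read one attribute of an entity
            return POPULATION[entity] if attr == "population" else None

        def SelectBetween(a, b, attr, op):   # compare two entities on an attribute
            if op == "greater" and QueryAttr(a, attr) > QueryAttr(b, attr):
                return a
            return b

        print(SelectBetween(Find("France"), Find("Germany"), "population", "greater"))
        # -> Germany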

    Learning Multilingual Semantic Parsers for Question Answering over Linked Data. A comparison of neural and probabilistic graphical model architectures

    Hakimov S. Learning Multilingual Semantic Parsers for Question Answering over Linked Data. A comparison of neural and probabilistic graphical model architectures. Bielefeld: Universität Bielefeld; 2019.

    The task of answering natural language questions over structured data has received wide interest in recent years. Structured data in the form of knowledge bases is publicly available with coverage of multiple domains. DBpedia and Freebase are such knowledge bases, containing encyclopedic data about multiple domains. However, querying such knowledge bases requires an understanding of a query language and the underlying ontology, which demands domain expertise. Question answering systems that understand natural language have gained popularity as a way to bridge the gap between the data and the end user. In order to understand a natural language question, a question answering system needs to map the question to a query representation that can be evaluated against a knowledge base. An important aspect of this thesis is multilinguality: while most research has focused on building monolingual solutions, mainly for English, this thesis focuses on building multilingual question answering systems. The main challenge is interpreting the meaning of questions in multiple languages. We present three semantic parsing approaches that learn, in a supervised fashion, models that map questions to meaning representations, in particular to queries. The approaches differ in how the model is learned, the features of the model, the way the meaning is represented, and how the meaning of questions is composed. The first approach learns a joint probabilistic model of syntax and semantics simultaneously from labeled data. The second learns a factorized probabilistic graphical model that builds on a dependency parse of the input question and predicts a meaning representation that is converted into a query. The last approach comprises several neural architectures that tackle question answering in an end-to-end fashion. We evaluate each approach on publicly available datasets and compare them with state-of-the-art QA systems.
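
    As a minimal sketch of the task setting (not one of the thesis's three models; the training pairs below are hypothetical, though dbr:Germany and dbo:capital are real DBpedia identifiers), questions in different languages should map to the same executable query:

        # Hypothetical training pairs: (utterance, language) -> SPARQL meaning representation.
        TRAIN = {
            ("What is the capital of Germany?", "en"):
                "SELECT ?c WHERE { dbr:Germany dbo:capital ?c }",
            ("Was ist die Hauptstadt von Deutschland?", "de"):
                "SELECT ?c WHERE { dbr:Germany dbo:capital ?c }",
        }

        def parse(utterance, lang):
            """Stand-in for a learned multilingual parser: here a simple lookup;
            the thesis learns this mapping with probabilistic and neural models."""
            return TRAIN.get((utterance, lang), "UNKNOWN")

        # The English and German questions share one meaning representation.
        assert parse("What is the capital of Germany?", "en") == \
               parse("Was ist die Hauptstadt von Deutschland?", "de")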