
    Systematic Literature Review on Ontology-based Indonesian Question Answering System

    Question-Answering (QA) systems, at the intersection of natural language processing, information retrieval, and knowledge representation, aim to provide efficient responses to natural language queries. These systems have seen extensive development in English, whereas languages like Indonesian present unique challenges and opportunities. This literature review examines the state of ontology-based Indonesian QA systems and highlights four critical challenges. The first lies in sentence understanding, variation, and complexity. Most systems rely on syntactic analysis and struggle to grasp sentence semantics. Complex sentences, especially in Indonesian, pose difficulties in parsing, semantic interpretation, and knowledge extraction, and addressing these linguistic intricacies is pivotal for accurate responses. Secondly, template-based SPARQL query construction, commonly used in Indonesian QA systems, suffers from semantic gaps and inflexibility. Advanced techniques such as semantic matching algorithms and dynamic template generation can bridge these gaps and adapt to evolving ontologies. Thirdly, lexical gaps and ambiguity hinder QA systems: bridging vocabulary mismatches between user queries and ontology labels remains a challenge. Strategies such as synonym expansion, word embedding, and ontology enrichment must be explored further to overcome it. Lastly, the review discusses the potential of developing multi-domain ontologies to broaden the knowledge coverage of QA systems. While this presents complex linguistic and ontological challenges, it offers the advantage of answering user queries across a range of domains. The review thus identifies crucial challenges in developing ontology-based Indonesian QA systems and suggests innovative approaches to address them.
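
    To make the second and third challenges concrete, the following is a minimal sketch of template-based SPARQL construction combined with synonym expansion. It is not taken from any of the reviewed systems: the synonym table, the `ex:` namespace, the template, and the function name `build_query` are all hypothetical and chosen only for illustration.

```python
from typing import Optional

# Hypothetical synonym table: Indonesian user phrases -> ontology property name.
SYNONYMS = {
    "ibu kota": "capital",
    "ibukota": "capital",
    "jumlah penduduk": "population",
    "penduduk": "population",
}

# Hypothetical fixed template for "what is the <property> of <entity>?" questions.
TEMPLATE = (
    "PREFIX ex: <http://example.org/>\n"
    "SELECT ?answer WHERE {{ ex:{entity} ex:{prop} ?answer . }}"
)


def build_query(question: str, entity: str) -> Optional[str]:
    """Fill the template if a known phrase occurs in the question."""
    text = question.lower()
    # Try longer phrases first so "jumlah penduduk" is preferred over "penduduk".
    for phrase in sorted(SYNONYMS, key=len, reverse=True):
        if phrase in text:
            return TEMPLATE.format(entity=entity, prop=SYNONYMS[phrase])
    # No phrase matched: this is the lexical gap the review describes.
    return None


if __name__ == "__main__":
    print(build_query("Apa ibu kota Indonesia?", "Indonesia"))
```

    A question whose wording is not in the table returns `None`, which illustrates why a fixed template plus a hand-built vocabulary is inflexible and why the review points to embeddings and ontology enrichment as possible remedies.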

    Using Linguistic Analysis to Translate Arabic Natural Language Queries to SPARQL

    The logic-based, machine-understandable framework of the Semantic Web often challenges naive users when they try to query ontology-based knowledge bases. Existing research has approached this problem by introducing Natural Language (NL) interfaces to ontologies, which construct SPARQL queries from NL user queries. Most of these efforts, however, were restricted to queries expressed in English and often benefited from advances in English NLP tools, while little research has been done to support querying Arabic content on the Semantic Web with NL queries. This paper presents a domain-independent approach to translating Arabic NL queries to SPARQL by leveraging linguistic analysis. With a special focus on Noun Phrases (NPs), our approach uses a language parser to extract NPs and their relations from Arabic parse trees and match them to the underlying ontology. It then uses knowledge in the ontology to group NPs into triple-based representations. A SPARQL query is finally generated by extracting targets and modifiers and interpreting them into SPARQL. The interpretation of advanced semantic features, including negation and conjunctive and disjunctive modifiers, is also supported. The approach was evaluated on two datasets consisting of OWL test data and queries, and the results confirm its feasibility for translating Arabic NL queries to SPARQL. Comment: Journal Paper
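
    The final step described in this abstract, turning triple-based representations with a target and modifiers into SPARQL, can be sketched roughly as follows. This is an illustrative assumption, not the paper's implementation: the `Triple` class, the `to_sparql` function, the `ex:` names, and the choice of `FILTER NOT EXISTS` to express negation are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Triple:
    """One triple pattern; terms are variables ("?x") or prefixed names."""
    subject: str
    predicate: str
    obj: str
    negated: bool = False  # True when the modifier was negated in the question


def to_sparql(target: str, triples: List[Triple]) -> str:
    """Render triple patterns as a SELECT query (prefix declarations omitted)."""
    lines = []
    for t in triples:
        pattern = f"{t.subject} {t.predicate} {t.obj} ."
        if t.negated:
            # Assumption: negation is expressed with SPARQL 1.1 FILTER NOT EXISTS.
            pattern = f"FILTER NOT EXISTS {{ {pattern} }}"
        lines.append("  " + pattern)
    body = "\n".join(lines)
    return f"SELECT DISTINCT {target} WHERE {{\n{body}\n}}"


if __name__ == "__main__":
    # "Books that were not written by Author1" (hypothetical ontology terms).
    print(to_sparql("?x", [
        Triple("?x", "rdf:type", "ex:Book"),
        Triple("?x", "ex:writtenBy", "ex:Author1", negated=True),
    ]))
```

    Conjunctive modifiers fall out naturally as additional patterns in the same group; disjunctive modifiers would need `UNION` blocks, which this sketch does not cover.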

    Knowledge will Propel Machine Understanding of Content: Extrapolating from Current Examples

    Machine Learning has been a big success story during the AI resurgence. One particular standout success relates to learning from massive amounts of data. In spite of early assertions of the unreasonable effectiveness of data, there is increasing recognition of the value of using knowledge whenever it is available or can be created purposefully. In this paper, we discuss the indispensable role of knowledge for deeper understanding of content where (i) large amounts of training data are unavailable, (ii) the objects to be recognized are complex (e.g., implicit entities and highly subjective content), and (iii) applications need to use complementary or related data in multiple modalities/media. What brings us to the cusp of rapid progress is our ability to (a) create relevant and reliable knowledge and (b) carefully exploit knowledge to enhance ML/NLP techniques. Using diverse examples, we seek to foretell unprecedented progress in our ability to understand and exploit multimodal data more deeply, and the continued incorporation of knowledge in learning techniques. Comment: Pre-print of the paper accepted at 2017 IEEE/WIC/ACM International Conference on Web Intelligence (WI). arXiv admin note: substantial text overlap with arXiv:1610.0770

    Knowledge Rich Natural Language Queries over Structured Biological Databases

    Increasingly, keyword, natural language, and NoSQL queries are being used for information retrieval from traditional as well as non-traditional databases such as web, document, image, GIS, legal, and health databases. While their popularity is undeniable, their engineering is far from simple. For the most part, a semantics- and intent-preserving mapping of a well-understood natural language query, expressed over a structured database schema, to a structured query language is still a difficult task, and research to tame the complexity is intense. In this paper, we propose a multi-level knowledge-based middleware that facilitates such mappings by separating the conceptual level from the physical level. We augment these multi-level abstractions with a concept reasoner and a query strategy engine to dynamically link arbitrary natural language querying to well-defined structured queries. We demonstrate the feasibility of our approach by presenting a Datalog-based prototype system, called BioSmart, that can compute responses to arbitrary natural language queries over arbitrary databases once a syntactic classification of the natural language query is made.