23 research outputs found

    A portable natural language interface from Arabic to SQL.

    In recent years, natural language interface systems have been built on a Front End and Back End architecture, which guarantees modularity and portability for the system as a whole. An Arabic Front End has been built that takes an input sentence and produces syntactic and semantic representations, which it maps into First Order Logic. Expressing the meaning of the user's question in terms of high level world concepts makes the natural language interface independent of the database structure. It is then easier to port the interface Front End to a database for a different domain. The syntactic treatments are based on Generalised Phrase Structure Grammar (GPSG) whereas the semantics are expressed in formal semantics theory. The focus is mainly to provide syntactic and semantic analyses for Arabic queries based on correct Arabic linguistic principles. The proposed treatments are proved and tested by building a prototype system. The prototype is implemented using one of the existing systems called Squirrel. An Arabic morphological analyser is also proposed and implemented to distinguish between two types of morphemes: internal morphemes which are a part of the word's pattern, and external morphemes which are independent words attached to the word but which are not part of the word's pattern. So, the system focuses on the extraction of morphemes from the various inflexions or forms of any Arabic word.

    Ripple Down Rules for Question Answering

    Recent years have witnessed a new trend of building ontology-based question answering systems. These systems use semantic web information to produce more precise answers to users' queries. However, these systems are mostly designed for English. In this paper, we introduce an ontology-based question answering system named KbQAS which, to the best of our knowledge, is the first one made for Vietnamese. KbQAS employs our question analysis approach that systematically constructs a knowledge base of grammar rules to convert each input question into an intermediate representation element. KbQAS then takes the intermediate representation element with respect to a target ontology and applies concept-matching techniques to return an answer. On a wide range of Vietnamese questions, experimental results show that the performance of KbQAS is promising with accuracies of 84.1% and 82.4% for analyzing input questions and retrieving output answers, respectively. Furthermore, our question analysis approach can easily be applied to new domains and new languages, thus saving time and human effort.
    Comment: V1: 21 pages, 7 figures, 10 tables. V2: 8 figures, 10 tables; shorten section 2; change sections 4.3 and 5.1.2. V3: Accepted for publication in the Semantic Web journal. V4 (Author's manuscript): camera ready version, available from the Semantic Web journal at http://www.semantic-web-journal.ne

    Natural language interface to relational database: a simplified customization approach

    Natural language interfaces to databases (NLIDB) allow end-users with no knowledge of a formal language like SQL to query databases. One of the main open problems currently investigated is the development of NLIDB systems that are easily portable across several domains. The present study focuses on the development and evaluation of methods that simplify the customization of NLIDBs targeting relational databases without sacrificing coverage and accuracy. This goal is approached by the introduction of two authoring frameworks that aim to reduce the workload required to port an NLIDB to a new domain. The first authoring approach is called top-down; it assumes the existence of a corpus of unannotated natural language sample questions used to pre-harvest key lexical terms to simplify customization. The top-down approach further reduces the configuration workload by automatically including the semantics for the negative forms of verbs and the comparative and superlative forms of adjectives in the configuration model. The second authoring approach introduced is bottom-up; it explores the possibility of building a configuration model with no manual customization using the information from the database schema and an off-the-shelf dictionary. The evaluation of the prototype system with geo-query, a benchmark query corpus, has shown that the top-down approach significantly reduces the customization workload: 93% of the entries defining the meaning of verbs and adjectives, which represent the hard work, were automatically generated by the system; only 26 straightforward mappings and 3 manual definitions of meaning were required for customization. The top-down approach answered 74.5% of the questions correctly. The bottom-up approach, however, correctly answered only 1/3 of the questions due to an insufficient lexicon and missing semantics. The use of an external lexicon did not improve the system's accuracy. The bottom-up model nevertheless correctly answered 3/4 of the 105 simple retrieval questions in the query corpus that do not require nesting. Therefore, the bottom-up approach can be useful for building an initial lightweight configuration model that can be incrementally refined, for example by using the failed queries to train a top-down model. The experimental results for top-down suggest that it is indeed possible to construct a portable NLIDB that reduces the configuration effort while maintaining decent coverage and accuracy.

    An overview of computer-based natural language processing

    Computer-based Natural Language Processing (NLP) is the key to enabling humans and their computer-based creations to interact with machines in natural language (like English, Japanese, German, etc., in contrast to formal computer languages). The doors that such an achievement can open have made this a major research area in Artificial Intelligence and Computational Linguistics. Commercial natural language interfaces to computers have recently entered the market, and the future looks bright for other applications as well. This report reviews the basic approaches to such systems, the techniques utilized, applications, the state of the art of the technology, issues and research requirements, the major participants and, finally, future trends and expectations. It is anticipated that this report will prove useful to engineering and research managers, potential users, and others who will be affected by this field as it unfolds.