A portable natural language interface from Arabic to SQL.
In recent years, natural language interface systems have been built based on
the Front End and the Back End architecture which gives a guarantee of modularity
and portability to the system as a whole. An Arabic Front End has
been built that takes an input sentence and produces syntactic and semantic
representations, which it maps into First Order Logic. Expressing the meaning
of the user's question in terms of high level world concepts makes the natural
language interface independent of the database structure. It is then easier to
port the interface Front End to a database for a different domain.
The syntactic treatments are based on Generalised Phrase Structure Grammar
(GPSG) whereas the semantics are expressed in formal semantics theory. The
focus is mainly to provide syntactic and semantic analyses for Arabic queries
based on correct Arabic linguistic principles. The proposed treatments are
validated and tested by building a prototype system, which is implemented
on top of an existing system called Squirrel.
An Arabic morphological analyser is also proposed and implemented to distinguish
between two types of morphemes: internal morphemes which are a part
of the word's pattern, and external morphemes which are independent words
attached to the word but which are not part of the word's pattern. The
system thus focuses on extracting morphemes from the various inflexions or
forms of any Arabic word.
Ripple Down Rules for Question Answering
Recent years have witnessed a new trend of building ontology-based question
answering systems. These systems use semantic web information to produce more
precise answers to users' queries. However, these systems are mostly designed
for English. In this paper, we introduce an ontology-based question answering
system named KbQAS which, to the best of our knowledge, is the first one made
for Vietnamese. KbQAS employs our question analysis approach that
systematically constructs a knowledge base of grammar rules to convert each
input question into an intermediate representation element. KbQAS then takes
the intermediate representation element with respect to a target ontology and
applies concept-matching techniques to return an answer. On a wide range of
Vietnamese questions, experimental results show that the performance of KbQAS
is promising with accuracies of 84.1% and 82.4% for analyzing input questions
and retrieving output answers, respectively. Furthermore, our question analysis
approach can easily be applied to new domains and new languages, thus saving
time and human effort.
Comment: V1: 21 pages, 7 figures, 10 tables. V2: 8 figures, 10 tables; shorten
section 2; change sections 4.3 and 5.1.2. V3: Accepted for publication in the
Semantic Web journal. V4 (Author's manuscript): camera ready version,
available from the Semantic Web journal at
http://www.semantic-web-journal.ne
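The question analysis step described in the KbQAS abstract above (grammar rules that turn a question into an intermediate representation) can be illustrated with a minimal ripple-down-rules sketch. Everything in it is an assumption made for illustration: the English patterns, the `Rule` class, the `analyse` function and the dictionary layout are hypothetical and are not taken from KbQAS, which targets Vietnamese. Each rule has an exception branch, tried when the rule fires, and an alternative branch, tried when it does not; the last rule that fires along the path supplies the result.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    """One node of a (hypothetical) single-classification ripple-down rule tree."""
    pattern: re.Pattern
    build: Callable[[re.Match], dict]
    except_rule: Optional["Rule"] = None  # tried when this rule fires
    else_rule: Optional["Rule"] = None    # tried when this rule does not fire


def analyse(question: str, root: Rule) -> Optional[dict]:
    """Walk the rule tree; the last rule that fires gives the representation."""
    text = question.strip().lower()
    result = None
    rule = root
    while rule is not None:
        match = rule.pattern.match(text)
        if match:
            result = rule.build(match)
            rule = rule.except_rule
        else:
            rule = rule.else_rule
    return result


# Exception to the root rule: the question mentions two terms joined by "and".
conjunction = Rule(
    re.compile(r"^which (\w+) (\w+) ([\w ]+?) and ([\w ]+?)\??$"),
    lambda m: {"structure": "and", "class": m.group(1),
               "relation": m.group(2), "terms": [m.group(3), m.group(4)]},
)

# Alternative to the root rule: a counting question.
quantity = Rule(
    re.compile(r"^how many (\w+) (\w+) ([\w ]+?)\??$"),
    lambda m: {"structure": "quantity", "class": m.group(1),
               "relation": m.group(2), "terms": [m.group(3)]},
)

root = Rule(
    re.compile(r"^which (\w+) (\w+) ([\w ]+?)\??$"),
    lambda m: {"structure": "single", "class": m.group(1),
               "relation": m.group(2), "terms": [m.group(3)]},
    except_rule=conjunction,
    else_rule=quantity,
)
```

In this sketch, `analyse("Which students study mathematics and physics?", root)` first fires the root rule and is then refined by its exception, so the returned structure is `"and"` with two terms. A new rule would be added as an exception only at the node where an existing rule gave a wrong result, which leaves the rest of the tree untouched.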
Natural language interface to relational database: a simplified customization approach
Natural language interfaces to databases (NLIDB) allow end-users with no knowledge of a formal language like SQL to query databases. One of the main open problems currently under investigation is the development of NLIDB systems that are easily portable across several domains. The present study focuses on the development and evaluation of methods that simplify the customization of NLIDBs targeting relational databases without sacrificing coverage and accuracy. This goal is approached by introducing two authoring frameworks that aim to reduce the workload required to port an NLIDB to a new domain. The first authoring approach is called top-down; it assumes the existence of a corpus of unannotated natural language sample questions, which is used to pre-harvest key lexical terms and so simplify customization. The top-down approach further reduces the configuration workload by automatically including the semantics for the negative forms of verbs and the comparative and superlative forms of adjectives in the configuration model. The second authoring approach is called bottom-up; it explores the possibility of building a configuration model with no manual customization, using information from the database schema and an off-the-shelf dictionary. The evaluation of the prototype system with geo-query, a benchmark query corpus, has shown that the top-down approach significantly reduces the customization workload: 93% of the entries defining the meaning of verbs and adjectives, which represent the hard work, were automatically generated by the system; only 26 straightforward mappings and 3 manual definitions of meaning were required for customization. The top-down approach correctly answered 74.5% of the questions. The bottom-up approach, however, correctly answered only 1/3 of the questions due to an insufficient lexicon and missing semantics. The use of an external lexicon did not improve the system's accuracy.
The bottom-up model nevertheless correctly answered 3/4 of the 105 simple retrieval questions in the query corpus that do not require nesting. Therefore, the bottom-up approach can be useful for building an initial lightweight configuration model that can be incrementally refined, for example by using the failed queries to train a top-down model. The experimental results for top-down suggest that it is indeed possible to construct a portable NLIDB that reduces the configuration effort while maintaining decent coverage and accuracy.
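The bottom-up idea from the abstract above, building a configuration from nothing but the database schema, can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the study's system: the `SCHEMA` contents, the naive plural handling, and the `build_lexicon` and `to_sql` helpers are all hypothetical, and the off-the-shelf dictionary mentioned in the abstract is left out.

```python
import re

# Hypothetical schema (table name -> column names); not taken from the study.
SCHEMA = {
    "state": ["name", "population", "area", "capital"],
    "river": ["name", "length"],
}


def build_lexicon(schema):
    """Derive a word -> (kind, schema element) lexicon from the schema alone."""
    lexicon = {}
    for table, columns in schema.items():
        lexicon[table] = ("table", table)
        lexicon[table + "s"] = ("table", table)  # naive plural form
        for column in columns:
            # Keep an existing entry if a word is already taken.
            lexicon.setdefault(column, ("column", column))
    return lexicon


def to_sql(question, schema):
    """Translate a simple, non-nested retrieval question into a SELECT."""
    lexicon = build_lexicon(schema)
    tokens = re.findall(r"[a-z]+", question.lower())
    tables = [lexicon[t][1] for t in tokens
              if t in lexicon and lexicon[t][0] == "table"]
    columns = [lexicon[t][1] for t in tokens
               if t in lexicon and lexicon[t][0] == "column"]
    if not tables:
        return None  # no word in the question names a table
    table = tables[0]
    selected = [c for c in columns if c in schema[table]] or ["*"]
    return f"SELECT {', '.join(selected)} FROM {table}"
```

For a simple question such as "What is the population of each state?" the sketch returns `SELECT population FROM state`. A question such as "Which state is largest?" only yields `SELECT * FROM state`, because nothing in the schema says what "largest" means; this is the kind of missing semantics the abstract reports for the bottom-up model.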
An overview of computer-based natural language processing
Computer-based Natural Language Processing (NLP) is the key to enabling humans and their computer-based creations to interact with machines in natural language (like English, Japanese, German, etc., in contrast to formal computer languages). The doors that such an achievement can open have made this a major research area in Artificial Intelligence and Computational Linguistics. Commercial natural language interfaces to computers have recently entered the market, and the future looks bright for other applications as well. This report reviews the basic approaches to such systems, the techniques utilized, applications, the state of the art of the technology, issues and research requirements, the major participants and, finally, future trends and expectations. It is anticipated that this report will prove useful to engineering and research managers, potential users, and others who will be affected by this field as it unfolds.
View mappings for query languages
This thesis examines the problems with the current use of query languages. One chief drawback is that end users must be familiar with the underlying database structures in order to retrieve data effectively. The approach adopted to resolve this is high-level view support, using unit view structures called perceived records. A prime concern of this thesis, then, is the study of perceived record mappings from the database.
A set of criteria for categorising and analysing the features of database mappings for end-user views is first developed. In addition, a classification of data structure transformations and data item transformations is also presented. The framework is general and is independent of a specific data model or database management system. Its usefulness is demonstrated by its application to the analysis of view transformations from recursive database structures to high-level, unit view structures. In addition, it serves as a basis for evaluating and comparing the mapping facilities in existing systems.
Possible ways of specifying a suitable data model for the perceived record view concept are described. Following on, two general mapping techniques are discussed. This leads to a proposal for a mapping mechanism that supports the flexible derivation of complex perceived record views that can differ considerably from the source structures. The mechanism uses an intermediary canonical transform model. A description of how the transform model mechanism can be used in practical systems to derive perceived record views is also presented.
The feasibility of the ideas proposed is tested by implementing an interactive software system for defining perceived record views. For this, a mapping definition language for perceived record derivation is first designed. The control system sets up the structures of the mapping definition language and prompts the End-User-Administrator to define and specify the mappings for a perceived record. Appraisals of both the proposed mapping mechanism and the implementation are discussed. Examples of use of the interface system are included. The limitations of the implementation are pinpointed, with suggestions for further improvements. Practical applications of the work, and an evaluation of the approach in the light of other existing approaches, are also discussed.