Search CORE

20 research outputs found

Ripple Down Rules for Question Answering

Author: Nguyen Dai Quoc
Nguyen Dat Quoc
Pham Son Bao
Publication venue: 'IOS Press'
Publication date: 04/11/2015
Field of study

Recent years have witnessed a new trend of building ontology-based question answering systems. These systems use semantic web information to produce more precise answers to users' queries. However, these systems are mostly designed for English. In this paper, we introduce an ontology-based question answering system named KbQAS which, to the best of our knowledge, is the first one made for Vietnamese. KbQAS employs our question analysis approach that systematically constructs a knowledge base of grammar rules to convert each input question into an intermediate representation element. KbQAS then takes the intermediate representation element with respect to a target ontology and applies concept-matching techniques to return an answer. On a wide range of Vietnamese questions, experimental results show that the performance of KbQAS is promising with accuracies of 84.1% and 82.4% for analyzing input questions and retrieving output answers, respectively. Furthermore, our question analysis approach can easily be applied to new domains and new languages, thus saving time and human effort.Comment: V1: 21 pages, 7 figures, 10 tables. V2: 8 figures, 10 tables; shorten section 2; change sections 4.3 and 5.1.2. V3: Accepted for publication in the Semantic Web journal. V4 (Author's manuscript): camera ready version, available from the Semantic Web journal at http://www.semantic-web-journal.ne

arXiv.org e-Print Archive

CiteSeerX

A portable natural language interface from Arabic to SQL.

Author: Al-Johar Badr
Publication venue: 'University of Sheffield Conference Proceedings'
Publication date: 01/01/1999
Field of study

In recent years, natural language interface systems have been built based on the Front End and the Back End architecture which gives a guarantee of modularity and portability to the system as a whole. An Arabic Front End has been built that takes an input sentence, producing syntactic and semantic representations, which it maps into First Order Logic. Expressing the meaning of the user's question in terms of high level world concepts makes the natural language interface independent of the database structure. It is then easier to port the interface Front End to a database for a different domain. The syntactic treatments are based on Generalised Phrase Structure Grammar (GPSG) whereas the semantics are expressed in formal semantics theory. The focus is mainly to provide syntactic and semantic analyses for Arabic queries based on correct Arabic linguistic principles. The proposed treatments are proved and tested by building a prototype system. The prototype is implemented using one of the existing systems called Squirrel. An Arabic morphological analyser is also proposed and implemented to distinguish between two types of morphemes: internal morphemes which are a part of the word's pattern, and external morphemes which are independent words attached to the word but which are not part of the word's pattern. So, the system focuses on the extraction of morphemes from the various inflexions or forms of any Arabic word

White Rose E-theses Online

OpenGrey Repository

Natural language interface to relational database: a simplified customization approach

Author: Mvumbi Tresor
Publication venue: Department of Computer Science
Publication date: 01/01/2016
Field of study

Natural language interfaces to databases (NLIDB) allow end-users with no knowledge of a formal language like SQL to query databases. One of the main open problems currently investigated is the development of NLIDB systems that are easily portable across several domains. The present study focuses on the development and evaluation of methods allowing to simplify customization of NLIDB targeting relational databases without sacrificing coverage and accuracy. This goal is approached by the introduction of two authoring frameworks that aim to reduce the workload required to port a NLIDB to a new domain. The first authoring approach is called top-down; it assumes the existence of a corpus of unannotated natural language sample questions used to pre-harvest key lexical terms to simplify customization. The top-down approach further reduces the configuration workload by autoincluding the semantics for negative form of verbs, comparative and superlative forms of adjectives in the configuration model. The second authoring approach introduced is bottom-up; it explores the possibility of building a configuration model with no manual customization using the information from the database schema and an off-the-shelf dictionary. The evaluation of the prototype system with geo-query, a benchmark query corpus, has shown that the top-down approach significantly reduces the customization workload: 93% of the entries defining the meaning of verbs and adjectives which represents the hard work has been automatically generated by the system; only 26 straightforward mappings and 3 manual definitions of meaning were required for customization. The top-down approach answered correctly 74.5 % of the questions. The bottom-up approach, however, has correctly answered only 1/3 of the questions due to insufficient lexicon and missing semantics. The use of an external lexicon did not improve the system's accuracy. The bottom-up model has nevertheless correctly answered 3/4 of the 105 simple retrieval questions in the query corpus not requiring nesting. Therefore, the bottom-up approach can be useful to build an initial lightweight configuration model that can be incrementally refined by using the failed queries to train a topdown model for example. The experimental results for top-down suggest that it is indeed possible to construct a portable NLIDB that reduces the configuration effort while maintaining a decent coverage and accuracy

Cape Town University OpenUCT

Recommended from our members

View mappings for query languages

Author: Wong Kam Chooi
Publication venue
Publication date: 01/01/1984
Field of study

The problems of current use of query languages are looked at. One chief drawback is the undesirable requirement for end user familiarity with and knowledge of the underlying database structures, in order to retrieve data effectively. The approach adopted towards resolving this is by means of high-level view support, using unit view structures called perceived records. A prime concern of this thesis then, is the study of perceived record mappings from the database. A set of criteria for categorising and analysing the features of database mappings for end-user views is first developed. In addition, a classification of data structure transformations and data item transformations is also presented. The framework is general and is independent of a specific data model or database management system. Its usefulness is demonstrated by its application to the analysis of view transformations from recursive database structures to high-level, unit view structures. In addition, it serves as a basis for evaluating and comparing the mapping facilities in existing systems. Possible ways of specifying a suitable data model for the perceived record view concept are described. Following on, two general mapping techniques are discussed. This leads to a proposal for a mapping mechanism that supports the flexible derivation of complex perceived record views that can differ considerably from the source structures. The mechanism uses an intermediary canonical transform model. Description of how the transform model mechansim can be used in practical systems to derive perceived record views, is also presented. The feasibility of the ideas proposed are tested out by implementing an interactive software system for defining perceived record views. For this, a mapping definition language for perceived record derivation is first designed. The control system sets up the structures of the mapping definition language and prompts the End-User-Administrator to define and specify the mappings for a perceived record. Appraisals of both the proposed mapping mechanism and implementation are discussed. Examples of use of the interface system are included. The limitations of the implementation are pinpointed with suggestions for further improvements. Practical applications of the work and evaluation of the approach in the light of other existing approaches, are also discussed

Open Research Online (The Open University)

Using natural language processing for question answering in closed and open domains

Author: Latifi Majid
Publication venue: Universitat Politècnica de Catalunya
Publication date: 25/05/2018
Field of study

With regard to the growth in the amount of social, environmental, and biomedical information available digitally, there is a growing need for Question Answering (QA) systems that can empower users to master this new wealth of information. Despite recent progress in QA, the quality of interpretation and extraction of the desired answer is not adequate. We believe that striving for higher accuracy in QA systems is subject to on-going research, i.e., it is better to have no answer is better than wrong answers. However, there are diverse queries, which the state of the art QA systems cannot interpret and answer properly. The problem of interpreting a question in a way that could preserve its syntactic-semantic structure is considered as one of the most important challenges in this area. In this work we focus on the problems of semantic-based QA systems and analyzing the effectiveness of NLP techniques, query mapping, and answer inferencing both in closed (first scenario) and open (second scenario) domains. For this purpose, the architecture of Semantic-based closed and open domain Question Answering System (hereafter “ScoQAS”) over ontology resources is presented with two different prototyping: Ontology-based closed domain and an open domain under Linked Open Data (LOD) resource. The ScoQAS is based on NLP techniques combining semantic-based structure-feature patterns for question classification and creating a question syntactic-semantic information structure (QSiS). The QSiS provides an actual potential by building constraints to formulate the related terms on syntactic-semantic aspects and generating a question graph (QGraph) which facilitates making inference for getting a precise answer in the closed domain. In addition, our approach provides a convenient method to map the formulated comprehensive information into SPARQL query template to crawl in the LOD resources in the open domain. The main contributions of this dissertation are as follows: 1. Developing ScoQAS architecture integrated with common and specific components compatible with closed and open domain ontologies. 2. Analysing user’s question and building a question syntactic-semantic information structure (QSiS), which is constituted by several processes of the methodology: question classification, Expected Answer Type (EAT) determination, and generated constraints. 3. Presenting an empirical semantic-based structure-feature pattern for question classification and generalizing heuristic constraints to formulate the relations between the features in the recognized pattern in terms of syntactical and semantical. 4. Developing a syntactic-semantic QGraph for representing core components of the question. 5. Presenting an empirical graph-based answer inference in the closed domain. In a nutshell, a semantic-based QA system is presented which provides some experimental results over the closed and open domains. The efficiency of the ScoQAS is evaluated using measures such as precision, recall, and F-measure on LOD challenges in the open domain. We focus on quantitative evaluation in the closed domain scenario. Due to the lack of predefined benchmark(s) in the first scenario, we define measures that demonstrate the actual complexity of the problem and the actual efficiency of the solutions. The results of the analysis corroborate the performance and effectiveness of our approach to achieve a reasonable accuracy.Con respecto al crecimiento en la cantidad de información social, ambiental y biomédica disponible digitalmente, existe una creciente necesidad de sistemas de la búsqueda de la respuesta (QA) que puedan ofrecer a los usuarios la gestión de esta nueva cantidad de información. A pesar del progreso reciente en QA, la calidad de interpretación y extracción de la respuesta deseada no es la adecuada. Creemos que trabajar para lograr una mayor precisión en los sistemas de QA es todavía un campo de investigación abierto. Es decir, es mejor no tener respuestas que tener respuestas incorrectas. Sin embargo, existen diversas consultas que los sistemas de QA en el estado del arte no pueden interpretar ni responder adecuadamente. El problema de interpretar una pregunta de una manera que podría preservar su estructura sintáctica-semántica es considerado como uno de los desafíos más importantes en esta área. En este trabajo nos centramos en los problemas de los sistemas de QA basados en semántica y en el análisis de la efectividad de las técnicas de PNL, y la aplicación de consultas e inferencia respuesta tanto en dominios cerrados (primer escenario) como abiertos (segundo escenario). Para este propósito, la arquitectura del sistema de búsqueda de respuestas en dominios cerrados y abiertos basado en semántica (en adelante "ScoQAS") sobre ontologías se presenta con dos prototipos diferentes: en dominio cerrado basado en el uso de ontologías y un dominio abierto dirigido a repositorios de Linked Open Data (LOD). El ScoQAS se basa en técnicas de PNL que combinan patrones de características de estructura semánticas para la clasificación de preguntas y la creación de una estructura de información sintáctico-semántica de preguntas (QSiS). El QSiS proporciona una manera la construcción de restricciones para formular los términos relacionados en aspectos sintáctico-semánticos y generar un grafo de preguntas (QGraph) el cual facilita derivar inferencias para obtener una respuesta precisa en el dominio cerrado. Además, nuestro enfoque proporciona un método adecuado para aplicar la información integral formulada en la plantilla de consulta SPARQL para navegar en los recursos LOD en el dominio abierto. Las principales contribuciones de este trabajo son los siguientes: 1. El desarrollo de la arquitectura ScoQAS integrada con componentes comunes y específicos compatibles con ontologías de dominio cerrado y abierto. 2. El análisis de la pregunta del usuario y la construcción de una estructura de información sintáctico-semántica de las preguntas (QSiS), que está constituida por varios procesos de la metodología: clasificación de preguntas, determinación del Tipo de Respuesta Esperada (EAT) y las restricciones generadas. 3. La presentación de un patrón empírico basado en la estructura semántica para clasificar las preguntas y generalizar las restricciones heurísticas para formular las relaciones entre las características en el patrón reconocido en términos sintácticos y semánticos. 4. El desarrollo de un QGraph sintáctico-semántico para representar los componentes centrales de la pregunta. 5. La presentación de la respuesta inferida a partir de un grafo empírico en el dominio cerrado. En pocas palabras, se presenta un sistema semántico de QA que proporciona algunos resultados experimentales sobre los dominios cerrados y abiertos. La eficiencia del ScoQAS se evalúa utilizando medidas tales como una precisión, cobertura y la medida-F en desafíos LOD para el dominio abierto. Para el dominio cerrado, nos centramos en la evaluación cuantitativa; su precisión se analiza en una ontología empresarial. La falta de un banco la pruebas predefinidas es uno de los principales desafíos de la evaluación en el primer escenario. Por lo tanto, definimos medidas que demuestran la complejidad real del problema y la eficiencia real de las soluciones. Los resultados del análisis corroboran el rendimient

Tesis Doctorals en Xarxa

Extending a set-theoretic implementation of Montague Semantics to accommodate n-ary transitive verbs.

Author: Roy Maxim
Publication venue: 'University of Windsor Leddy Library'
Publication date: 01/01/2005
Field of study

Natural-language querying of databases remains an important and challenging area. Many approaches have been proposed over many years yet none of them has provided a comprehensive fully-compositional denotational semantics for a large sub-set of natural language, even for querying first-order non-intentional, non-modal, relational databases. One approach, which has made significant progress, is that which is based on Montague Semantics. Various researchers have helped to develop this approach and have demonstrated its viability. However, none have yet shown how to accommodate transitive verbs of arity greater than two. Our thesis is that existing approaches to the implementation of Montague Semantics in modern functional programming languages can be extended to solve this problem. This thesis is proven through the development of a compositional semantics for n-ary transitive verbs (n ≥ 2) and implementation in the Miranda programming environment. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2005 .R69. Source: Masters Abstracts International, Volume: 44-03, page: 1413. Thesis (M.Sc.)--University of Windsor (Canada), 2005

Scholarship at UWindsor

QUERY LANGUAGES - A TAXONOMY

Author: Jarke Matthias
Vassiliou Yannis
Publication venue: Stern School of Business, New York University
Publication date: 01/01/1982
Field of study

Information Systems Working Papers Serie

New York University Faculty Digital Archive