54,293 research outputs found
Improve and Implement an Open Source Question Answering System
A question answer system takes queries from the user in natural language and returns a short concise answer which best fits the response to the question. This report discusses the integration and implementation of question answer systems for English and Hindi as part of the open source search engine Yioop. We have implemented a question answer system for English and Hindi, keeping in mind users who use these languages as their primary language. The user should be able to query a set of documents and should get the answers in the same language. English and Hindi are very different when it comes to language structure, characters etc. We have implemented the Question Answer System so that it supports localization and improved Part of Speech tagging performance by storing the lexicon in the database instead of a file based lexicon. We have implemented a brill tagger variant for Part of Speech tagging of Hindi phrases and grammar rules for triplet extraction. We also improve Yioop’s lexical data handling support by allowing the user to add named entities. Our improvements to Yioop were then evaluated by comparing the retrieved answers against a dataset of answers known to be true. The test data for the question answering system included creating 2 indexes, 1 each for English and Hindi. These were created by configuring Yioop to crawl 200,000 wikipedia pages for each crawl. The crawls were configured to be domain specific so that English index consists of pages restricted to English text and Hindi index is restricted to pages with Hindi text. We then used a set of 50 questions on the English and Hindi systems. We recored, Hindi system to have an accuracy of about 55% for simple factoid questions and English question answer system to have an accuracy of 63%
Recommended from our members
AQUA: an ontology driven question answering system
This paper describes AQUA our question answering over the Web. AQUA was designed to work over heterogeneous sources. This means that AQUA is equipped to work as closed domain and in addition to open-domain question answering. As a first instance, AQUA tries to answer a question using a Knowledge base. If a query cannot be satisfied over a knowledge base/database. Then, AQUA tries to find an answer on web pages (i.e. it uses as corpus the internet as resource). Our system uses NLP (Natural Language Processing), First order logic and Information Extraction technologies. AQUA has been tested using an ontology which describes academic life. Keywords Ontologies, Information Extraction, Machine Learnin
Semantic keyword search for expert witness discovery
In the last few years, there has been an increase in the amount of information stored in semantically enriched knowledge bases, represented in RDF format. These improve the accuracy of search results when the queries are semantically formal. However framing such queries is inappropriate for inexperience users because they require specialist knowledge of ontology and syntax. In this paper, we explore an approach that automates the process of converting a conventional keyword search into a semantically formal query in order to find an expert on a semantically enriched knowledge base. A case study on expert witness discovery for the resolution of a legal dispute is chosen as the domain of interest and a system named SKengine is implemented to illustrate the approach. As well as providing an easy user interface, our experiment shows that SKengine can retrieve expert witness information with higher precision and higher recall, compared with the other system, with the same interface, implemented by a vector model approach
Semantic keyword search for expert witness discovery
In the last few years, there has been an increase in the amount of information stored in semantically enriched knowledge bases, represented in RDF format. These improve the accuracy of search results when the queries are semantically formal. However framing such queries is inappropriate for inexperience users because they require specialist knowledge of ontology and syntax. In this paper, we explore an approach that automates the process of converting a conventional keyword search into a semantically formal query in order to find an expert on a semantically enriched knowledge base. A case study on expert witness discovery for the resolution of a legal dispute is chosen as the domain of interest and a system named SKengine is implemented to illustrate the approach. As well as providing an easy user interface, our experiment shows that SKengine can retrieve expert witness information with higher precision and higher recall, compared with the other system, with the same interface, implemented by a vector model approach
Managing data through the lens of an ontology
Ontology-based data management aims at managing data through the lens of an ontology, that is, a conceptual representation of the domain of interest in the underlying information system. This new paradigm provides several interesting features, many of which have already been proved effective in managing complex information systems. This article introduces the notion of ontology-based data management, illustrating the main ideas underlying the paradigm, and pointing out the importance of knowledge representation and automated reasoning for addressing the technical challenges it introduces
Ontology-Based Data Access and Integration
An ontology-based data integration (OBDI) system is an information management system consisting of three components: an ontology, a set of data sources, and the mapping between the two. The ontology is a conceptual, formal description of the domain of interest to a given organization (or a community of users), expressed in terms of relevant concepts, attributes of concepts, relationships between concepts, and logical assertions characterizing the domain knowledge. The data sources are the repositories accessible by the organization where data concerning the domain are stored. In the general case, such repositories are numerous, heterogeneous, each one managed and maintained independently from the others. The mapping is a precise specification of the correspondence between the data contained in the data sources and the elements of the ontology. The main purpose of an OBDI system is to allow information consumers to query the data using the elements in the ontology as predicates.
In the special case where the organization manages a single data source, the term ontology-based data access (ODBA) system is used
- …