Search CORE

8 research outputs found

Question Answering System : A Review On Question Analysis, Document Processing, And Answer Extraction Techniques

Author: Azmi Mohd Sanusi
Suryana Nanna
Utomo Fandy Setyo
Publication venue: JATIT & Little Lion Scientific (LLS)
Publication date: 01/01/2017

A question answering system automatically provides an answer to a question posed by a human in natural language. Such a system consists of question analysis, document processing, and answer extraction modules. The question analysis module translates the query into a form that the document processing module can handle. Document processing identifies candidate documents that contain answers relevant to the user's query. The answer extraction module then receives the set of passages from the document processing module and determines the best answers for the user. The challenge in optimizing a question answering framework is to improve the performance of every module in it; modules that have not been optimized lead to less accurate answers. Based on these issues, the objective of this study is to review the current state of question analysis, document processing, and answer extraction techniques. The results of this study reveal potential research issues, namely morphology analysis, question classification, and term weighting algorithms for question classification.

Universiti Teknikal Malaysia Melaka (UTeM) Repository
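
The three-module architecture described in the abstract above (question analysis, document processing, answer extraction) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the stopword list, and the simple term-overlap scoring are hypothetical and are not taken from the reviewed paper, which surveys far more sophisticated techniques.

```python
import re

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "is", "of", "in", "what", "who",
             "where", "when", "which", "does", "do"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def analyze_question(question):
    """Question analysis: reduce the question to its content terms."""
    return [t for t in tokenize(question) if t not in STOPWORDS]


def retrieve_documents(terms, documents, top_k=2):
    """Document processing: rank documents by query-term frequency."""
    scored = []
    for doc in documents:
        doc_tokens = tokenize(doc)
        score = sum(doc_tokens.count(t) for t in terms)
        if score > 0:
            scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def extract_answer(terms, passages):
    """Answer extraction: pick the sentence covering the most query terms."""
    best, best_score = None, 0
    for passage in passages:
        for sentence in re.split(r"(?<=[.!?])\s+", passage):
            score = len(set(tokenize(sentence)) & set(terms))
            if score > best_score:
                best, best_score = sentence, score
    return best


def answer(question, documents):
    """Run the three modules in sequence."""
    terms = analyze_question(question)
    passages = retrieve_documents(terms, documents)
    return extract_answer(terms, passages)
```

Calling `answer(question, documents)` with a list of plain-text documents returns the single best-matching sentence, or `None` when no document shares a term with the question.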

Extending a set-theoretic implementation of Montague Semantics to accommodate n-ary transitive verbs.

Author: Roy Maxim
Publication venue: 'University of Windsor Leddy Library'
Publication date: 01/01/2005

Natural-language querying of databases remains an important and challenging area. Many approaches have been proposed over many years, yet none of them has provided a comprehensive, fully compositional denotational semantics for a large subset of natural language, even for querying first-order, non-intensional, non-modal relational databases. One approach that has made significant progress is based on Montague Semantics. Various researchers have helped to develop this approach and have demonstrated its viability. However, none have yet shown how to accommodate transitive verbs of arity greater than two. Our thesis is that existing approaches to the implementation of Montague Semantics in modern functional programming languages can be extended to solve this problem. This thesis is proven through the development of a compositional semantics for n-ary transitive verbs (n ≥ 2) and its implementation in the Miranda programming environment. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2005 .R69. Source: Masters Abstracts International, Volume: 44-03, page: 1413. Thesis (M.Sc.)--University of Windsor (Canada), 2005.

Scholarship at UWindsor
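
To make the idea of a set-theoretic semantics for n-ary verbs concrete, here is a small sketch in Python. It is not the thesis's Miranda implementation: the names `proper_noun`, `every`, `some`, and `nary_verb`, the representation of a verb as a set of n-tuples, and the left-to-right scoping of object phrases are all assumptions chosen for illustration.

```python
def proper_noun(entity):
    """A proper noun denotes a test: is the entity in a given set?"""
    return lambda s: entity in s


def every(noun_set):
    """'every N' holds of a set that contains all members of N."""
    return lambda s: noun_set <= s


def some(noun_set):
    """'some N' holds of a set that shares a member with N."""
    return lambda s: bool(noun_set & s)


def _satisfies(relation, prefix, phrases):
    """Check whether the tuple prefix can be completed so that every
    remaining object phrase is satisfied (scoped left to right)."""
    if not phrases:
        return prefix in relation
    k = len(prefix)
    # Entities that occur at position k after this prefix.
    candidates = {t[k] for t in relation if t[:k] == prefix}
    first, rest = phrases[0], phrases[1:]
    matching = {c for c in candidates
                if _satisfies(relation, prefix + (c,), rest)}
    return first(matching)


def nary_verb(relation):
    """Turn a set of n-tuples (subject, obj1, ..., obj_{n-1}) into a
    function from n-1 object phrases to the set of matching subjects."""
    def apply(*object_phrases):
        subjects = {t[0] for t in relation}
        return {s for s in subjects
                if _satisfies(relation, (s,), object_phrases)}
    return apply
```

With a hypothetical ternary relation `give` of (giver, thing, recipient) triples, `nary_verb(give)(every(books), proper_noun("mary"))` yields the set of people who gave every book to Mary, and applying `proper_noun("john")` to that set evaluates the full sentence. The binary case falls out of the same definition with a single object phrase.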

Community based Question Answer Detection

Author: Muthmann Klemens
Publication date: 10/01/2014

Each day, millions of people ask questions and search for answers on the World Wide Web. As a result, the Internet has grown into a worldwide database of questions and answers, accessible to almost everyone. Because this database is so large, it is hard to find out whether a question has been answered or even asked before. As a consequence, users ask the same questions again and again, producing a vicious circle of new content that hides the important information. One kind of platform for questions and answers is the Web forum, also known as a discussion board. Forums present discussions as item streams where each item contains the contribution of one author. These contributions contain questions and answers in human-readable form. People use search engines to search for information on such platforms. However, current search engines are optimized neither to highlight individual questions and answers nor to show which questions are asked often and which ones are already answered. To close this gap, this thesis introduces the Effingo system. The Effingo system is intended to extract forums from around the Web and find question and answer items. It also needs to link equal questions and aggregate associated answers. That way it is possible to find out whether a question has been asked before and whether it has already been answered. Based on this information it is possible to derive the most urgent questions from the system and to determine which ones are new and which ones are discussed and answered frequently. As a result, users are prevented from creating useless discussions, which reduces the server load and the information overload for further searches. The first research area explored by this thesis is forum data extraction. The results from this area are intended to be used to create a database of forum posts that is as large as possible. Furthermore, the thesis uses question-answer detection to find out which forum items are questions and which ones are answers and, finally, topic detection to aggregate questions on the same topic as well as to discover duplicate answers. These areas are either extended by Effingo, using forum-specific features such as the user graph, forum item relations and forum link structure, or adapted to cope with the specific problems created by user-generated content. Such problems arise from poorly written and very short texts as well as from hidden or distributed information.

Technische Universität Dresden: Qucosa
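
The two steps named in the abstract above, detecting which forum items are questions and linking equal questions, can be sketched as follows. This is only an illustration under stated assumptions: the question-word heuristic, the Jaccard token overlap, the 0.6 threshold, and all function names are hypothetical and do not reflect how Effingo itself works.

```python
import re

# Hypothetical list of words that often open a question.
QUESTION_WORDS = {"how", "what", "why", "who", "where", "when",
                  "which", "can", "does", "is"}


def tokens(text):
    """Lowercased set of alphanumeric tokens in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def is_question(post):
    """Crude heuristic: ends with '?' or starts with a question word."""
    stripped = post.strip().lower()
    first = stripped.split(" ", 1)[0] if stripped else ""
    return stripped.endswith("?") or first in QUESTION_WORDS


def jaccard(a, b):
    """Jaccard similarity of two token sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_duplicates(questions, threshold=0.6):
    """Greedily group questions whose token overlap with a group's
    first member reaches the threshold."""
    groups = []
    for q in questions:
        q_tokens = tokens(q)
        for group in groups:
            if jaccard(q_tokens, tokens(group[0])) >= threshold:
                group.append(q)
                break
        else:
            groups.append([q])
    return groups
```

Filtering a list of posts with `is_question` and passing the result to `group_duplicates` gives one list per distinct question; the size of each list is a rough indicator of how often that question has been asked.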

Robust Dialog Management Through A Context-centric Architecture

Author: Hung Victor C.
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2010

This dissertation presents and evaluates a method of managing spoken dialog interactions with a robust attention to fulfilling the human user’s goals in the presence of speech recognition limitations. Assistive speech-based embodied conversation agents are computer-based entities that interact with humans to help accomplish a certain task or communicate information via spoken input and output. A challenging aspect of this task involves open dialog, where the user is free to converse in an unstructured manner. With this style of input, the machine’s ability to communicate may be hindered by poor reception of utterances, caused by a user’s inadequate command of a language and/or faults in the speech recognition facilities. Since speech-based input is emphasized, this endeavor involves the fundamental issues associated with natural language processing, automatic speech recognition and dialog system design. Driven by Context-Based Reasoning, the presented dialog manager features a discourse model that implements mixed-initiative conversation with a focus on the user’s assistive needs. The discourse behavior must maintain a sense of generality, where the assistive nature of the system remains constant regardless of its knowledge corpus. The dialog manager was encapsulated into a speech-based embodied conversation agent platform for prototyping and testing purposes. A battery of user trials was performed on this agent to evaluate its performance as a robust, domain-independent, speech-based interaction entity capable of satisfying the needs of its users.

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Knowledge mining over scientific literature and technical documentation

Author: Rinaldi Fabio
Publication date: 01/01/2009

Abstract: This dissertation focuses on the extraction of information implicitly encoded in domain descriptions (technical terminology and related items) and its usage within a restricted-domain question answering (QA) system. Since different variants of the same term can be used to refer to the same domain entity, it is necessary to recognize all possible forms of a given term and structure them, so that they can be used in the question answering process. The knowledge about domain descriptions and their mutual relations is leveraged in an extension to an existing QA system, aimed at the technical maintenance manual of a well-known commercial aircraft. The original version of the QA system did not make use of domain descriptions, which are the novelty introduced by the present work. The explicit treatment of domain descriptions provided considerable gains in terms of efficiency, in particular in the process of analysis of the background document collection. Similar techniques were later applied to another domain (biomedical scientific literature), focusing in particular on protein-protein interactions. This dissertation describes in particular: (1) the extraction of domain-specific lexical items which refer to entities of the domain; (2) the detection of relationships (like synonymy and hyponymy) among such items, and their organization into a conceptual structure; (3) their usage within a domain-restricted question answering system, in order to facilitate the correct identification of relevant answers to a query; (4) the adaptation of the system to another domain, and extension of the basic hypothesis to tasks other than question answering.

Summary: The topic of this dissertation is the extraction of information that is implicitly contained in technical terminologies and similar resources, and its application in an answer extraction (AE) system. Since different variants of the same term can be used to refer to the same concept, recognizing and structuring all possible forms is a prerequisite for their use in an AE system. The knowledge about terms and their relations is applied in an AE system that focuses on the maintenance manual of a well-known commercial aircraft. The original version of the system had no explicit treatment of terminology. The explicit treatment of terminology yielded a considerable improvement in the efficiency of the system, in particular regarding the analysis of the underlying document collection. Similar methodologies were later applied to another domain (biomedical literature), with a particular focus on interactions between proteins. This dissertation describes in particular: (1) the extraction of the terminology; (2) the identification of the relations between terms (such as synonymy and hyponymy); (3) their use in an AE system; (4) the porting of the system to another domain.

ZORA

Answer extraction in technical domains

Author: Dowdall J
Fournier R
Hess M
Mollà Diego
Rinaldi Fabio
Schneider G
Schwitter R
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2002

In recent years, the information overload caused by the new media has made the shortcomings of traditional Information Retrieval increasingly evident. Practical needs of industry, government organizations and individual users alike push the research community towards systems that can pinpoint exactly those parts of documents that contain the information requested, rather than return a set of relevant documents. Answer Extraction (AE) systems aim to satisfy this need. In this article we discuss the problems faced in AE and present one such system. It has often been observed that traditional Information Retrieval should rather be called “Document Retrieval”.

Crossref

ZORA

Macquarie University ResearchOnline

NLP for Answer Extraction in Technical Domains

Author: Diego Molla
Fabio Rinaldi
James Dowdall
Michael Hess
Rolf Schwitter
Publication date: 01/01/2003

In this paper we argue that question answering (QA) over technical domains is distinctly different from TREC-based QA or Web-based QA and it cannot benefit from data-intensive approaches

CiteSeerX

Macquarie University ResearchOnline

NLP for answer extraction in technical domains

Author: Dowdall J
Hess M
Mollà Aliod D
Rinaldi Fabio
Schwitter R
Publication date: 01/01/2003

In this paper we argue that question answering (QA) over technical domains is distinctly different from TREC-based QA or Web-based QA and that it cannot benefit from data-intensive approaches. Technical questions arise in situations where concrete problems require specific answers and explanations. Finding a justification of the answer in the context of the document is essential if we have to solve a real-world problem. We show that NLP techniques can be used successfully in technical domains for high-precision access to information stored in documents. We present ExtrAns, an answer extraction system over technical domains, its architecture, its use of logical forms for answer extraction, and how terminology extraction becomes an important part of the system.

ZORA