26 research outputs found
An analysis of machine translation errors on the effectiveness of an Arabic-English QA system
The aim of this paper is to investigate
how much the effectiveness of a Question
Answering (QA) system was affected
by the performance of Machine
Translation (MT) based question translation.
Nearly 200 questions were selected
from TREC QA tracks and run through a
question answering system. It was able to
answer 42.6% of the questions correctly
in a monolingual run. These questions
were then translated manually from English
into Arabic and back into English using
an MT system, and then re-applied to
the QA system. The system was able to
answer 10.2% of the translated questions.
An analysis of what sort of translation error
affected which questions was conducted,
concluding that factoid type
questions are less prone to translation error
than others.
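As a rough illustration of the size of the effect reported in this abstract, the two accuracy figures (42.6% monolingual, 10.2% after MT round-trip translation) can be turned into an absolute and a relative drop. The function name `relative_drop` is an assumption introduced here for illustration; it does not come from the paper.

```python
def relative_drop(before: float, after: float) -> float:
    """Fraction of the original accuracy that was lost."""
    return (before - after) / before


monolingual = 42.6   # % of questions answered correctly, original English
translated = 10.2    # % answered correctly after English -> Arabic -> English

absolute = monolingual - translated             # percentage points lost
relative = relative_drop(monolingual, translated)

print(f"absolute drop: {absolute:.1f} points")  # absolute drop: 32.4 points
print(f"relative drop: {relative:.1%}")         # relative drop: 76.1%
```

Read this way, roughly three quarters of the questions the system could answer in the monolingual run were lost after translation.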
The effects of topic familiarity on user search behavior in question answering systems
This paper reports on experiments that attempt
to characterize the relationship between users
and their knowledge of the search topic in a
Question Answering (QA) system. It also
investigates user search behavior with respect
to the length of answers presented by a QA
system. Two lengths of answers were
compared: snippets (one to two sentences of
text) and exact answers. In a user test,
44 participants judged 92 factoid questions
to explore the participants’
preferences, feelings and opinions about QA
system tasks. The conclusions drawn from the
results were that participants preferred and
obtained higher accuracy in finding answers
from the snippets set. However, accuracy
varied according to users’ topic familiarity:
users were only substantially helped by the
wider context of a snippet if they were already
familiar with the topic of the question. Without
such familiarity, users were about as accurate
at locating answers from the snippets as they
were from the exact-answer set.
Sentence Level Fact Based Search Engine: News Fact Finder
Users searching the Internet for news are not able to find relevant fact-based results for certain queries using the major search engines. Queries that require exact substring matching in order to obtain very relevant results are not currently possible. Furthermore, search engines do not distinguish between results that are opinions and results that are quantifiable facts. Our sentence-level search engine, News Fact Finder, is designed using suffix arrays, filters out opinions, and produces very relevant results that are attractive to users. The News Fact Finder achieves a 73% success rate in providing relevant fact-based results.
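The abstract above names suffix arrays as the basis for exact substring matching. A minimal sketch of that general technique follows; it is not the News Fact Finder implementation, and the function names `build_suffix_array` and `find_occurrences` as well as the sample text are assumptions made for illustration. The construction is the naive sort-based one, which is adequate only for short texts.

```python
from typing import List


def build_suffix_array(text: str) -> List[int]:
    """Start positions of all suffixes of `text`, in lexicographic order.

    Naive construction (sorts full suffixes); fine for a small example.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: List[int], pattern: str) -> List[int]:
    """All start positions where `pattern` occurs in `text`, ascending."""
    m = len(pattern)

    # Lower bound: first suffix whose first m characters are >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose first m characters are > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])


text = "rates rose 3%. rates fell 2%."  # hypothetical news sentences
sa = build_suffix_array(text)
print(find_occurrences(text, sa, "rates"))  # [0, 15]
```

Because the suffixes are sorted, every suffix that begins with the query sits in one contiguous block of the array, so two binary searches locate all exact matches without scanning the text.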
The Severity of Undetected Ambiguity in Software Engineering Requirements
Rule Generation Based On Structural Clustering For Automatic Question Answering
In rule-based methods for Question-Answering (QA) research, typical rule discovery techniques are based on structural pattern overlapping and lexical information. These usually result in rules that may require further interpretation and rules that may be redundant. To address these issues, an automatic structural rule generation algorithm is presented via clustering, where a center sentence-based clustering method is designed to automatically generate rules for QA systems
Biomedical Question Answering: A Survey of Approaches and Challenges
Automatic Question Answering (QA) has been successfully applied in various
domains such as search engines and chatbots. Biomedical QA (BQA), as an
emerging QA task, enables innovative applications to effectively perceive,
access and understand complex biomedical knowledge. There have been tremendous
developments of BQA in the past two decades, which we classify into 5
distinctive approaches: classic, information retrieval, machine reading
comprehension, knowledge base and question entailment approaches. In this
survey, we introduce available datasets and representative methods of each BQA
approach in detail. Despite the developments, BQA systems are still immature
and rarely used in real-life settings. We identify and characterize several key
challenges in BQA that might lead to this issue, and discuss some potential
future directions to explore. Comment: In submission to ACM Computing Survey
Dublin City University at QA@CLEF 2008
We describe our participation in Multilingual Question Answering at CLEF 2008 using German and English as our source and target languages respectively. The system was built using UIMA (Unstructured Information Management Architecture) as the underlying framework.
Exploiting syntactic relations for question answering
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006. Includes bibliographical references (p. 61-66). Recently there has been a resurgent interest in syntax-based approaches to information access, as a means of overcoming the limitations of keyword-based approaches. So far attempts to use syntax have been ad hoc, choosing to use some syntactic information but still ignoring most of the tree structure. This thesis describes the design and implementation of SMARTQA, a proof-of-concept question answering system that compares syntactic trees in a principled manner. Specifically, SMARTQA uses a tree edit-distance algorithm to calculate the similarity between unordered, unrooted syntactic trees. The general case of this problem is NP-complete; in practice, SMARTQA demonstrates that an optimized implementation of the algorithm can be feasibly used for question answering applications. by Daniel Loreto. M.Eng
Enhancing factoid question answering using frame semantic-based approaches
FrameNet is used to enhance the performance of semantic QA systems. FrameNet is a linguistic resource that encapsulates Frame Semantics and provides scenario-based generalizations over lexical items that share similar semantic backgrounds. Doctor of Philosoph
Arabic named entity recognition
This doctoral thesis describes research carried out to determine
the best techniques for building a Named Entity Recognizer
for Arabic. Such a system would be able to identify and classify the named
entities found in open-domain Arabic text.
The Named Entity Recognition (NER) task helps other Natural Language
Processing tasks (for example, Information Retrieval,
Question Answering, Machine Translation, etc.) achieve better results
thanks to the enrichment it adds to the text. The literature contains various works
that investigate the NER task for a specific language or from a
language-independent perspective. However, to date, very
few works studying this task for Arabic have been published.
Arabic has a special orthography and a complex morphology; these aspects bring
new challenges for research on the NER task. A complete investigation
of NER for Arabic would not only provide the techniques needed to achieve
high performance, but would also provide an error analysis and a
discussion of the results that benefit the NER research
community. The main objective of this thesis is to meet that need. To this end we have:
1. Produced a study of the different aspects of Arabic related to this
task;
2. Analyzed the state of the art in NER;
3. Carried out a comparison of the results obtained by different
machine learning techniques;
4. Developed a method based on combining different classifiers,
where each classifier deals with a single class of named entities and uses
the feature set and machine learning technique best
suited to that class of named entities.
Our experiments were evaluated on nine test sets. Benajiba, Y. (2009). Arabic named entity recognition [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/8318 Palanci