26 research outputs found

    An analysis of machine translation errors on the effectiveness of an Arabic-English QA system

    The aim of this paper is to investigate how much the effectiveness of a Question Answering (QA) system was affected by the performance of Machine Translation (MT) based question translation. Nearly 200 questions were selected from TREC QA tracks and run through a question answering system, which answered 42.6% of them correctly in a monolingual run. These questions were then translated manually from English into Arabic and back into English using an MT system, and then resubmitted to the QA system, which answered 10.2% of the translated questions. An analysis of which kinds of translation error affected which questions was conducted, concluding that factoid questions are less prone to translation error than others.

    The effects of topic familiarity on user search behavior in question answering systems

    This paper reports on experiments that attempt to characterize the relationship between users and their knowledge of the search topic in a Question Answering (QA) system. It also investigates user search behavior with respect to the length of answers presented by a QA system. Two lengths of answers were compared: snippets (one to two sentences of text) and exact answers. In a user test, 44 participants judged 92 factoid questions, which allowed the participants' preferences, feelings and opinions about QA system tasks to be explored. The results show that participants preferred the snippets set and were more accurate at finding answers in it. However, accuracy varied according to users' topic familiarity: users were only substantially helped by the wider context of a snippet if they were already familiar with the topic of the question. Without such familiarity, users were about as accurate at locating answers in the snippets set as in the exact-answer set.

    Sentence Level Fact Based Search Engine: News Fact Finder

    Users searching the Internet for news are not able to find relevant fact-based results for certain queries using the major search engines. Queries that require exact substring matching in order to obtain very relevant results are not currently possible. Furthermore, search engines do not distinguish opinions from quantifiable facts in the results they return. Our sentence-level search engine, News Fact Finder, is designed using suffix arrays, filters out opinions, and produces very relevant results that are attractive to users. The News Fact Finder achieves a 73% success rate in providing relevant fact-based results.
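
    The abstract above names suffix arrays as the basis for exact substring matching at sentence level. The following is a minimal sketch of that general technique, not the News Fact Finder implementation: the function names (`build_suffix_array`, `find_occurrences`, `search_sentences`) are hypothetical, the suffix array is built naively, matching is case-sensitive, and the opinion filter described in the abstract is not modeled.

```python
from bisect import bisect_right


def build_suffix_array(text: str) -> list[int]:
    # Naive construction: sort suffix start positions by the suffix text.
    # Adequate for a sketch; large corpora need a faster algorithm.
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: list[int], pattern: str) -> list[int]:
    # Suffixes that start with `pattern` form one contiguous run in the
    # sorted suffix array, so two binary searches locate its boundaries.
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # lower bound of the run
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    hi = len(sa)
    while lo < hi:  # upper bound of the run
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])


def search_sentences(sentences: list[str], pattern: str) -> list[str]:
    # Join sentences with a newline separator and remember where each
    # sentence starts, so a match position maps back to its sentence.
    # Assumes the pattern itself contains no newline.
    text = "\n".join(sentences)
    starts, offset = [], 0
    for sentence in sentences:
        starts.append(offset)
        offset += len(sentence) + 1
    sa = build_suffix_array(text)
    hits = find_occurrences(text, sa, pattern)
    indices = sorted({bisect_right(starts, pos) - 1 for pos in hits})
    return [sentences[i] for i in indices]


if __name__ == "__main__":
    news = [
        "The rate rose to 5 percent.",
        "Analysts think the rate is too high.",
        "Unemployment fell to 4 percent.",
    ]
    for sentence in search_sentences(news, "percent"):
        print(sentence)
```

    In a real system the suffix array would be built once over the indexed corpus and reused across queries; this sketch rebuilds it on every call only to stay short.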

    The Severity of Undetected Ambiguity in Software Engineering Requirements


    Rule Generation Based On Structural Clustering For Automatic Question Answering

    In rule-based methods for Question Answering (QA) research, typical rule discovery techniques are based on structural pattern overlap and lexical information. These usually produce rules that require further interpretation or that are redundant. To address these issues, an automatic structural rule generation algorithm based on clustering is presented, in which a center-sentence-based clustering method is designed to generate rules for QA systems automatically.

    Biomedical Question Answering: A Survey of Approaches and Challenges

    Automatic Question Answering (QA) has been successfully applied in various domains such as search engines and chatbots. Biomedical QA (BQA), as an emerging QA task, enables innovative applications to effectively perceive, access and understand complex biomedical knowledge. There have been tremendous developments of BQA in the past two decades, which we classify into 5 distinctive approaches: classic, information retrieval, machine reading comprehension, knowledge base and question entailment approaches. In this survey, we introduce available datasets and representative methods of each BQA approach in detail. Despite the developments, BQA systems are still immature and rarely used in real-life settings. We identify and characterize several key challenges in BQA that might lead to this issue, and discuss some potential future directions to explore. Comment: In submission to ACM Computing Survey.

    Dublin City University at QA@CLEF 2008

    We describe our participation in Multilingual Question Answering at CLEF 2008 using German and English as our source and target languages respectively. The system was built using UIMA (Unstructured Information Management Architecture) as the underlying framework.

    Exploiting syntactic relations for question answering

    Thesis (M. Eng.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006. Includes bibliographical references (p. 61-66). Recently there has been a resurgent interest in syntax-based approaches to information access, as a means of overcoming the limitations of keyword-based approaches. So far attempts to use syntax have been ad hoc, choosing to use some syntactic information but still ignoring most of the tree structure. This thesis describes the design and implementation of SMARTQA, a proof-of-concept question answering system that compares syntactic trees in a principled manner. Specifically, SMARTQA uses a tree edit-distance algorithm to calculate the similarity between unordered, unrooted syntactic trees. The general case of this problem is NP-complete; in practice, SMARTQA demonstrates that an optimized implementation of the algorithm can be feasibly used for question answering applications. By Daniel Loreto. M.Eng.

    Enhancing factoid question answering using frame semantic-based approaches

    FrameNet is used to enhance the performance of semantic QA systems. FrameNet is a linguistic resource that encapsulates Frame Semantics and provides scenario-based generalizations over lexical items that share similar semantic backgrounds. Doctor of Philosophy

    Arabic named entity recognition

    This doctoral thesis describes research carried out to determine the best techniques for building a Named Entity Recognizer for Arabic. Such a system would be able to identify and classify the named entities found in open-domain Arabic text. The Named Entity Recognition (NER) task helps other Natural Language Processing tasks (for example, Information Retrieval, Question Answering, Machine Translation, etc.) achieve better results thanks to the enrichment it adds to the text. The literature contains various works that investigate the NER task for a specific language or from a language-independent perspective. However, very few works studying this task for Arabic have been published so far. Arabic has a special orthography and a complex morphology, and these aspects pose new challenges for NER research. A complete investigation of NER for Arabic would not only provide the techniques needed to achieve high performance, but would also provide an error analysis and a discussion of the results that benefit the NER research community. The main objective of this thesis is to meet that need. To that end, we have: 1. Conducted a study of the different aspects of Arabic related to this task; 2. Analyzed the state of the art in NER; 3. Carried out a comparison of the results obtained by different machine learning techniques; 4. Developed a method based on combining different classifiers, where each classifier deals with a single class of named entities and uses the feature set and machine learning technique best suited to that class. Our experiments were evaluated on nine test sets.
    Benajiba, Y. (2009). Arabic named entity recognition [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/8318 Palanci