18 research outputs found

    Adaptation of LIMSI's QALC for QA4MRE.

    In this paper, we present LIMSI's participation in one of the pilot tasks of QA4MRE at CLEF 2012: Machine Reading of Biomedical Texts about Alzheimer. For this exercise, we adapted an existing question answering (QA) system, QALC, to search for answers in the reading document. This basic version was used for the evaluation and scored 0.2, which rose to 0.325 after basic corrections. We then developed different methods for choosing an answer, based on the expected answer type and on rewriting the question plus answer into a hypothesis that is compared with candidate sentences. We also conducted studies on relation extraction using an existing system. The latest version of our system scores 0.375.

    Coping with Alternate Formulations of Questions and Answers

    In this chapter we present the QALC system, which has participated in the four TREC QA evaluations. We focus on the problem of linguistic variation, which must be handled in order to relate questions and answers. We first present variation at the term level, which consists in retrieving question terms in document sentences even when morphological, syntactic or semantic variations alter them. Our second topic is variation at the sentence level, which we handle as different partial reformulations of questions. Questions are associated with extraction patterns based on the syntactic type of the question and the object being queried. We present the whole system, which makes it possible to situate how QALC deals with variation, along with different evaluations.

    How NLP Can Improve Question Answering

    Answering open-domain factual questions requires natural language processing to refine document selection and answer identification. With our system QALC, we participated in the Question Answering track of the TREC8, TREC9, and TREC10 evaluations. QALC analyzes documents by searching for multi-word terms and their linguistic variants, both to minimize the number of documents selected and to provide additional clues when comparing question and sentence representations. This comparison also uses the results of syntactic parsing of the questions and named entity recognition. Answer extraction relies on applying syntactic patterns chosen according to the kind of information sought and categorized by the syntactic form of the question. These patterns allow QALC to handle linguistic variation at the answer level well.

    Finding answers to questions, in text collections or web, in open domain or specialty domains

    This chapter is dedicated to factual question answering, i.e. extracting precise and exact answers from texts to questions given in natural language. A question in natural language gives more information than a bag-of-words query (i.e. a query made of a list of words) and provides clues for finding precise answers. We first present the underlying problems, which are mainly due to linguistic variation between questions and the pieces of text that answer them, and which affect both the selection of relevant passages and the extraction of reliable answers. We then present how to answer factual questions in the open domain. We also present question answering in specialty domains, as it requires dealing with semi-structured knowledge and specialized terminologies and can lead to different applications, such as information management in corporations. Searching for answers on the Web constitutes another application setting and introduces specificities linked to Web redundancy or collaborative usage. The Web is also multilingual, and a challenging problem consists in searching for answers in documents written in a target language other than the source language of the question. For all these topics, we present the main approaches and the remaining problems.

    Semantic knowledge in Question-Answering systems

    QA systems need semantic knowledge to find variations of the question terms in documents. They benefit from knowledge resources such as synonym dictionaries or ontologies like WordNet. Our goal here is to study to what extent variations are needed and to determine which kinds of variation are useful or necessary for these systems. This study is based on different corpora in which we analyze semantic term variations against reference sets of possible variations.

    QUERI: a collaborative and interactive question answering system [QUERI : un système de question-réponse collaboratif et interactif]

    Thesis digitized by the Direction des bibliothèques de l'Université de Montréal.

    Answering comprehension tests [Réponse à des tests de compréhension].

    In this article, we present an adaptation of an existing question answering system to the task of answering reading comprehension questions. The proposed method for selecting correct answers relies on recognizing textual entailment between the hypotheses and the texts. The distinctive features of this method are the generation of hypotheses by syntactic rewriting and the evaluation of several distance criteria adapted to handle term variants.

    Methods combination and ML-based re-ranking of multiple hypothesis for question-answering systems

    Question answering systems correctly answer different questions because they are based on different strategies. In order to increase the number of questions that can be answered by a single process, we propose solutions for combining two question answering systems, QAVAL and RITEL. QAVAL selects short passages, annotates them with question terms, and then extracts answers from them, which are ordered by a machine learning validation process. RITEL performs a multi-level analysis of questions and documents. Answers are extracted and ordered according to two strategies: exploiting the redundancy of candidates and using a Bayesian model. To merge the system results, we developed different methods that either merge passages before answer ordering or merge end results. The fusion of end results is carried out by voting, by merging, and by a machine learning process on answer characteristics, which leads to a 19% improvement over the results of the best system.

    Merging the answers of question answering systems [Fusion des réponses de systèmes de question-réponses].

    The answers given by several question answering systems result from applying different strategies and therefore make it possible to answer different questions. Combining these systems thus aims to increase the total number of questions answered. This article presents the combination of three systems: QAVAL, which relies on an answer validation module, and two versions of the RITEL system, which relies on a multi-level analysis applied to questions and documents. The results are merged in different ways: by merging passages; at the output of the systems, by voting or by a fusion that takes into account the weight or rank of the proposed answers; and by a learning mechanism over the characteristics of the answers.

    Open-domain question answering (relevant document selection according to the question context) [Questions-Réponses en domaine ouvert (sélection pertinente de documents en fonction du contexte de la question)]

    The issues addressed in my thesis are to define a unified adaptation between document selection and answer search strategies based on the type of the documents and of the questions, to integrate the solution into LIMSI's RITEL question answering (QA) system, and to evaluate its contribution. We develop and study a method based on an information retrieval approach for document selection in QA. It relies on a language model and a binary text classification model that sorts texts into relevant or non-relevant categories from a QA point of view. This method filters the documents selected for answer extraction by a QA system. We present the method and its models, and test it in a QA setting using RITEL. The evaluation is carried out in French in a web context, on a corpus of 500,000 web pages and factual questions provided by the Quaero program. It is conducted either on complete documents or on document segments. The working hypothesis is that the informational content of segments is more coherent and makes answer extraction easier. In the first case, the gains are small compared with the baseline results (without filtering). In the second case, the gains are larger and support the hypothesis, without being significant. An in-depth study of the links between RITEL's performance and the filtering parameters completes these evaluations. The segmentation system created to work on segments is described and evaluated. Its evaluation is used to measure the impact of the natural variability of web pages (in size and content) on the QA task, in connection with the preceding hypothesis. In general, the experimental results suggest that our method helps a QA system in its task. However, further evaluations are needed to make these results significant, notably using larger question corpora.
    This thesis aims at defining a unified adaptation of the document selection and answer extraction strategies, based on the document and question types, in a question answering (QA) context. The solution is integrated in RITEL (a LIMSI QA system) to assess its contribution. We develop and investigate a method based on an information retrieval approach for the selection of relevant documents in QA. The method is based on a language model and a binary text classification model with relevant and irrelevant categories. It is used to filter out documents that are unusable for answer extraction by automatically matching lists of a priori relevant documents to the question type. First, we present the method along with its underlying models and evaluate it on the QA task with RITEL in French. The evaluation is done on a corpus of 500,000 unsegmented web pages with factoid questions provided by the Quaero program (i.e. evaluation at the document level, or D-level). Then, we evaluate the method on segmented web pages (i.e. evaluation at the segment level, or S-level). The idea is that information content is more consistent within segments, which facilitates answer extraction. D-filtering brings a small improvement over the baseline (no filtering). S-filtering outperforms both the baseline and D-filtering, but not significantly. Finally, we study at the S-level the links between RITEL's performance and the key parameters of the method. In order to apply the method to segments, we created a web page segmentation system. We present and evaluate it on the QA task with the same corpora used to evaluate our document selection method. This evaluation follows the former hypothesis and measures the impact of natural web page variability (in terms of size and content) on RITEL in its task. In general, the experimental results we obtained suggest that our IR-based method helps a QA system in its task; however, further investigations should be conducted, especially with larger corpora of questions, to make them significant.