    Selecting answers to questions from Web documents by a robust validation process

    International audience. Question answering (QA) systems aim to find answers to questions posed in natural language using a collection of documents. When the collection is extracted from the Web, the structure and style of the texts are quite different from those of newspaper articles. We developed a QA system based on an answer validation process able to handle the specific characteristics of Web text. A large number of candidate answers are extracted from short passages and then validated according to the characteristics of the question and of the passages. The validation module is based on a machine learning approach. It takes into account criteria characterizing both passage relevance and answer relevance at the surface, lexical, syntactic and semantic levels, in order to deal with different types of texts. We present and compare results obtained for factual questions posed on a Web collection and on a newspaper collection. We show that our system outperforms a baseline by up to 48% in MRR.
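The MRR (Mean Reciprocal Rank) figure quoted above can be illustrated with a minimal sketch. It uses the standard definition of the metric; the function names, the exact-match test against a gold set, and the toy data are assumptions made for illustration, not details of the authors' evaluation.

```python
from __future__ import annotations

from typing import Iterable, Sequence


def reciprocal_rank(ranked_answers: Sequence[str], gold: set[str]) -> float:
    """Return 1/rank of the first correct answer, or 0.0 if none is correct."""
    for rank, answer in enumerate(ranked_answers, start=1):
        if answer in gold:  # assumption: exact string match against gold answers
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Iterable[tuple[Sequence[str], set[str]]]) -> float:
    """Average the reciprocal rank over all questions (0.0 for an empty set)."""
    scores = [reciprocal_rank(ranked, gold) for ranked, gold in runs]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: one ranked answer list and one gold set per question.
runs = [
    (["Paris", "Lyon"], {"Paris"}),  # correct at rank 1 -> 1.0
    (["1987", "1989"], {"1989"}),    # correct at rank 2 -> 0.5
    (["Mozart"], {"Beethoven"}),     # no correct answer -> 0.0
]
mrr = mean_reciprocal_rank(runs)
```

A system that places the correct answer higher in its ranked list scores closer to 1, which is why the metric suits an approach that reorders candidate answers.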

    Methods combination and ML-based re-ranking of multiple hypothesis for question-answering systems

    International audience. Question answering systems correctly answer different questions because they are based on different strategies. In order to increase the number of questions that can be answered by a single process, we propose solutions for combining two question answering systems, QAVAL and RITEL. QAVAL selects short passages, annotates them with question terms, and then extracts from them answers that are ordered by a machine learning validation process. RITEL develops a multi-level analysis of questions and documents. Its answers are extracted and ordered according to two strategies: exploiting the redundancy of candidates, and a Bayesian model. In order to merge the results of the systems, we developed different methods, either merging passages before answer ordering or merging end results. The fusion of end results is performed by voting, by merging, and by a machine learning process on answer characteristics, which leads to an improvement of 19% over the results of the best system.
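To make the idea of fusing end results by voting concrete, here is a minimal sketch of one possible rank-based vote. The abstract does not specify the scoring scheme, so the reciprocal-rank weighting, the lowercase normalisation, the function name and the example rankings are all assumptions for illustration, not the method used with QAVAL and RITEL.

```python
from __future__ import annotations

from collections import defaultdict


def fuse_by_rank_vote(system_rankings: dict[str, list[str]], top_k: int = 5) -> list[str]:
    """Merge ranked answer lists: each system gives an answer a vote of 1/rank."""
    scores: defaultdict[str, float] = defaultdict(float)
    for ranking in system_rankings.values():
        for rank, answer in enumerate(ranking, start=1):
            # Assumption: answers are considered identical after simple normalisation.
            scores[answer.strip().lower()] += 1.0 / rank
    # Highest total vote first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda a: (-scores[a], a))[:top_k]


# Hypothetical ranked outputs from two systems for the same question.
fused = fuse_by_rank_vote({
    "QAVAL": ["Paris", "Lyon"],
    "RITEL": ["Paris", "Nice", "Lyon"],
})
```

An answer proposed by both systems accumulates votes from each, so agreement between strategies pushes it up the fused list.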

    Selecting answers to questions in a Web corpus by validation (original French title: Sélection de réponses à des questions dans un corpus Web par validation)

    National audience. Question answering systems search for the answer to a question posed in natural language within a set of documents. Web collections differ from newspaper articles in their structure and style. To account for these specific characteristics, we developed a system based on a robust validation approach in which candidate answers are extracted from short text passages and then ranked by machine learning. The results show a 48% improvement in MRR (Mean Reciprocal Rank) over the baseline.

    Merging the answers of question answering systems (original French title: Fusion des réponses de systèmes de question-réponses)

    National audience. The answers given by several question answering systems result from different strategies, and therefore make it possible to answer different questions. Combining these systems thus aims to increase the total number of questions solved. This article presents the combination of three systems: QAVAL, which relies on an answer validation module, and two versions of the RITEL system, which relies on a multi-level analysis applied to questions and documents. The results are merged in several ways: by merging passages; at the output of the systems, by voting or by merging that takes into account the weight or rank of the proposed answers; and by a machine learning mechanism over the characteristics of the answers.