
23 research outputs found

A derivational rephrasing experiment for question answering

Author: Jacquemin Bernard
Publication date: 17/05/2010

In Knowledge Management, variations in how information is expressed have proven a real challenge. In particular, classical semantic relations (e.g. synonymy) do not connect words with different parts of speech. The proposed method tries to address this issue. It consists in building a derivational resource from a morphological derivation tool together with derivational guidelines from a dictionary, in order to store only correct derivatives. This resource, combined with a syntactic parser, a semantic disambiguator and some derivational patterns, helps to reformulate an original sentence while convincingly keeping the initial meaning. This approach has been evaluated in three different ways: the precision of the derivatives produced from a lemma; its ability to provide well-formed reformulations of an original sentence that preserve the initial meaning; and its impact on the results of a real task, i.e. question answering. The evaluation of this approach through a question answering system shows the pros and cons of this system, while foreshadowing some interesting future developments.

arXiv.org e-Print Archive

Google Scholar

Author: Noël Élisabeth

Bibliothèque numérique de l'enssib

Ontologies and Information Extraction

Author: Nazarenko Adeline
Nédellec Claire
Publication date: 01/01/2005

This report argues that, even in the simplest cases, IE is an ontology-driven process. It is not a mere text filtering method based on simple pattern matching and keywords, because the extracted pieces of text are interpreted with respect to a predefined partial domain model. This report shows that, depending on the nature and the depth of the interpretation needed to extract the information, more or less knowledge must be involved. The report is mainly illustrated with examples from biology, a domain in which there are critical needs for content-based exploration of the scientific literature and which is becoming a major application domain for IE.

arXiv.org e-Print Archive

HAL Descartes

HAL-Paris 13

Tools for Terminology Processing

Author: Daille Béatrice
Enguehard Chantal
Morin Emmanuel
Publication venue: Tata McGraw-Hill
Publication date: 01/06/2002

Automatic terminology processing appeared 10 years ago when electronic corpora became widely available. Such processing may be statistically or linguistically based and produces terminology resources that can be used in a number of applications: indexing, information retrieval, technology watch, etc. We present the tools that have been developed at the IRIN Institute. They all take texts (or collections of texts) as input and reflect different states of terminology processing: term acquisition, term recognition and term structuring.

Recherche d'information médicale pour le patient Impact de ressources terminologiques

Author: Claveau Vincent
Grabar Natalia
Hamon Thierry
Le Maguer Sébastien
Publication venue: HAL CCSD
Publication date: 18/03/2015

The right of patients to access their clinical health records is granted by the French Public Health Code (code de la santé publique). Yet this content remains difficult to understand. We propose an experiment in which queries defined by patients are used to find relevant documents. We use the Indri search engine, based on statistical language modeling, together with semantic resources. We focus on terminological variation (e.g. synonyms, abbreviations) to link the language of experts with that of patients. Various combinations of resources and Indri settings are explored, mostly based on query expansion. Our system achieves up to 0.7660 P@10 and up to 0.6793 NDCG@10.

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Paris 13

Hal-Diderot

HAL-Rennes 1

Cederilic : constitution d'un livret d'un index numérique

Author: Adeline Nazarenko
Baruk Toledano
Didier Bourigault
Jean Charlet
Régine Teulier
Touria Aït El Mekki

We describe a full-scale experiment in building a thematic index for a scientific book. The book consists of a selection of twenty-one articles from three editions of the Ingénierie des connaissances conference (1999-2001). This corpus was processed by the SYNTEX analyzer and then by the INDDOC system, a software tool dedicated to index building. The work was carried out in an entirely digital setting, that is, starting from digital files and turning the collection of the book's articles into a set of HTML files that the user navigates with a browser. We present the main problems encountered and the solutions adopted.

Keywords: knowledge engineering; digital book; indexing; knowledge acquisition from texts; terminology structuring; XML; DTD Docbook

Research Papers in Economics

Pattern Based Term Extraction Using ACABIT System

Author: Daille Béatrice
Kageura Kyo
Koyama Teruo
Romary Laurent
Takeuchi Koichi
Publication venue: 'The Institute of Electronics, Information and Communication Engineers'
Publication date: 01/01/2003

In this paper, we propose a pattern-based term extraction approach for Japanese, applying the ACABIT system originally developed for French. The proposed approach evaluates termhood using morphological patterns of basic terms and term variants. After extracting term candidates, the ACABIT system filters out non-terms from the candidates based on log-likelihood. This approach is suitable for Japanese term extraction because most Japanese terms are compound nouns or simple phrasal patterns.

INRIA a CCSD electronic archive server

Sélection de réponses à des questions dans un corpus Web par validation

Author: Falco MH
Grappy Arnaud
Grau Brigitte
Ligozat Anne-Laure
Robba I
Vilnat Anne
Publication venue: HAL CCSD
Publication date: 01/06/2011

Question answering systems search for the answer to a question asked in natural language within a set of documents. Web collections differ from newspaper articles in their structure and style. To take these specificities into account, we developed a system based on a robust validation approach in which candidate answers are extracted from short text passages and then ranked by machine learning. The results show a 48% improvement in MRR (Mean Reciprocal Rank) over the baseline.