Improving Term Extraction with Terminological Resources

Authors: C.G. Chute, T.G.O. Consortium, Y. Tsuruoka
Publication date: 01/01/2006

Studies of different term extractors on a biomedical corpus revealed decreasing performance when they are applied to highly technical texts. The difficulty or impossibility of customising them to new domains is an additional limitation. In this paper, we propose to use external terminologies to influence generic linguistic data in order to improve the quality of the extraction. The tool we implemented exploits attested terms at different steps of the process: chunking, parsing and extraction of term candidates. Experiments reported here show that, using this method, more term candidates can be acquired with a higher level of reliability. We further describe the extraction process involving endogenous disambiguation implemented in the term extractor YaTeA.

arXiv.org e-Print Archive

Crossref

HAL-Paris 13

Context and Keyword Extraction in Plain Text Using a Graph Representation

Authors: Chahine Carlo Abi, Chaignaud Nathalie, Kotowicz Jean-Philippe, Pécuchet Jean-Pierre
Publication date: 30/11/2008

Document indexing is an essential task performed by archivists or automatic indexing tools. To retrieve documents relevant to a query, the keywords describing each document have to be carefully chosen. Archivists have to identify the topic of a document before they start extracting keywords. For an archivist indexing specialized documents, experience plays an important role, but indexing documents on different topics is much harder. This article proposes an innovative method for an indexing support system. The system takes as input an ontology and a plain text document and outputs contextualized keywords for the document. The method has been evaluated by exploiting Wikipedia's category links as a termino-ontological resource.

arXiv.org e-Print Archive

HAL - Normandie Université

Crossref