Multilingual Extraction of Semantic Indexes

Calabretto, Sylvie; Harrathi, Farah; Roussey, Catherine

Authors: Sylvie Calabretto, Farah Harrathi, Catherine Roussey
Publication date: 21 May 2007
Publisher: HAL CCSD

Abstract

This article deals with multilingual document indexing. We propose an indexing method that proceeds in several stages. First, the most important terms of the document are extracted using general characteristics of languages and statistical methods; the term extraction stages can therefore be applied to any document, whatever its language. Second, the indexing method uses a multilingual ontology to find the most relevant concepts representing the document content. The method can thus be applied to a multilingual corpus containing documents written in different languages. This indexing procedure is part of a Multilingual Document System called SyDoM, which manages XML documents.
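
The two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: raw term frequency stands in for the unspecified statistical methods, and the toy ontology, the concept identifiers, and the function names (`extract_terms`, `map_to_concepts`, `index_document`) are all hypothetical, chosen only to show the flow from language-independent term extraction to concept lookup.

```python
import re
from collections import Counter

# Hypothetical toy ontology: concept id -> labels per language.
# A real multilingual ontology would be far larger and richer.
ONTOLOGY = {
    "C_INDEXING": {"en": ["indexing"], "fr": ["indexation"]},
    "C_DOCUMENT": {"en": ["document"], "fr": ["document"]},
}


def extract_terms(text, top_k=5, min_len=4):
    """Stage 1 (assumed): rank tokens by raw frequency.

    No language-specific resources are used; short tokens are dropped
    as a crude, language-agnostic substitute for a stop-word list.
    """
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(t for t in tokens if len(t) >= min_len)
    return [term for term, _ in counts.most_common(top_k)]


def map_to_concepts(terms, ontology):
    """Stage 2 (assumed): map extracted terms to ontology concepts.

    Labels from every language are merged into one lookup table, so
    the same concept is found whatever the document language is.
    """
    label_to_concept = {}
    for concept_id, labels_by_lang in ontology.items():
        for labels in labels_by_lang.values():
            for label in labels:
                label_to_concept[label.lower()] = concept_id

    concepts = []
    for term in terms:
        concept_id = label_to_concept.get(term)
        if concept_id is not None and concept_id not in concepts:
            concepts.append(concept_id)
    return concepts


def index_document(text, ontology=ONTOLOGY, top_k=5):
    """Run both stages and return the concept ids indexing the text."""
    return map_to_concepts(extract_terms(text, top_k=top_k), ontology)


if __name__ == "__main__":
    print(index_document("Indexing a document: document indexing."))
    print(index_document("L'indexation du document. Indexation multilingue du document."))
```

Because documents are indexed by concept identifiers and not by surface terms, an English and a French document on the same topic receive the same index entries in this sketch, which is the property the abstract attributes to the ontology-based stage.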

Available Versions

HAL

oai:HAL:hal-01563180v1

Last time updated on 01/11/2023
