Using NLP to build the hypertextuel network of a back-of-the-book index

Mekki, Touria Aït El; Nazarenko, Adeline

Authors: Touria Aït El Mekki, Adeline Nazarenko
Publication date: 1 January 2005
Abstract

Starting from the idea that back-of-the-book indexes are traditional devices for navigating large documents, we have developed a method for building a hypertextual network that supports navigation within a document. Building such a network requires selecting a list of descriptors, identifying the relevant text segments to associate with each descriptor, and finally ranking the descriptors and reference segments by relevance. We propose a specific document segmentation method and a relevance measure for information ranking. The algorithms are tested on four corpora (of different types and domains) without human intervention or any semantic knowledge.
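
The three steps named in the abstract (descriptor selection, segment association, ranking) can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the segmentation algorithm or the relevance measure, so this sketch assumes blank-line paragraph segmentation, a descriptor list supplied by the caller, single-word descriptors, and relative term frequency as a stand-in relevance score. The names `segment` and `build_index` are hypothetical.

```python
import re
from collections import defaultdict


def segment(text):
    """Assumed segmentation: split the document on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def build_index(text, descriptors):
    """Link each descriptor to the segments that mention it, then rank.

    Returns a list of (descriptor, [(segment_id, score), ...]) pairs.
    The score is a stand-in (relative frequency of the descriptor in
    the segment), not the relevance measure proposed in the paper.
    """
    index = defaultdict(list)
    for seg_id, seg in enumerate(segment(text)):
        words = re.findall(r"\w+", seg.lower())
        if not words:
            continue
        for descriptor in descriptors:
            count = words.count(descriptor.lower())
            if count:
                index[descriptor].append((seg_id, count / len(words)))

    # Rank the reference segments under each descriptor, best first.
    for refs in index.values():
        refs.sort(key=lambda ref: ref[1], reverse=True)

    # Rank the descriptors themselves by their best segment score.
    return sorted(index.items(), key=lambda item: item[1][0][1], reverse=True)


if __name__ == "__main__":
    sample = "index index terms\n\nthe index of a book\n\nnothing here"
    for descriptor, refs in build_index(sample, ["index", "book"]):
        print(descriptor, refs)
```

Each returned pair plays the role of an index entry whose references act as hyperlinks into the document; a real system would replace the caller-supplied descriptor list with automatic term extraction.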

Available Versions

HAL Descartes

oai:HAL:hal-00098036v1

Last time updated on 14/04/2021

Hal-Diderot

oai:HAL:hal-00098036v1

Last time updated on 14/04/2021

HAL-Paris 13

oai:HAL:hal-00098036v1

Last time updated on 11/11/2016