Exploring a text corpus via a knowledge graph

Abstract

Semantic enrichment methods can identify relevant entities in textual documents. Because the extracted entities are part of knowledge graphs, they are linked by semantic relationships. This work explores the idea of navigating those relationships as a way to search a text corpus. A modular software system, comprising document management, semantic enrichment, data consolidation, and data integration, has been designed to offer a visual user interface for such navigation on top of an arbitrary corpus of textual documents. The software, called arca, has been applied to a real use case: searching the book catalogue of a publishing house. An evaluation carried out with a set of potential users has so far indicated that the approach is feasible and effective. Critical issues and potential limitations of the paradigm were also identified and are discussed.
