Text Mining tools for extracting knowledge from firms annual reports

Abstract

This paper was developed within the European project BLUE-ETS (Economic and Trade Statistics), in the work package devoted to proposing new tools for collecting and analysing data. To obtain business information from documentary repositories, we consider documents produced for non-statistical purposes. The use of secondary sources, typical of data and text mining, is an opportunity that National Statistical Institutes (NSIs) have not sufficiently explored. NSIs aim to collect and present information in a usable and easily readable way. The use of textual data is still viewed as too problematic, because of the complexity and cost of pre-processing procedures and, often, the lack of suitable analytical tools. Our aim is to identify statistical linguistic sources through an in-depth analysis of one management commentary. From a methodological viewpoint, we propose a tool, derived from network data analysis, for exploring relations between words at a micro-data level: ego networks, applied together with lexical correspondence analysis.
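As a rough illustration of what an ego network of a word might look like, the following minimal sketch builds a sentence-level co-occurrence graph and extracts the radius-1 neighbourhood of one focal word. It is not the procedure used in the paper: the function names `cooccurrence_graph` and `ego_network`, the choice of sentence-level co-occurrence as the relation between words, and the toy sentences are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(sentences):
    """Build an undirected weighted graph of words.

    Assumption: two words are related when they appear in the same
    sentence; the edge weight counts the sentences they share.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        words = sorted(set(sentence.lower().split()))
        for a, b in combinations(words, 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def ego_network(graph, ego):
    """Return the radius-1 ego network of `ego` as {(word, word): weight}.

    The network contains the ego, its direct neighbours (alters),
    and every edge among those nodes.
    """
    if ego not in graph:
        return {}
    nodes = {ego} | set(graph[ego])
    return {
        (a, b): weight
        for a in nodes
        for b, weight in graph[a].items()
        if b in nodes and a < b  # keep each undirected edge once
    }


# Toy, made-up sentences (not taken from any real annual report).
sentences = [
    "revenue growth exceeded forecast",
    "revenue risk increased",
    "market risk increased",
]
graph = cooccurrence_graph(sentences)
ego_edges = ego_network(graph, "revenue")
```

Here the word "market" never co-occurs with "revenue", so it falls outside the ego network, while the tie between "risk" and "increased" is kept because both are alters of the focal word. In practice the text would first go through the pre-processing steps mentioned above (tokenisation, stop-word removal, lemmatisation), which this sketch omits.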
