Factor PD-Co-clustering in Official Statistics

Abstract

In this paper we propose an extension of Factor PD-clustering for the simultaneous classification of the rows and columns of frequency matrices extracted from large textual datasets. The aim is to extract information from documents that are routinely produced but go unused because of their special nature. The work is carried out within the European project BLUE-ETS, which aims to provide tools for the construction of robust, high-quality official statistics for businesses.
