Information retrieval in institutional repositories using the summarization technique derived from the selection of Cassiopeia attributes/ Recuperação de informação em repositórios institucionais utilizando a técnica de sumarização a partir da seleção de atributos do Cassiopeia

Abstract

The large volume of text documents produced by growing scientific output creates a need to research and implement methods that facilitate information search and retrieval in academic text bases, such as institutional repositories. This study therefore analyzes whether applying summarization to academic texts, based on the attribute (word) selection method of the Cassiopeia model (implemented in the PragmaSUM summarizer), helps information retrieval by reducing information overload and improving the accuracy of user search results. The research was carried out in five steps: elaboration of the reference collection; implementation of a search engine; execution of standard information retrieval; evaluation of information retrieval using the precision metric; and data analysis using the Friedman ANOVA and Kendall’s Coefficient of Concordance statistical tests. Results revealed that summarization, especially at high compression rates (80% and 90%), reduced information overload and increased the accuracy of the results presented to the user, allowing quality information retrieval in academic texts. Furthermore, it simplified the indexing process, attenuated high dimensionality, and promoted faster information retrieval.
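As an illustration of the precision metric used in the evaluation step, the following is a minimal sketch and not the study's own implementation. The function name `precision`, the document identifiers, and the relevance judgments are all hypothetical; the sketch only shows how precision (the fraction of retrieved documents that are relevant) could be compared between a search over full texts and a search over summaries.

```python
def precision(retrieved, relevant):
    """Return the fraction of retrieved documents that are relevant."""
    if not retrieved:
        return 0.0  # no results returned: define precision as 0
    relevant_set = set(relevant)
    hits = sum(1 for doc_id in retrieved if doc_id in relevant_set)
    return hits / len(retrieved)


# Hypothetical data: IDs returned for one query by an index built on
# full texts and by an index built on summaries, plus the set of
# documents judged relevant in an assumed reference collection.
retrieved_full_text = ["d1", "d2", "d3", "d4", "d5"]
retrieved_summaries = ["d1", "d3", "d6", "d7"]
relevant = {"d1", "d3", "d7"}

p_full = precision(retrieved_full_text, relevant)
p_summary = precision(retrieved_summaries, relevant)
print(f"full text: {p_full:.2f}, summaries: {p_summary:.2f}")
```

In this made-up example the summary index returns fewer documents, with a larger share of them relevant, which is the kind of effect the abstract reports for high compression rates.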
