2,635 research outputs found

    SemPCA-Summarizer: Exploiting Semantic Principal Component Analysis for Automatic Summary Generation

    Text summarization is the task of condensing a document while keeping its relevant information. Integrated into wider information systems, it can help users access key information without having to read everything, making them more efficient. In this research work, we have developed and evaluated a single-document extractive summarization approach, named SemPCA-Summarizer, which reduces the dimensionality of a document using the Principal Component Analysis (PCA) technique enriched with semantic information. A concept-sentence matrix is built from the input document, and PCA is then used to identify and rank the relevant concepts, which are in turn used to select the most important sentences through different heuristics, leading to various types of summaries. The results show that the generated summaries are very competitive, both quantitatively and qualitatively, indicating that our proposed approach is appropriate for briefly providing key information and thus helps users cope with the huge amount of available information more quickly and efficiently.
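
The pipeline described in this abstract (concept-sentence matrix, PCA, concept ranking, sentence selection) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: plain lowercase words stand in for the semantic concepts, the weighting of terms by absolute component loadings is one possible heuristic among the several the abstract alludes to, and the names `split_sentences`, `tokenize`, and `summarize` are hypothetical.

```python
import re
import numpy as np

def split_sentences(text):
    # Naive splitter: break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    # Assumption: surface words stand in for semantic concepts.
    return re.findall(r"[a-z]+", sentence.lower())

def summarize(text, n_sentences=2, n_components=2):
    sentences = split_sentences(text)
    vocab = sorted({w for s in sentences for w in tokenize(s)})
    index = {w: i for i, w in enumerate(vocab)}

    # Term-sentence count matrix: rows are terms, columns are sentences.
    M = np.zeros((len(vocab), len(sentences)))
    for j, s in enumerate(sentences):
        for w in tokenize(s):
            M[index[w], j] += 1

    # PCA via SVD: sentences are observations, terms are variables.
    X = M.T - M.T.mean(axis=0)
    _, S, Vt = np.linalg.svd(X, full_matrices=False)
    k = min(n_components, len(S))

    # Hypothetical heuristic: a term's weight is the sum of its absolute
    # loadings on the top-k components, scaled by the singular values.
    term_weight = (np.abs(Vt[:k]) * S[:k, None]).sum(axis=0)

    # Score each sentence by the total weight of the terms it contains.
    sent_score = M.T @ term_weight
    top = sorted(np.argsort(-sent_score)[:n_sentences])
    return [sentences[i] for i in top]  # keep original sentence order
```

Calling `summarize(document_text, n_sentences=3)` would return three sentences from the input in their original order. Swapping the scoring line for a different rule is where alternative selection heuristics would plug in.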

    Exploring the subtopic-based relationship map strategy for multi-document summarization

    In this paper we adapt and explore strategies for generating multi-document summaries based on relationship maps, which represent texts as graphs (maps) of interrelated segments and apply different traversal techniques to produce the summaries. In particular, we focus on the Segmented Bushy Path, a sophisticated method that tries to represent the main subtopics of the source texts in a summary while preserving its informativeness. In addition, we investigate some well-known subtopic segmentation and clustering techniques in order to correctly select the most relevant information for the final summary. We show that this subtopic-based method outperforms other methods for multi-document summarization and that it achieves state-of-the-art results, competing with the most sophisticated deep summarization methods in the area.
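
As a rough illustration of the relationship-map idea in this abstract, the sketch below links text segments whose bag-of-words cosine similarity passes a threshold, counts each segment's links (its "bushiness"), and keeps the bushiest segment from each subtopic. Several things here are assumptions and not details confirmed by the abstract: the similarity measure, the threshold value, the definition of bushiness as a simple link count, and the premise that subtopic labels are already available from a separate segmentation or clustering step. The function names are hypothetical.

```python
import math
import re
from collections import Counter

def bag(text):
    # Bag-of-words representation of one text segment.
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def bushiness(segments, threshold=0.2):
    # Number of other segments each segment is linked to in the map.
    bags = [bag(s) for s in segments]
    n = len(bags)
    return [
        sum(1 for j in range(n)
            if j != i and cosine(bags[i], bags[j]) >= threshold)
        for i in range(n)
    ]

def segmented_bushy_path(segments, subtopic_ids, per_subtopic=1, threshold=0.2):
    # subtopic_ids[i] is the (assumed, precomputed) subtopic of segments[i].
    degree = bushiness(segments, threshold)
    chosen = []
    for topic in sorted(set(subtopic_ids)):
        members = [i for i, t in enumerate(subtopic_ids) if t == topic]
        # Bushiest first; ties broken by original position.
        members.sort(key=lambda i: (-degree[i], i))
        chosen.extend(members[:per_subtopic])
    # Emit the selected segments in their original order.
    return [segments[i] for i in sorted(chosen)]
```

Selecting per subtopic, instead of taking the globally bushiest segments, is what keeps every subtopic represented in the summary even when one subtopic dominates the map.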