3,650 research outputs found

    Text Summarization Techniques: A Brief Survey

    Get PDF
    In recent years, there has been a explosion in the amount of text data from a variety of sources. This volume of text is an invaluable source of information and knowledge which needs to be effectively summarized to be useful. In this review, the main approaches to automatic text summarization are described. We review the different processes for summarization and describe the effectiveness and shortcomings of the different methods.Comment: Some of references format have update

    Graphs in clusters: a hybrid approach to unsupervised extractive long document summarization using language models

    Get PDF
    Effective summarization of long documents is a challenging task. When addressing this challenge, Graph and Cluster-Based methods stand out as effective unsupervised solutions. Graph-Based Unsupervised methods are widely employed for summarization due to their success in identifying relationships within documents. Cluster-Based methods excel in minimizing redundancy by grouping similar content together before generating a concise summary. Therefore, this paper merges Cluster-Based and Graph-Based methods by applying language models for Unsupervised Extractive Summarization of long documents. The approach simultaneously extracts key information while minimizing redundancy. First, we use BERT-based sentence embeddings to create sentence clusters using k-means clustering and select the optimum number of clusters using the elbow method to ensure that sentences are categorized based on their semantic similarities. Then, the TextRank algorithm is employed within each cluster to rank sentences based on their importance and representativeness. Finally, the total similarity score of the graph is used to rank the clusters and eliminate less important sentence groups. Our method achieves comparable or better summary quality and reduced redundancy compared to both individual Cluster-Based and Graph-Based methods, as well as other supervised and Unsupervised baseline models across diverse datasets

    Graphs in clusters: a hybrid approach to unsupervised extractive long document summarization using language models

    Get PDF
    Effective summarization of long documents is a challenging task. When addressing this challenge, Graph and Cluster-Based methods stand out as effective unsupervised solutions. Graph-Based Unsupervised methods are widely employed for summarization due to their success in identifying relationships within documents. Cluster-Based methods excel in minimizing redundancy by grouping similar content together before generating a concise summary. Therefore, this paper merges Cluster-Based and Graph-Based methods by applying language models for Unsupervised Extractive Summarization of long documents. The approach simultaneously extracts key information while minimizing redundancy. First, we use BERT-based sentence embeddings to create sentence clusters using k-means clustering and select the optimum number of clusters using the elbow method to ensure that sentences are categorized based on their semantic similarities. Then, the TextRank algorithm is employed within each cluster to rank sentences based on their importance and representativeness. Finally, the total similarity score of the graph is used to rank the clusters and eliminate less important sentence groups. Our method achieves comparable or better summary quality and reduced redundancy compared to both individual Cluster-Based and Graph-Based methods, as well as other supervised and Unsupervised baseline models across diverse datasets

    Survey on Multi-Document Summarization: Systematic Literature Review

    Full text link
    In this era of information technology, abundant information is available on the internet in the form of web pages and documents on any given topic. Finding the most relevant and informative content out of these huge number of documents, without spending several hours of reading has become a very challenging task. Various methods of multi-document summarization have been developed to overcome this problem. The multi-document summarization methods try to produce high-quality summaries of documents with low redundancy. This study conducts a systematic literature review of existing methods for multi-document summarization methods and provides an in-depth analysis of performance achieved by these methods. The findings of the study show that more effective methods are still required for getting higher accuracy of these methods. The study also identifies some open challenges that can gain the attention of future researchers of this domain
    • …
    corecore