16 research outputs found

    On Extractive and Abstractive Neural Document Summarization with Transformer Language Models

    We present a method to produce abstractive summaries of long documents that exceed several thousand words via neural abstractive summarization. We perform a simple extractive step before generating a summary, which is then used to condition the transformer language model on relevant information before being tasked with generating a summary. We show that this extractive step significantly improves summarization results. We also show that this approach produces more abstractive summaries compared to prior work that employs a copy mechanism while still achieving higher ROUGE scores. Note: The abstract above was not written by the authors; it was generated by one of the models presented in this paper.

    Abstractive Summarization with Efficient Transformer Based Approach

    Because of the rapid proliferation of online data, one of the most significant research areas is how to shorten a document while keeping its essential information. This information must be summarized in order to recover meaningful knowledge in an acceptable time; this task is called text summarization. There are two types of text summarization: extractive and abstractive. In recent years, the field of abstractive text summarization has become increasingly popular. Abstractive Text Summarization (ATS) aims to extract the most vital content from a text corpus and condense it into a shorter text while maintaining its meaning and its semantic and grammatical accuracy. Deep learning architectures have entered a new phase in natural language processing (NLP). Many studies have demonstrated the competitive performance of innovative architectures, including the recurrent neural network (RNN), the attention mechanism and LSTM, among others. The Transformer, a recently presented model, relies on the attention mechanism. In this paper, abstractive text summarization is accomplished using a basic Transformer model, a Transformer with a pointer generation network (PGN) and coverage mechanism, a Fastformer architecture, and a Fastformer with a pointer generation network (PGN) and coverage mechanism. We compare these architectures after careful and thorough hyperparameter tuning. In the experiments, the standard CNN/DM dataset is used to test these architectures on the task of abstractive summarization.

    Sparsifying Transformer Models with Trainable Representation Pooling

    We propose a novel method to sparsify attention in the Transformer model by learning to select the most informative token representations during the training process, thus focusing on task-specific parts of the input. A reduction of quadratic time and memory complexity to sublinear was achieved thanks to a robust trainable top-k operator. For example, our experiments on a challenging long-document summarization task show that our method is over 3 times faster and up to 16 times more memory efficient while significantly outperforming both dense and state-of-the-art sparse transformer models. The method can be effortlessly applied to many models used in NLP and CV, simultaneously with other improvements. Comment: Provided formal overview. Reevaluated with Google Research scrip
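The selection idea in this abstract can be illustrated with a minimal sketch. It uses a hard, non-differentiable top-k rather than the trainable operator the abstract mentions, and the linear scorer, its weights, and the function names `score_tokens` and `top_k_pool` are hypothetical choices made only for illustration.

```python
def score_tokens(token_vectors, weights):
    """Score each token with a dot product against a weight vector.

    The linear scorer is an assumption; the abstract does not say how
    token representations are scored.
    """
    return [sum(x * w for x, w in zip(vec, weights)) for vec in token_vectors]


def top_k_pool(token_vectors, scores, k):
    """Keep the k highest-scoring token vectors, preserving input order.

    This is a hard top-k (no gradient flows through the selection), so it
    only illustrates the pooling step, not the trainable operator.
    """
    if k >= len(token_vectors):
        return list(token_vectors)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:k])  # restore original token order
    return [token_vectors[i] for i in kept]


tokens = [[1, 0], [0, 2], [3, 1], [0, 0]]
scores = score_tokens(tokens, weights=[1, 1])
pooled = top_k_pool(tokens, scores, k=2)
```

Later layers then attend over `k` representations instead of the full sequence, which is where the time and memory savings would come from under these assumptions.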

    Retrieval-based Goal-Oriented Dialogue Generation

    Most research on dialogue has focused either on dialogue generation for open-ended chit chat or on state tracking for goal-directed dialogue. In this work, we explore a hybrid approach to goal-oriented dialogue generation that combines retrieval from past history with a hierarchical, neural encoder-decoder architecture. We evaluate this approach in the customer support domain using the MultiWOZ dataset (Budzianowski et al., 2018). We show that adding this retrieval step to a hierarchical, neural encoder-decoder architecture leads to significant improvements, including responses that are rated more appropriate and fluent by human evaluators. Finally, we compare our retrieval-based model to various semantically conditioned models explicitly using past dialog act information, and find that our proposed model is competitive with the current state of the art (Chen et al., 2019), while not requiring explicit labels about past machine acts.

    Graphs in clusters: a hybrid approach to unsupervised extractive long document summarization using language models

    Effective summarization of long documents is a challenging task. When addressing this challenge, graph-based and cluster-based methods stand out as effective unsupervised solutions. Graph-based unsupervised methods are widely employed for summarization due to their success in identifying relationships within documents. Cluster-based methods excel in minimizing redundancy by grouping similar content together before generating a concise summary. Therefore, this paper merges cluster-based and graph-based methods by applying language models for unsupervised extractive summarization of long documents. The approach simultaneously extracts key information while minimizing redundancy. First, we use BERT-based sentence embeddings to create sentence clusters using k-means clustering and select the optimum number of clusters using the elbow method to ensure that sentences are categorized based on their semantic similarities. Then, the TextRank algorithm is employed within each cluster to rank sentences based on their importance and representativeness. Finally, the total similarity score of the graph is used to rank the clusters and eliminate less important sentence groups. Our method achieves comparable or better summary quality and reduced redundancy compared to both individual cluster-based and graph-based methods, as well as other supervised and unsupervised baseline models across diverse datasets.
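The three steps described in this abstract (cluster, rank within clusters, rank clusters) can be sketched roughly as follows. This is not the authors' implementation: bag-of-words vectors stand in for BERT sentence embeddings, the number of clusters is fixed instead of chosen by the elbow method, the k-means variant uses cosine similarity with a simple farthest-first seeding, and all function names are hypothetical.

```python
import math
from collections import Counter


def embed(sentence):
    """Bag-of-words vector; a stand-in (assumption) for BERT embeddings."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def kmeans(vectors, k, iterations=10):
    """Simplified k-means on cosine similarity with farthest-first seeds."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(min(
            vectors, key=lambda v: max(cosine(v, c) for c in centroids)))
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [max(range(k), key=lambda c: cosine(v, centroids[c]))
                  for v in vectors]
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                total = Counter()
                for m in members:
                    total.update(m)
                centroids[c] = Counter(
                    {w: n / len(members) for w, n in total.items()})
    return labels


def textrank(vectors, damping=0.85, iterations=30):
    """PageRank over the sentence-similarity graph.

    Returns per-sentence scores and the graph's total similarity.
    """
    n = len(vectors)
    sim = [[cosine(vectors[i], vectors[j]) if i != j else 0.0
            for j in range(n)] for i in range(n)]
    out = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [(1 - damping) / n + damping * sum(
            sim[j][i] / out[j] * scores[j] for j in range(n) if out[j])
            for i in range(n)]
    return scores, sum(out)


def summarize(sentences, k=2, keep_clusters=1, per_cluster=1):
    vectors = [embed(s) for s in sentences]
    labels = kmeans(vectors, k)
    ranked = []
    for c in range(k):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        if not idx:
            continue
        scores, total = textrank([vectors[i] for i in idx])
        order = sorted(range(len(idx)), key=lambda p: -scores[p])
        ranked.append((total, [idx[p] for p in order[:per_cluster]]))
    # Keep only the clusters with the highest total similarity.
    ranked.sort(key=lambda item: -item[0])
    chosen = sorted(i for _, best in ranked[:keep_clusters] for i in best)
    return [sentences[i] for i in chosen]


sentences = [
    "the cat sat on the mat",
    "stock markets fell sharply today",
    "the cat slept on the mat",
    "the cat played on the mat",
    "stock markets rose sharply yesterday",
]
summary = summarize(sentences, k=2, keep_clusters=1, per_cluster=1)
```

In this toy input the three cat sentences form the cluster with the larger total similarity, so the single summary sentence is drawn from that group. Note that ranking clusters by a raw similarity sum favors larger clusters; whether the paper normalizes this score is not stated in the abstract.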
