16 research outputs found
On Extractive and Abstractive Neural Document Summarization with Transformer Language Models
We present a method to produce abstractive summaries of long documents that
exceed several thousand words via neural abstractive summarization. We perform
a simple extractive step before generating a summary, which is then used to
condition the transformer language model on relevant information before being
tasked with generating a summary. We show that this extractive step
significantly improves summarization results. We also show that this approach
produces more abstractive summaries compared to prior work that employs a copy
mechanism while still achieving higher ROUGE scores. Note: The abstract above
was not written by the authors; it was generated by one of the models presented
in this paper.
Abstractive Summarization with Efficient Transformer Based Approach
The rapid proliferation of online data has made it an important research problem to shorten a document while keeping its essential information. Such information must be summarized so that meaningful knowledge can be recovered in an acceptable time; this task is called text summarization. There are two types of summarization: extractive and abstractive. In recent years, abstractive text summarization has become increasingly popular. Abstractive Text Summarization (ATS) aims to extract the most vital content from a text corpus and condense it into a shorter text while maintaining its meaning and its semantic and grammatical accuracy. Deep learning architectures have opened a new phase in natural language processing (NLP). Many studies have demonstrated the competitive performance of architectures including recurrent neural networks (RNNs), attention mechanisms, and LSTMs, among others. The Transformer, a recently presented model, relies on the attention mechanism. In this paper, abstractive text summarization is accomplished using a basic Transformer model, a Transformer with a pointer generation network (PGN) and coverage mechanism, a Fastformer architecture, and a Fastformer with a PGN and coverage mechanism. We compare these architectures after careful and thorough hyperparameter tuning. In the experiments, the standard CNN/DM dataset is used to test these architectures on the task of abstractive summarization.
Sparsifying Transformer Models with Trainable Representation Pooling
We propose a novel method to sparsify attention in the Transformer model by
learning to select the most-informative token representations during the
training process, thus focusing on task-specific parts of the input. A
reduction of quadratic time and memory complexity to sublinear was achieved due
to a robust trainable top-k operator. For example, our experiments on a
challenging summarization task of long documents show that our method is over 3
times faster and up to 16 times more memory efficient while significantly
outperforming both dense and state-of-the-art sparse transformer models. The
method can be effortlessly applied to many models used in NLP and CV,
simultaneously with other improvements. Comment: Provided formal overview. Reevaluated with Google Research scrip
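The pooling idea in this abstract, keeping only the most informative token representations, can be illustrated with a plain hard top-k selection. This is only a sketch under stated assumptions: the function name `top_k_pool` and the precomputed `scores` are hypothetical, and a hard selection like this is not differentiable, so it does not reproduce the trainable top-k operator the authors describe.

```python
def top_k_pool(tokens, scores, k):
    """Keep the k highest-scoring tokens, preserving their original order.

    `scores` stands in for a learned per-token informativeness score
    (an assumption; the abstract does not say how scores are produced).
    """
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    k = min(k, len(tokens))
    # Indices of the k largest scores, then restored to document order.
    best = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    keep = sorted(best)
    return [tokens[i] for i in keep], keep


pooled, kept = top_k_pool(["a", "b", "c", "d"], [0.1, 0.9, 0.3, 0.8], k=2)
```

Later layers would then attend over only the k kept representations rather than the full sequence, which is where the savings in time and memory would come from.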
Retrieval-based Goal-Oriented Dialogue Generation
Most research on dialogue has focused either on dialogue generation for
open-ended chit-chat or on state tracking for goal-directed dialogue. In this
work, we explore a hybrid approach to goal-oriented dialogue generation that
combines retrieval from past history with a hierarchical, neural
encoder-decoder architecture. We evaluate this approach in the customer support
domain using the MultiWOZ dataset (Budzianowski et al., 2018). We show that
adding this retrieval step to a hierarchical, neural encoder-decoder
architecture leads to significant improvements, including responses that are
rated more appropriate and fluent by human evaluators. Finally, we compare our
retrieval-based model to various semantically conditioned models explicitly
using past dialog act information, and find that our proposed model is
competitive with the current state of the art (Chen et al., 2019), while not
requiring explicit labels about past machine acts.
Graphs in clusters: a hybrid approach to unsupervised extractive long document summarization using language models
Effective summarization of long documents is a challenging task. Graph-based and cluster-based methods stand out as effective unsupervised solutions to this challenge. Graph-based unsupervised methods are widely employed for summarization due to their success in identifying relationships within documents. Cluster-based methods excel at minimizing redundancy by grouping similar content together before generating a concise summary. This paper therefore merges cluster-based and graph-based methods, applying language models to unsupervised extractive summarization of long documents. The approach extracts key information while simultaneously minimizing redundancy. First, we use BERT-based sentence embeddings to create sentence clusters with k-means clustering, and select the optimal number of clusters with the elbow method, so that sentences are grouped by semantic similarity. Then, the TextRank algorithm is applied within each cluster to rank sentences by importance and representativeness. Finally, the total similarity score of the graph is used to rank the clusters and eliminate less important sentence groups. Our method achieves comparable or better summary quality and reduced redundancy compared to both individual cluster-based and graph-based methods, as well as other supervised and unsupervised baseline models, across diverse datasets.
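The pipeline described in this abstract (cluster, rank within clusters, rank the clusters) might be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes sentence embeddings and k-means cluster labels have already been computed upstream, uses cosine similarity as the edge weight, and reads "total similarity score of the graph" as the sum of pairwise similarities inside a cluster. All function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def textrank(vectors, damping=0.85, iters=50):
    """Weighted PageRank over a fully connected sentence-similarity graph."""
    n = len(vectors)
    if n == 1:
        return [1.0]
    # Edge weights: cosine similarity, negatives clipped, no self-loops.
    w = [[max(cosine(vectors[i], vectors[j]), 0.0) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    out = [sum(row) for row in w]
    scores = [1.0 / n] * n
    for _ in range(iters):
        scores = [
            (1 - damping) / n
            + damping * sum(w[j][i] / out[j] * scores[j]
                            for j in range(n) if out[j] > 0)
            for i in range(n)
        ]
    return scores


def summarize(vectors, labels, keep_clusters, per_cluster=1):
    """Return indices of selected sentences, in document order."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    def total_similarity(members):
        # Assumed cluster score: sum of pairwise similarities in the cluster.
        return sum(cosine(vectors[a], vectors[b])
                   for i, a in enumerate(members) for b in members[i + 1:])

    ranked = sorted(clusters, key=lambda c: total_similarity(clusters[c]),
                    reverse=True)[:keep_clusters]
    picked = []
    for c in ranked:
        members = clusters[c]
        scores = textrank([vectors[i] for i in members])
        best = sorted(range(len(members)), key=lambda k: scores[k],
                      reverse=True)[:per_cluster]
        picked.extend(members[k] for k in best)
    return sorted(picked)


# Toy 2-D "embeddings": three similar sentences and two others.
vectors = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.3], [0.0, 1.0], [0.1, 0.9]]
labels = [0, 0, 0, 1, 1]
selected = summarize(vectors, labels, keep_clusters=1)
```

Note that an unnormalized sum of similarities favors larger clusters. Whether the paper normalizes this score is not stated in the abstract, so the sketch leaves it unnormalized.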