16 research outputs found

    On Extractive and Abstractive Neural Document Summarization with Transformer Language Models

    We present a method for producing abstractive summaries of long documents that exceed several thousand words. Before generating a summary, we perform a simple extractive step whose output is used to condition a transformer language model on relevant information. We show that this extractive step significantly improves summarization results. We also show that this approach produces more abstractive summaries than prior work that employs a copy mechanism, while still achieving higher ROUGE scores. Note: the abstract above was not written by the authors; it was generated by one of the models presented in this paper.
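
    A minimal sketch of the extract-then-abstract idea described above: pick the most salient sentences first, then hand only those to an abstractive summarizer. The TF-IDF extractor and the generic Hugging Face summarization pipeline are illustrative assumptions; the paper itself conditions a transformer language model on the extracted text.

        # Illustrative extract-then-abstract pipeline (not the authors' code).
        # Assumes scikit-learn and Hugging Face transformers are installed.
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.metrics.pairwise import cosine_similarity
        from transformers import pipeline

        def extract_salient_sentences(sentences, k=10):
            # Rank sentences by TF-IDF similarity to the whole document and keep
            # the top k, preserving their original order.
            tfidf = TfidfVectorizer().fit_transform(sentences + [" ".join(sentences)])
            doc_vec = tfidf[len(sentences)]
            scores = cosine_similarity(tfidf[: len(sentences)], doc_vec).ravel()
            top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
            return [sentences[i] for i in sorted(top)]

        def extract_then_abstract(sentences):
            extracted = " ".join(extract_salient_sentences(sentences))
            summarizer = pipeline("summarization")  # any pretrained abstractive model
            return summarizer(extracted, max_length=150, min_length=40)[0]["summary_text"]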

    Abstractive Summarization with Efficient Transformer Based Approach

    With the rapid proliferation of online data, one of the most significant research problems is how to condense a document while preserving its essential information. Recovering meaningful knowledge in an acceptable time requires this information to be summarized; this task is called text summarization. Summarization comes in two forms: extractive and abstractive. In recent years, abstractive text summarization has become increasingly popular. Abstractive Text Summarization (ATS) aims to extract the most vital content from a text corpus and condense it into a shorter text while maintaining its meaning and its semantic and grammatical accuracy. Deep learning architectures have opened a new phase in natural language processing (NLP), and many studies have demonstrated the competitive performance of architectures such as recurrent neural networks (RNNs), attention mechanisms and LSTMs. The Transformer, a more recently introduced model, relies entirely on the attention mechanism. In this paper, abstractive text summarization is performed with a basic Transformer model, a Transformer with a pointer-generator network (PGN) and coverage mechanism, a Fastformer architecture, and a Fastformer with a pointer-generator network (PGN) and coverage mechanism. We compare these architectures after careful and thorough hyperparameter tuning. In our experiments, the standard CNN/DM dataset is used to evaluate these architectures on the task of abstractive summarization.
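
    As a point of reference for the pointer-generator and coverage components mentioned above, the standard formulation (assumed here; the paper may differ in its details) mixes generating from the vocabulary with copying from the source, and penalizes repeatedly attending to the same positions:

        P(w) = p_{\mathrm{gen}}\, P_{\mathrm{vocab}}(w) + (1 - p_{\mathrm{gen}}) \sum_{i\,:\, w_i = w} a_i^t
        c^t = \sum_{t'=0}^{t-1} a^{t'}, \qquad \mathrm{covloss}_t = \sum_i \min(a_i^t, c_i^t)

    where a^t is the attention distribution at decoding step t, p_gen is a learned switch between generation and copying, and c^t is the coverage vector accumulating past attention.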

    Sparsifying Transformer Models with Trainable Representation Pooling

    We propose a novel method to sparsify attention in the Transformer model by learning to select the most informative token representations during training, thus focusing on task-specific parts of the input. The robust trainable top-k operator reduces the quadratic time and memory complexity to sublinear. For example, our experiments on a challenging long-document summarization task show that our method is over 3 times faster and up to 16 times more memory efficient, while significantly outperforming both dense and state-of-the-art sparse transformer models. The method can be effortlessly applied to many models used in NLP and CV, simultaneously with other improvements. Comment: Provided formal overview. Reevaluated with Google Research script.
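
    A simplified sketch of scorer-based token pooling, to show where such a selection step sits in a model. The paper's contribution is a trainable (differentiable) top-k operator; the hard top-k below is a non-differentiable stand-in, so it is illustrative only.

        # Hard top-k token pooling (illustrative stand-in for a trainable top-k).
        import torch
        import torch.nn as nn

        class TopKPooling(nn.Module):
            def __init__(self, hidden_size: int, k: int):
                super().__init__()
                self.scorer = nn.Linear(hidden_size, 1)  # learned per-token salience score
                self.k = k

            def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
                # hidden_states: (batch, seq_len, hidden_size)
                scores = self.scorer(hidden_states).squeeze(-1)            # (batch, seq_len)
                topk = torch.topk(scores, k=min(self.k, scores.size(1)), dim=1).indices
                topk, _ = torch.sort(topk, dim=1)                          # keep source order
                idx = topk.unsqueeze(-1).expand(-1, -1, hidden_states.size(-1))
                return torch.gather(hidden_states, 1, idx)                 # (batch, k, hidden)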

    Retrieval-based Goal-Oriented Dialogue Generation

    Most research on dialogue has focused either on dialogue generation for open-ended chit-chat or on state tracking for goal-directed dialogue. In this work, we explore a hybrid approach to goal-oriented dialogue generation that combines retrieval from past history with a hierarchical, neural encoder-decoder architecture. We evaluate this approach in the customer support domain using the MultiWOZ dataset (Budzianowski et al., 2018). We show that adding this retrieval step to a hierarchical, neural encoder-decoder architecture leads to significant improvements, including responses that are rated more appropriate and fluent by human evaluators. Finally, we compare our retrieval-based model to various semantically conditioned models that explicitly use past dialogue act information, and find that our proposed model is competitive with the current state of the art (Chen et al., 2019), while not requiring explicit labels about past machine acts.
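
    A hedged sketch of the retrieval step only: embed the current dialogue context, find the nearest past context, and feed its response to the generator alongside the history. The sentence-transformers encoder and the flat nearest-neighbor search are assumptions for illustration; the paper uses a hierarchical neural encoder-decoder, which this sketch does not implement.

        # Illustrative retrieval step for retrieval-augmented response generation.
        import numpy as np
        from sentence_transformers import SentenceTransformer

        encoder = SentenceTransformer("all-MiniLM-L6-v2")  # any sentence encoder works

        def build_index(past_contexts, past_responses):
            vectors = encoder.encode(past_contexts, normalize_embeddings=True)
            return vectors, past_responses

        def retrieve(context, vectors, responses):
            q = encoder.encode([context], normalize_embeddings=True)[0]
            best = int(np.argmax(vectors @ q))   # cosine similarity via dot product
            return responses[best]

        def generator_input(context, vectors, responses):
            exemplar = retrieve(context, vectors, responses)
            # The retrieved exemplar conditions the decoder alongside the dialogue history.
            return f"history: {context} retrieved: {exemplar}"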

    Graphs in clusters: a hybrid approach to unsupervised extractive long document summarization using language models

    Effective summarization of long documents is a challenging task. Graph-based and cluster-based methods stand out as effective unsupervised solutions to this challenge: graph-based unsupervised methods are widely employed because they succeed in identifying relationships within documents, while cluster-based methods excel at minimizing redundancy by grouping similar content together before generating a concise summary. This paper therefore merges cluster-based and graph-based methods, applying language models for unsupervised extractive summarization of long documents. The approach extracts key information while minimizing redundancy. First, we use BERT-based sentence embeddings to create sentence clusters with k-means clustering, selecting the optimal number of clusters with the elbow method so that sentences are grouped by semantic similarity. Then, the TextRank algorithm is run within each cluster to rank sentences by importance and representativeness. Finally, the total similarity score of each cluster's graph is used to rank the clusters and eliminate less important sentence groups. Our method achieves comparable or better summary quality and reduced redundancy compared with both individual cluster-based and graph-based methods, as well as other supervised and unsupervised baseline models, across diverse datasets.
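
    A rough sketch of the cluster-then-graph pipeline described above, under stated assumptions: sentence-transformers for the BERT-based embeddings, scikit-learn for k-means, and networkx PageRank as the TextRank-style centrality. The number of clusters is fixed here instead of being chosen with the elbow method, and the scoring details are simplified relative to the paper.

        # Illustrative hybrid cluster + graph extractive summarizer (not the authors' code).
        import networkx as nx
        from sklearn.cluster import KMeans
        from sklearn.metrics.pairwise import cosine_similarity
        from sentence_transformers import SentenceTransformer

        def summarize(sentences, n_clusters=5, per_cluster=2, keep_clusters=3):
            model = SentenceTransformer("all-MiniLM-L6-v2")
            emb = model.encode(sentences)
            labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(emb)

            clusters = []
            for c in range(n_clusters):
                idx = [i for i, l in enumerate(labels) if l == c]
                if not idx:
                    continue
                sim = cosine_similarity(emb[idx])              # intra-cluster sentence graph
                rank = nx.pagerank(nx.from_numpy_array(sim))   # TextRank-style centrality
                clusters.append((float(sim.sum()), idx, rank))

            # Keep only the clusters whose graphs have the highest total similarity.
            clusters.sort(key=lambda t: t[0], reverse=True)
            picked = []
            for _, idx, rank in clusters[:keep_clusters]:
                order = sorted(range(len(idx)), key=lambda j: rank[j], reverse=True)
                picked.extend(idx[j] for j in order[:per_cluster])
            return [sentences[i] for i in sorted(picked)]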
