526 research outputs found

    Abstractive and mixed summarization for long-single documents

    Full text link
    The lack of diversity in the datasets available for automatic summarization of documents has meant that the vast majority of neural models for automatic summarization have been trained with news articles. These datasets are relatively small, with an average size of about 600 words, and the models trained with such data sets see their performance limited to short documents. In order to surmount this problem, this paper uses scientific papers as the dataset on which different models are trained. These models have been chosen based on their performance on the CNN/Daily Mail data set, so that the highest ranked model of each architectural variant is selected. In this work, six different models are compared, two with an RNN architecture, one with a CNN architecture, two with a Transformer architecture and one with a Transformer architecture combined with reinforcement learning. The results from this work show that those models that use a hierarchical encoder to model the structure of the document has a better performance than the rest

    A Deep Reinforced Model for Abstractive Summarization

    Full text link
    Attentional, RNN-based encoder-decoder models for abstractive summarization have achieved good performance on short input and output sequences. For longer documents and summaries however these models often include repetitive and incoherent phrases. We introduce a neural network model with a novel intra-attention that attends over the input and continuously generated output separately, and a new training method that combines standard supervised word prediction and reinforcement learning (RL). Models trained only with supervised learning often exhibit "exposure bias" - they assume ground truth is provided at each step during training. However, when standard word prediction is combined with the global sequence prediction training of RL the resulting summaries become more readable. We evaluate this model on the CNN/Daily Mail and New York Times datasets. Our model obtains a 41.16 ROUGE-1 score on the CNN/Daily Mail dataset, an improvement over previous state-of-the-art models. Human evaluation also shows that our model produces higher quality summaries

    Abstractive Summarization of Reddit Posts with Multi-level Memory Networks

    Full text link
    We address the problem of abstractive summarization in two directions: proposing a novel dataset and a new model. First, we collect Reddit TIFU dataset, consisting of 120K posts from the online discussion forum Reddit. We use such informal crowd-generated posts as text source, in contrast with existing datasets that mostly use formal documents as source such as news articles. Thus, our dataset could less suffer from some biases that key sentences usually locate at the beginning of the text and favorable summary candidates are already inside the text in similar forms. Second, we propose a novel abstractive summarization model named multi-level memory networks (MMN), equipped with multi-level memory to store the information of text from different levels of abstraction. With quantitative evaluation and user studies via Amazon Mechanical Turk, we show the Reddit TIFU dataset is highly abstractive and the MMN outperforms the state-of-the-art summarization models.Comment: Published in NAACL-HLT 2019 (Oral

    A Divide-and-Conquer Approach to the Summarization of Long Documents

    Full text link
    We present a novel divide-and-conquer method for the neural summarization of long documents. Our method exploits the discourse structure of the document and uses sentence similarity to split the problem into an ensemble of smaller summarization problems. In particular, we break a long document and its summary into multiple source-target pairs, which are used for training a model that learns to summarize each part of the document separately. These partial summaries are then combined in order to produce a final complete summary. With this approach we can decompose the problem of long document summarization into smaller and simpler problems, reducing computational complexity and creating more training examples, which at the same time contain less noise in the target summaries compared to the standard approach. We demonstrate that this approach paired with different summarization models, including sequence-to-sequence RNNs and Transformers, can lead to improved summarization performance. Our best models achieve results that are on par with the state-of-the-art in two two publicly available datasets of academic articles

    Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting

    Full text link
    Inspired by how humans summarize long documents, we propose an accurate and fast summarization model that first selects salient sentences and then rewrites them abstractively (i.e., compresses and paraphrases) to generate a concise overall summary. We use a novel sentence-level policy gradient method to bridge the non-differentiable computation between these two neural networks in a hierarchical way, while maintaining language fluency. Empirically, we achieve the new state-of-the-art on all metrics (including human evaluation) on the CNN/Daily Mail dataset, as well as significantly higher abstractiveness scores. Moreover, by first operating at the sentence-level and then the word-level, we enable parallel decoding of our neural generative model that results in substantially faster (10-20x) inference speed as well as 4x faster training convergence than previous long-paragraph encoder-decoder models. We also demonstrate the generalization of our model on the test-only DUC-2002 dataset, where we achieve higher scores than a state-of-the-art model.Comment: ACL 2018 (17 pages

    What comes next? Extractive summarization by next-sentence prediction

    Full text link
    Existing approaches to automatic summarization assume that a length limit for the summary is given, and view content selection as an optimization problem to maximize informativeness and minimize redundancy within this budget. This framework ignores the fact that human-written summaries have rich internal structure which can be exploited to train a summarization system. We present NEXTSUM, a novel approach to summarization based on a model that predicts the next sentence to include in the summary using not only the source article, but also the summary produced so far. We show that such a model successfully captures summary-specific discourse moves, and leads to better content selection performance, in addition to automatically predicting how long the target summary should be. We perform experiments on the New York Times Annotated Corpus of summaries, where NEXTSUM outperforms lead and content-model summarization baselines by significant margins. We also show that the lengths of summaries produced by our system correlates with the lengths of the human-written gold standards

    Closed-Book Training to Improve Summarization Encoder Memory

    Full text link
    A good neural sequence-to-sequence summarization model should have a strong encoder that can distill and memorize the important information from long input texts so that the decoder can generate salient summaries based on the encoder's memory. In this paper, we aim to improve the memorization capabilities of the encoder of a pointer-generator model by adding an additional 'closed-book' decoder without attention and pointer mechanisms. Such a decoder forces the encoder to be more selective in the information encoded in its memory state because the decoder can't rely on the extra information provided by the attention and possibly copy modules, and hence improves the entire model. On the CNN/Daily Mail dataset, our 2-decoder model outperforms the baseline significantly in terms of ROUGE and METEOR metrics, for both cross-entropy and reinforced setups (and on human evaluation). Moreover, our model also achieves higher scores in a test-only DUC-2002 generalizability setup. We further present a memory ability test, two saliency metrics, as well as several sanity-check ablations (based on fixed-encoder, gradient-flow cut, and model capacity) to prove that the encoder of our 2-decoder model does in fact learn stronger memory representations than the baseline encoder.Comment: EMNLP 2018 (16 pages

    CQASUMM: Building References for Community Question Answering Summarization Corpora

    Full text link
    Community Question Answering forums such as Quora, Stackoverflow are rich knowledge resources, often catering to information on topics overlooked by major search engines. Answers submitted to these forums are often elaborated, contain spam, are marred by slurs and business promotions. It is difficult for a reader to go through numerous such answers to gauge community opinion. As a result summarization becomes a prioritized task for CQA forums. While a number of efforts have been made to summarize factoid CQA, little work exists in summarizing non-factoid CQA. We believe this is due to the lack of a considerably large, annotated dataset for CQA summarization. We create CQASUMM, the first huge annotated CQA summarization dataset by filtering the 4.4 million Yahoo! Answers L6 dataset. We sample threads where the best answer can double up as a reference summary and build hundred word summaries from them. We treat other answers as candidates documents for summarization. We provide a script to generate the dataset and introduce the new task of Community Question Answering Summarization. Multi document summarization has been widely studied with news article datasets, especially in the DUC and TAC challenges using news corpora. However documents in CQA have higher variance, contradicting opinion and lesser amount of overlap. We compare the popular multi document summarization techniques and evaluate their performance on our CQA corpora. We look into the state-of-the-art and understand the cases where existing multi document summarizers (MDS) fail. We find that most MDS workflows are built for the entirely factual news corpora, whereas our corpus has a fair share of opinion based instances too. We therefore introduce OpinioSumm, a new MDS which outperforms the best baseline by 4.6% w.r.t ROUGE-1 score.Comment: Accepted in CODS-COMAD'19 , Jan 3-5, WB, Indi

    PerSum: Novel Systems for Document Summarization in Persian

    Full text link
    In this paper we explore the problem of document summarization in Persian language from two distinct angles. In our first approach, we modify a popular and widely cited Persian document summarization framework to see how it works on a realistic corpus of news articles. Human evaluation on generated summaries shows that graph-based methods perform better than the modified systems. We carry this intuition forward in our second approach, and probe deeper into the nature of graph-based systems by designing several summarizers based on centrality measures. Ad hoc evaluation using ROUGE score on these summarizers suggests that there is a small class of centrality measures that perform better than three strong unsupervised baselines.Comment: 42 pages, 9 figure

    A novel repetition normalized adversarial reward for headline generation

    Full text link
    While reinforcement learning can effectively improve language generation models, it often suffers from generating incoherent and repetitive phrases \cite{paulus2017deep}. In this paper, we propose a novel repetition normalized adversarial reward to mitigate these problems. Our repetition penalized reward can greatly reduce the repetition rate and adversarial training mitigates generating incoherent phrases. Our model significantly outperforms the baseline model on ROUGE-1\,(+3.24), ROUGE-L\,(+2.25), and a decreased repetition-rate (-4.98\%).Comment: Accepted by ICASSP 201
    • …
    corecore