74 research outputs found

    Adapting the Neural Encoder-Decoder Framework from Single to Multi-Document Summarization

    Full text link
    Generating a text abstract from a set of documents remains a challenging task. The neural encoder-decoder framework has recently been exploited to summarize single documents, but its success can in part be attributed to the availability of large parallel data automatically acquired from the Web. In contrast, parallel data for multi-document summarization are scarce and costly to obtain. There is a pressing need to adapt an encoder-decoder model trained on single-document summarization data to work with multiple-document input. In this paper, we present an initial investigation into a novel adaptation method. It exploits the maximal marginal relevance method to select representative sentences from multi-document input, and leverages an abstractive encoder-decoder model to fuse disparate sentences to an abstractive summary. The adaptation method is robust and itself requires no training data. Our system compares favorably to state-of-the-art extractive and abstractive approaches judged by automatic metrics and human assessors.Comment: 11 page

    A survey on opinion summarization technique s for social media

    Get PDF
    The volume of data on the social media is huge and even keeps increasing. The need for efficient processing of this extensive information resulted in increasing research interest in knowledge engineering tasks such as Opinion Summarization. This survey shows the current opinion summarization challenges for social media, then the necessary pre-summarization steps like preprocessing, features extraction, noise elimination, and handling of synonym features. Next, it covers the various approaches used in opinion summarization like Visualization, Abstractive, Aspect based, Query-focused, Real Time, Update Summarization, and highlight other Opinion Summarization approaches such as Contrastive, Concept-based, Community Detection, Domain Specific, Bilingual, Social Bookmarking, and Social Media Sampling. It covers the different datasets used in opinion summarization and future work suggested in each technique. Finally, it provides different ways for evaluating opinion summarization

    Question-driven text summarization with extractive-abstractive frameworks

    Get PDF
    Automatic Text Summarisation (ATS) is becoming increasingly important due to the exponential growth of textual content on the Internet. The primary goal of an ATS system is to generate a condensed version of the key aspects in the input document while minimizing redundancy. ATS approaches are extractive, abstractive, or hybrid. The extractive approach selects the most important sentences in the input document(s) and then concatenates them to form the summary. The abstractive approach represents the input document(s) in an intermediate form and then constructs the summary using different sentences than the originals. The hybrid approach combines both the extractive and abstractive approaches. The query-based ATS selects the information that is most relevant to the initial search query. Question-driven ATS is a technique to produce concise and informative answers to specific questions using a document collection. In this thesis, a novel hybrid framework is proposed for question-driven ATS taking advantage of extractive and abstractive summarisation mechanisms. The framework consists of complementary modules that work together to generate an effective summary: (1) discovering appropriate non-redundant sentences as plausible answers using a multi-hop question answering system based on a Convolutional Neural Network (CNN), multi-head attention mechanism and reasoning process; and (2) a novel paraphrasing Generative Adversarial Network (GAN) model based on transformers rewrites the extracted sentences in an abstractive setup. In addition, a fusing mechanism is proposed for compressing the sentence pairs selected by a next sentence prediction model in the paraphrased summary. Extensive experiments on various datasets are performed, and the results show the model can outperform many question-driven and query-based baseline methods. The proposed model is adaptable to generate summaries for the questions in the closed domain and open domain. An online summariser demo is designed based on the proposed model for the industry use to process the technical text

    Feature-aware conditional GAN for category text generation

    Full text link
    Category text generation receives considerable attentions since it is beneficial for various natural language processing tasks. Recently, the generative adversarial network (GAN) has attained promising performance in text generation, attributed to its adversarial training process. However, there are several issues in text GANs, including discreteness, training instability, mode collapse, lack of diversity and controllability etc. To address these issues, this paper proposes a novel GAN framework, the feature-aware conditional GAN (FA-GAN), for controllable category text generation. In FA-GAN, the generator has a sequence-to-sequence structure for improving sentence diversity, which consists of three encoders including a special feature-aware encoder and a category-aware encoder, and one relational-memory-core-based decoder with the Gumbel SoftMax activation function. The discriminator has an additional category classification head. To generate sentences with specified categories, the multi-class classification loss is supplemented in the adversarial training. Comprehensive experiments have been conducted, and the results show that FA-GAN consistently outperforms 10 state-of-the-art text generation approaches on 6 text classification datasets. The case study demonstrates that the synthetic sentences generated by FA-GAN can match the required categories and are aware of the features of conditioned sentences, with good readability, fluency, and text authenticity.Comment: 27 pages, 8 figure
    • …
    corecore