15 research outputs found

    Extractive Research Slide Generation Using Windowed Labeling Ranking

    Get PDF
    Presentation slides generated from original research papers provide an efficient form to present research innovations. Manually generating presentation slides is labor-intensive. We propose a method to automatically generates slides for scientific articles based on a corpus of 5000 paper-slide pairs compiled from conference proceedings websites. The sentence labeling module of our method is based on SummaRuNNer, a neural sequence model for extractive summarization. Instead of ranking sentences based on semantic similarities in the whole document, our algorithm measures the importance and novelty of sentences by combining semantic and lexical features within a sentence window. Our method outperforms several baseline methods including SummaRuNNer by a significant margin in terms of ROUGE score

    Towards Personalized and Human-in-the-Loop Document Summarization

    Full text link
    The ubiquitous availability of computing devices and the widespread use of the internet have generated a large amount of data continuously. Therefore, the amount of available information on any given topic is far beyond humans' processing capacity to properly process, causing what is known as information overload. To efficiently cope with large amounts of information and generate content with significant value to users, we require identifying, merging and summarising information. Data summaries can help gather related information and collect it into a shorter format that enables answering complicated questions, gaining new insight and discovering conceptual boundaries. This thesis focuses on three main challenges to alleviate information overload using novel summarisation techniques. It further intends to facilitate the analysis of documents to support personalised information extraction. This thesis separates the research issues into four areas, covering (i) feature engineering in document summarisation, (ii) traditional static and inflexible summaries, (iii) traditional generic summarisation approaches, and (iv) the need for reference summaries. We propose novel approaches to tackle these challenges, by: i)enabling automatic intelligent feature engineering, ii) enabling flexible and interactive summarisation, iii) utilising intelligent and personalised summarisation approaches. The experimental results prove the efficiency of the proposed approaches compared to other state-of-the-art models. We further propose solutions to the information overload problem in different domains through summarisation, covering network traffic data, health data and business process data.Comment: PhD thesi

    On Semi-Automatic Creation of Dataset for Multi-Document Automatic Summarization of News Articles and Forum Threads

    Get PDF
    The problem of semi-automatic dataset creation for multi-document summarization and forum threads summarization is analyzed. Aspects specific to Slavic languages are underlined. Dedicated algorithms for this purpose were designed and tested. Due to not smooth nature of the optimization problem genetic algorithms were suggested. Some new and interesting results are received

    Dual contextual module for neural machine translation

    Get PDF

    A Spoken Dialogue System for Enabling Comfortable Information Acquisition and Consumption

    Get PDF
    早大学位記番号:新8137早稲田大
    corecore