340 research outputs found

    Contextualizing Citations for Scientific Summarization using Word Embeddings and Domain Knowledge

    Citation texts are sometimes not very informative, or in some cases inaccurate, by themselves; they need the appropriate context from the referenced paper to reflect its exact contributions. To address this problem, we propose an unsupervised model that uses distributed representations of words as well as domain knowledge to extract the appropriate context from the reference paper. Evaluation results show that our model significantly outperforms the state of the art. We furthermore demonstrate how an effective contextualization method improves citation-based summarization of scientific articles. Comment: SIGIR 201

    Data Mining Oriented Automatic Scientific Documents Summarization

    The scientific research process usually begins with an examination of the state of the art, which may include voluminous publications. Summarizing scientific articles can assist researchers by speeding up the research process. Summarizing scientific articles differs from summarizing general text because of their specific structure and the inclusion of cited sentences. Most of the important information in scientific articles is presented in tables, statistics, and algorithm pseudocode. These features, however, rarely appear in general text. Therefore, a number of methods that take the structure of a scientific article into account have been suggested to improve the quality of the produced summary. This paper uses clustering algorithms to handle the CL-SciSumm 2020 and LongSumm 2020 tasks for summarization of scientific documents. Three well-known clustering algorithms are employed to tackle these tasks, and several sentence scoring functions, together with textual deduction, are used to retrieve phrases from each cluster to generate the summary.

    End-to-end Training For Financial Report Summarization

    Quoted companies are requested to periodically publish financial reports in textual form. The annual financial reports typically include detailed financial and business information, thus giving relevant insights into company outlooks. However, a manual exploration of these financial reports can be very time consuming, since most of the available information can be deemed non-informative or redundant by expert readers. Hence, increasing research interest has been devoted to automatically extracting domain-specific summaries, which include only the most relevant information. This paper describes the SumTO system architecture, which addresses the Shared Task of the Financial Narrative Summarisation (FNS) 2020 contest. The main task objective is to automatically extract the most informative, domain-specific textual content from financial documents written in English. The aim is to create a summary of each company report covering all the business-relevant key points. To address this goal, we propose an end-to-end training method relying on Deep NLP techniques. The idea behind the system is to exploit the syntactic overlap between input sentences and ground-truth summaries to fine-tune pre-trained BERT embedding models, thus tailoring such models to the specific context. The achieved results confirm the effectiveness of the proposed method, especially when the goal is to select relatively long text snippets.
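    The abstract above does not say how the overlap between input sentences and ground-truth summaries is measured. As a rough illustration only, the sketch below assumes a plain unigram overlap (the share of a sentence's distinct tokens that also occur in the reference summary) and turns it into per-sentence relevance targets of the kind that could supervise fine-tuning. The function names, the tokenizer, and the 0.5 threshold are all hypothetical and are not taken from the SumTO system.

```python
import re


def tokens(text):
    """Lowercase alphanumeric tokens; a deliberately simple tokenizer (assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(sentence, summary):
    """Fraction of the sentence's distinct tokens that also occur in the summary."""
    sentence_tokens = set(tokens(sentence))
    if not sentence_tokens:
        return 0.0
    summary_tokens = set(tokens(summary))
    return len(sentence_tokens & summary_tokens) / len(sentence_tokens)


def label_sentences(sentences, summary, threshold=0.5):
    """Return (sentence, score, is_relevant) triples.

    The threshold is a hypothetical cut-off chosen for illustration.
    """
    labelled = []
    for sentence in sentences:
        score = overlap_score(sentence, summary)
        labelled.append((sentence, score, score >= threshold))
    return labelled


if __name__ == "__main__":
    report = ["Revenue grew strongly", "The weather was nice"]
    gold = "Revenue grew by ten percent"
    for sentence, score, relevant in label_sentences(report, gold):
        print(f"{score:.2f} {relevant} {sentence}")
```

    The scores could serve as regression targets, or the thresholded labels as classification targets, for a sentence-level model. A recall-oriented or n-gram based measure would be an equally plausible stand-in.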