ScisummNet: A Large Annotated Corpus and Content-Impact Models for Scientific Paper Summarization with Citation Networks
Scientific article summarization is challenging: large, annotated corpora are
not available, and the summary should ideally include the article's impacts on the
research community. This paper provides novel solutions to these two
challenges. We 1) develop and release the first large-scale manually-annotated
corpus for scientific papers (on computational linguistics) by enabling faster
annotation, and 2) propose summarization methods that integrate the authors'
original highlights (abstract) and the article's actual impacts on the
community (citations), to create comprehensive, hybrid summaries. We conduct
experiments to demonstrate the efficacy of our corpus in training data-driven
models for scientific paper summarization and the advantage of our hybrid
summaries over abstracts and traditional citation-based summaries. Our large
annotated corpus and hybrid methods provide a new framework for scientific
paper summarization research.
Comment: AAAI 201