Text Summarization Techniques: A Brief Survey
In recent years, there has been an explosion in the amount of text data from a
variety of sources. This volume of text is an invaluable source of information
and knowledge which needs to be effectively summarized to be useful. In this
review, the main approaches to automatic text summarization are described. We
review the different processes for summarization and describe the effectiveness
and shortcomings of the different methods.
Comment: Some of references format have update
A Supervised Approach to Extractive Summarisation of Scientific Papers
Automatic summarisation is a popular approach to reduce a document to its
main arguments. Recent research in the area has focused on neural approaches to
summarisation, which can be very data-hungry. However, few large datasets exist
and none for the traditionally popular domain of scientific publications, which
opens up challenging research avenues centered on encoding large, complex
documents. In this paper, we introduce a new dataset for summarisation of
computer science publications by exploiting a large resource of author provided
summaries and show straightforward ways of extending it further. We develop
models on the dataset making use of both neural sentence encoding and
traditionally used summarisation features and show that models which encode
sentences as well as their local and global context perform best, significantly
outperforming well-established baseline methods.
Comment: 11 pages, 6 figures
Hierarchical Catalogue Generation for Literature Review: A Benchmark
Multi-document scientific summarization can extract and organize important
information from a large collection of papers and has recently attracted
widespread attention. However, existing efforts focus on producing lengthy
overviews lacking a clear and logical hierarchy. To alleviate this problem, we
present an atomic and challenging task named Hierarchical Catalogue Generation
for Literature Review (HiCatGLR), which aims to generate a hierarchical
catalogue for a review paper given various references. We carefully construct a
novel English Hierarchical Catalogues of Literature Reviews Dataset (HiCaD)
with 13.8k literature review catalogues and 120k reference papers, where we
benchmark diverse experiments via the end-to-end and pipeline methods. To
accurately assess the model performance, we design evaluation metrics for
similarity to ground truth from semantics and structure. Besides, our extensive
analyses verify the high quality of our dataset and the effectiveness of our
evaluation metrics. Furthermore, we discuss potential directions for this task
to motivate future research.