Generating Concise and Readable Summaries of XML Documents
XML has become the de-facto standard for data representation and exchange,
resulting in large scale repositories and warehouses of XML data. In order for
users to understand and explore these large collections, a summarized, bird's
eye view of the available data is a necessity. In this paper, we are interested
in semantic XML document summaries which present the "important" information
available in an XML document to the user. In the best case, such a summary is a
concise replacement for the original document itself. At the other extreme, it
should at least help the user make an informed choice as to the relevance of
the document to his needs. In this paper, we address the two main issues which
arise in producing such meaningful and concise summaries: i) which tags or text
units are important and should be included in the summary, ii) how to generate
summaries of different sizes. We conduct user
studies with different real-life datasets and show that our methods are useful
and effective in practice.
Adapting the Neural Encoder-Decoder Framework from Single to Multi-Document Summarization
Generating a text abstract from a set of documents remains a challenging
task. The neural encoder-decoder framework has recently been exploited to
summarize single documents, but its success can in part be attributed to the
availability of large parallel data automatically acquired from the Web. In
contrast, parallel data for multi-document summarization are scarce and costly
to obtain. There is a pressing need to adapt an encoder-decoder model trained
on single-document summarization data to work with multiple-document input. In
this paper, we present an initial investigation into a novel adaptation method.
It exploits the maximal marginal relevance method to select representative
sentences from multi-document input, and leverages an abstractive
encoder-decoder model to fuse disparate sentences to an abstractive summary.
The adaptation method is robust and itself requires no training data. Our
system compares favorably to state-of-the-art extractive and abstractive
approaches as judged by automatic metrics and human assessors. Comment: 11 page
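
The sentence-selection step described in the abstract above relies on maximal marginal relevance (MMR), which greedily picks sentences that are relevant to a query while penalizing overlap with sentences already chosen. The following is a minimal sketch of generic MMR, not the paper's implementation: the bag-of-words cosine similarity, the function names (`bow`, `cosine`, `mmr_select`), and the parameters `k` and `lam` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def bow(text):
    """Lowercased bag-of-words term counts (an assumed, very simple representation)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def mmr_select(sentences, query, k=2, lam=0.7):
    """Greedily select up to k sentences by MMR.

    Each candidate is scored as
        lam * sim(candidate, query) - (1 - lam) * max sim(candidate, selected)
    so higher lam favors relevance and lower lam favors diversity.
    """
    vecs = [bow(s) for s in sentences]
    q = bow(query)
    selected = []
    candidates = list(range(len(sentences)))

    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(vecs[i], q)
            redundancy = max(
                (cosine(vecs[i], vecs[j]) for j in selected), default=0.0
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)

    return [sentences[i] for i in selected]
```

Calling `mmr_select(["cat mat", "cat mat", "dog"], "cat mat dog", k=2, lam=0.5)` skips the duplicate sentence in favor of the less relevant but novel one. In a pipeline like the one the abstract outlines, the selected sentences would then be passed to an abstractive encoder-decoder for fusion; that step is not sketched here.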
Deep Recurrent Generative Decoder for Abstractive Text Summarization
We propose a new framework for abstractive text summarization based on a
sequence-to-sequence oriented encoder-decoder model equipped with a deep
recurrent generative decoder (DRGN).
Latent structure information implied in the target summaries is learned based
on a recurrent latent random model for improving the summarization quality.
Neural variational inference is employed to address the intractable posterior
inference for the recurrent latent variables.
Abstractive summaries are generated based on both the generative latent
variables and the discriminative deterministic states.
Extensive experiments on benchmark datasets in different languages show
that DRGN improves on state-of-the-art methods. Comment: 10 pages, EMNLP 201
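
As background for the variational inference mentioned in the abstract above: the usual approach replaces the intractable posterior with a learned approximation and maximizes a lower bound on the log-likelihood. The generic single-variable form of that bound is shown below. The symbols ($y$ for the target summary, $z$ for the latent variable, $q_\phi$ for the approximate posterior, $p_\theta$ for the generative model) are notation assumed here rather than taken from the abstract, and a recurrent model would presumably condition each term on earlier time steps.

```latex
\log p_\theta(y) \;\ge\;
\mathbb{E}_{q_\phi(z \mid y)}\!\left[\log p_\theta(y \mid z)\right]
\;-\; D_{\mathrm{KL}}\!\left(q_\phi(z \mid y)\,\|\,p_\theta(z)\right)
```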
Controlling Output Length in Neural Encoder-Decoders
Neural encoder-decoder models have shown great success in many sequence
generation tasks. However, previous work has not investigated situations in
which we would like to control the length of encoder-decoder outputs. This
capability is crucial for applications such as text summarization, in which we
have to generate concise summaries with a desired length. In this paper, we
propose methods for controlling the output sequence length for neural
encoder-decoder models: two decoding-based methods and two learning-based
methods. Results show that our learning-based methods have the capability to
control length without degrading summary quality in a summarization task. Comment: 11 pages. To appear in EMNLP 201
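
The abstract above does not spell out its decoding-based methods, so the following is only one plausible illustration of the general idea: during greedy decoding, the end-of-sequence token is excluded until the desired length is reached, and decoding stops at that point. The `step_scores` callable, the `EOS` symbol, and `toy_model` are hypothetical stand-ins for a real encoder-decoder, not part of any library or of the paper.

```python
EOS = "</s>"  # assumed end-of-sequence symbol


def decode_fixed_length(step_scores, length):
    """Greedily decode at most `length` tokens, never emitting EOS early.

    step_scores: hypothetical callable mapping the current prefix (a list of
    tokens) to a dict of {token: log-probability} for the next step.
    """
    output = []
    while len(output) < length:
        scores = step_scores(output)
        # Drop EOS from the candidates so the model cannot stop early.
        allowed = {tok: s for tok, s in scores.items() if tok != EOS}
        if not allowed:
            break  # nothing but EOS is available; stop rather than emit it
        output.append(max(allowed, key=allowed.get))
    # Reaching `length` plays the role of a forced EOS.
    return output


def toy_model(prefix):
    """Toy scorer that would prefer to stop after one token if allowed."""
    if not prefix:
        return {"b": -0.5, EOS: -2.0}
    return {"a": -1.0, EOS: -0.1}
```

With the toy scorer, `decode_fixed_length(toy_model, 3)` returns three tokens even though the model would prefer to stop after the first. A learning-based method, by contrast, would feed the desired length to the model during training; that is not sketched here.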