8 research outputs found
Single-Document and Multi-Document Summarization Techniques for Email Threads Using Sentence Compression
We present two approaches to email thread summarization: Collective Message Summarization (CMS) applies a multi-document summarization approach, while Individual Message Summarization (IMS) treats the problem as a sequence of single-document summarization tasks. Both approaches are implemented in our general framework driven by sentence compression. Instead of a purely extractive approach, we employ linguistic and statistical methods to generate multiple compressions, and then select from those candidates to produce a final summary. We demonstrate our techniques on the Enron collection—a very challenging corpus because of the highly technical language. Results suggest that CMS represents a better approach and additional findings pave the way for future explorations.
Second Author Affiliation / Address line 1 Affiliation / Address line 2
This document contains the instructions for preparing a camera-ready manuscript for the proceedings of ACL-2014. The document itself conforms to its own specifications, and is therefore an example of what your manuscript should look like. These instructions should be used for both papers submitted for review and for final versions of accepted papers. Authors are asked to conform to all the directions reported in this document.
Unsupervised Topic Modelling for Multi-Party Spoken Discourse
We present a method for unsupervised topic modelling which adapts methods used in document classification (Blei et al., 2003; Griffiths and Steyvers, 2004) to unsegmented multi-party discourse transcripts. We show how Bayesian inference in this generative model can be used to simultaneously address the problems of topic segmentation and topic identification: it automatically segments multi-party meetings into topically coherent segments, with performance that compares well with previous unsupervised segmentation-only methods (Galley et al., 2003), while simultaneously extracting topics that rate highly when assessed for coherence by human judges. We also show that this method appears robust in the face of off-topic dialogue and speech recognition errors.
