179 research outputs found

Mixed-source multi-document speech-to-text summarization

Author: Matos D. M. de.
Ribeiro R.
Publication venue: Coling 2008 Organizing Committee
Publication date: 01/01/2008

Speech-to-text summarization systems usually take as input the output of an automatic speech recognition (ASR) system that is affected by issues like speech recognition errors, disfluencies, or difficulties in the accurate identification of sentence boundaries. We propose the inclusion of related, solid background information to cope with the difficulties of summarizing spoken language and the use of multi-document summarization techniques in single document speech-to-text summarization. In this work, we explore the possibilities offered by phonetic information to select the background information and conduct a perceptual evaluation to better assess the relevance of the inclusion of that information. Results show that summaries generated using this approach are considerably better than those produced by an up-to-date latent semantic analysis (LSA) summarization method and suggest that humans prefer summaries restricted to the information conveyed in the input source.

Repositório Institucional do ISCTE-IUL

Automatic Summarization

Author: McKeown Kathleen
Nenkova Ani
Publication venue: ScholarlyCommons
Publication date: 01/06/2011

It has now been 50 years since the publication of Luhn’s seminal paper on automatic summarization. During these years the practical need for automatic summarization has become increasingly urgent and numerous papers have been published on the topic. As a result, it has become harder to find a single reference that gives an overview of past efforts or a complete view of summarization tasks and necessary system components. This article attempts to fill this void by providing a comprehensive overview of research in summarization, including the more traditional efforts in sentence extraction as well as the most novel recent approaches for determining important content, for domain- and genre-specific summarization, and for evaluation of summarization. We also discuss the challenges that remain open, in particular the need for language generation and deeper semantic understanding of language that would be necessary for future advances in the field.

ScholarlyCommons@Penn

Exploring the style-technique interaction in extractive summarization of broadcast news.

Author: Christensen Heidi
Gotoh Yoshihiko
Kolluru BalaKrishna
Renals Steve
Publication venue: IEEE Signal Processing Society Press
Publication date: 01/01/2003

In this paper we seek to explore the interaction between the style of a broadcast news story and its summarization technique. We report the performance of three different summarization techniques on broadcast news stories, which are split into planned speech and spontaneous speech. The initial results indicate that some summarization techniques work better for the documents with spontaneous speech than for those with planned speech. Even for human beings some documents are inherently difficult to summarize. We observe this correlation between degree of difficulty in summarizing and performance of the three automatic summarizers. Given the high frequency of named entities in broadcast news and even greater number of references to these named entities, we also gauge the effect of named entity and coreference resolution in a news story, on the performance of these summarizers.

CiteSeerX

Crossref

Edinburgh Research Archive

Edinburgh Research Explorer

A Cascaded Broadcast News Highlighter

Author: Christensen Heidi
Gotoh Yoshihiko
Renals Steve
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2008

This paper presents a fully automatic news skimming system which takes a broadcast news audio stream and provides the user with the segmented, structured and highlighted transcript. This constitutes a system with three different, cascading stages: converting the audio stream to text using an automatic speech recogniser, segmenting into utterances and stories, and finally determining which utterance should be highlighted using a saliency score. Each stage must operate on the erroneous output from the previous stage in the system, an effect which is naturally amplified as the data progresses through the processing stages. We present a large corpus of transcribed broadcast news data enabling us to investigate to what degree information worth highlighting survives this cascading of processes. Both extrinsic and intrinsic experimental results indicate that mistakes in the story boundary detection have a strong impact on the quality of highlights, whereas erroneous utterance boundaries cause only minor problems. Further, the difference in transcription quality does not affect the overall performance greatly.

Crossref

Edinburgh Research Archive

Edinburgh Research Explorer

Is sentence compression an NLG task?

Author: Daelemans W.
Hendrickx I.
Krahmer E.J.
Marsi E.C.
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2009

Tilburg University Repository

On the limits of sentence compression by deletion

Author: Daelemans W.
Hendrickx I.H.E.
Krahmer E.J.
Marsi E.C.
Publication venue: SpringerLink
Publication date: 01/01/2010

Tilburg University Repository

Preferences versus adaption during referring expression generation

Author: Goudbeek M.B.
Krahmer E.J.
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2010

Tilburg University Repository

Linguistic challenges in automatic summarization technology

Author: Diedrichsen E.
Publication venue: Universitat Politecnica de Valencia
Publication date: 26/06/2017

Automatic summarization is a field of Natural Language Processing that is increasingly used in industry today. The goal of the summarization process is to create a summary of one document or a multiplicity of documents that will retain the sense and the most important aspects while reducing the length considerably, to a size that may be user-defined. One differentiates between extraction-based and abstraction-based summarization. In an extraction-based system, the words and sentences are copied out of the original source without any modification. An abstraction-based summary can compress, fuse or paraphrase sections of the source document. As of today, most summarization systems are extractive. Automatic document summarization technology presents interesting challenges for Natural Language Processing. It works on the basis of coreference resolution, discourse analysis, named entity recognition (NER), information extraction (IE), natural language understanding, topic segmentation and recognition, word segmentation and part-of-speech tagging. This study will overview some current approaches to the implementation of auto summarization technology and discuss the state of the art of the most important NLP tasks involved in them. We will pay particular attention to current methods of sentence extraction and compression for single and multi-document summarization, as these applications are based on theories of syntax and discourse and their implementation therefore requires a solid background in linguistics. Summarization technologies are also used for image collection summarization and video summarization, but the scope of this paper will be limited to document summarization.

Diedrichsen, E. (2017). Linguistic challenges in automatic summarization technology. Journal of Computer-Assisted Linguistic Research. 1(1):40-60. doi:10.4995/jclr.2017.7787.

Crossref

RiuNet

Dialogue Act Compression via Pitch Contour Preservation

Author: Murray Gabriel
Renals Steve
Publication date: 01/01/2006

Edinburgh Research Explorer

From text summarisation to style-specific summarisation for broadcast news

Author: B. Kolluru
H. Christensen
S. Renals
Y. Gotoh
Publication venue: Springer Fachmedien Wiesbaden GmbH
Publication date: 01/01/2004

In this paper we report on a series of experiments investigating the path from text summarisation to style-specific summarisation of spoken news stories. We show that the portability of traditional text summarisation features to broadcast news is dependent on the diffusiveness of the information in the broadcast news story. An analysis of two categories of news stories (containing only read speech or including some spontaneous speech) demonstrates the importance of the style and the quality of the transcript when extracting the summary-worthy information content. Further experiments indicate the advantages of doing style-specific summarisation of broadcast news.

CiteSeerX

Crossref

Edinburgh Research Archive

Edinburgh Research Explorer