An Investigation Into the Detection of New Information

By Barry Schiffman

Abstract

This paper explores new-information detection, describing a strategy for filtering a stream of documents to present only information that is fresh. We focus on multi-document summarization and seek to efficiently use more linguistic information than is often seen in such systems. We experimented with our linguistic system and with a more traditional sentence-based, vector-space system and found that a combination of the two approaches boosted performance over each one alone.

1 Introduction

The voluminous amount of information now in digital form poses an important challenge: to distinguish new material from material in previously seen documents. The stream of news from around the world on the World Wide Web is but one form of this deluge of data. Data from the world financial markets, government actions, court decisions, scientific research can all be tapped, but that value will be greatly diminished if readers must sift through the same material over and over again.

Year: 2008
OAI identifier: oai:CiteSeerX.psu:10.1.1.134.3117
Provided by: CiteSeerX
Full text may be found at the following location(s):
  • http://citeseerx.ist.psu.edu/v... (external link)
  • http://www.cs.columbia.edu/tec... (external link)