Engineering the compression of massive tables: An experimental approach

By Adam L. Buchsbaum, Donald F. Caldwell, Kenneth W. Church, Glenn S. Fowler and S. Muthukrishnan

Abstract

We study the problem of compressing massive tables. We devise a novel compression paradigm, training for lossless compression, which assumes that the data exhibit dependencies that can be learned by examining a small amount of training material. We develop an experimental methodology to test the approach. Our result is a system, pzip, which outperforms gzip by a factor of two in compressed size and in both compression and decompression time for various tabular data. Pzip is now in production use in an AT&T network traffic data warehouse.

Year: 2000
OAI identifier: oai:CiteSeerX.psu:10.1.1.184.8280
Provided by: CiteSeerX