7,729 research outputs found

    Generalized residual vector quantization for large scale data

    Full text link
    Vector quantization is an essential tool for tasks involving large scale data, for example, large scale similarity search, which is crucial for content-based information retrieval and analysis. In this paper, we propose a novel vector quantization framework that iteratively minimizes quantization error. First, we provide a detailed review on a relevant vector quantization method named \textit{residual vector quantization} (RVQ). Next, we propose \textit{generalized residual vector quantization} (GRVQ) to further improve over RVQ. Many vector quantization methods can be viewed as the special cases of our proposed framework. We evaluate GRVQ on several large scale benchmark datasets for large scale search, classification and object retrieval. We compared GRVQ with existing methods in detail. Extensive experiments demonstrate our GRVQ framework substantially outperforms existing methods in term of quantization accuracy and computation efficiency.Comment: published on International Conference on Multimedia and Expo 201

    Optimal modeling for complex system design

    Get PDF
    The article begins with a brief introduction to the theory describing optimal data compression systems and their performance. A brief outline is then given of a representative algorithm that employs these lessons for optimal data compression system design. The implications of rate-distortion theory for practical data compression system design is then described, followed by a description of the tensions between theoretical optimality and system practicality and a discussion of common tools used in current algorithms to resolve these tensions. Next, the generalization of rate-distortion principles to the design of optimal collections of models is presented. The discussion focuses initially on data compression systems, but later widens to describe how rate-distortion theory principles generalize to model design for a wide variety of modeling applications. The article ends with a discussion of the performance benefits to be achieved using the multiple-model design algorithms

    Conditional weighted universal source codes: second order statistics in universal coding

    Get PDF
    We consider the use of second order statistics in two-stage universal source coding. Examples of two-stage universal codes include the weighted universal vector quantization (WUVQ), weighted universal bit allocation (WUBA), and weighted universal transform coding (WUTC) algorithms. The second order statistics are incorporated in two-stage universal source codes in a manner analogous to the method by which second order statistics are incorporated in entropy constrained vector quantization (ECVQ) to yield conditional ECVQ (CECVQ). In this paper, we describe an optimal two-stage conditional entropy constrained universal source code along with its associated optimal design algorithm and a fast (but nonoptimal) variation of the original code. The design technique and coding algorithm here presented result in a new family of conditional entropy constrained universal codes including but not limited to the conditional entropy constrained WUVQ (CWUVQ), the conditional entropy constrained WUBA (CWUBA), and the conditional entropy constrained WUTC (CWUTC). The fast variation of the conditional entropy constrained universal codes allows the designer to trade off performance gains against storage and delay costs. We demonstrate the performance of the proposed codes on a collection of medical brain scans. On the given data set, the CWUVQ achieves up to 7.5 dB performance improvement over variable-rate WUVQ and up to 12 dB performance improvement over ECVQ. On the same data set, the fast variation of the CWUVQ achieves identical performance to that achieved by the original code at all but the lowest rates (less than 0.125 bits per pixel)

    A vector quantization approach to universal noiseless coding and quantization

    Get PDF
    A two-stage code is a block code in which each block of data is coded in two stages: the first stage codes the identity of a block code among a collection of codes, and the second stage codes the data using the identified code. The collection of codes may be noiseless codes, fixed-rate quantizers, or variable-rate quantizers. We take a vector quantization approach to two-stage coding, in which the first stage code can be regarded as a vector quantizer that “quantizes” the input data of length n to one of a fixed collection of block codes. We apply the generalized Lloyd algorithm to the first-stage quantizer, using induced measures of rate and distortion, to design locally optimal two-stage codes. On a source of medical images, two-stage variable-rate vector quantizers designed in this way outperform standard (one-stage) fixed-rate vector quantizers by over 9 dB. The tail of the operational distortion-rate function of the first-stage quantizer determines the optimal rate of convergence of the redundancy of a universal sequence of two-stage codes. We show that there exist two-stage universal noiseless codes, fixed-rate quantizers, and variable-rate quantizers whose per-letter rate and distortion redundancies converge to zero as (k/2)n -1 log n, when the universe of sources has finite dimension k. This extends the achievability part of Rissanen's theorem from universal noiseless codes to universal quantizers. Further, we show that the redundancies converge as O(n-1) when the universe of sources is countable, and as O(n-1+ϵ) when the universe of sources is infinite-dimensional, under appropriate conditions
    • …
    corecore