Notes about some unsupervised learning methods
We review some centroid-based algorithms derived from the basic c-Means. We survey both clustering and vector quantization. Fuzzy versions are also considered.
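As a point of reference for the basic c-Means that these algorithms derive from, here is a minimal sketch of the hard (non-fuzzy) version in plain Python. The function name `c_means`, its parameters, and the choice to initialise centroids from randomly chosen data points are assumptions made for illustration; they are not taken from the notes themselves.

```python
import random

def c_means(points, c, iters=20, seed=0):
    """Hard c-means on a list of equal-length tuples (illustrative sketch)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, c)  # assumed initialisation: c distinct data points
    for _ in range(iters):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(c)]
        for p in points:
            nearest = min(
                range(c),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for k, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids

if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (10, 10), (10, 11)]
    print(sorted(c_means(pts, 2)))
```

Used for vector quantization, the returned centroids play the role of the codebook; a fuzzy variant would replace the hard assignment with graded memberships.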
Theory and Practice of Vector Quantizers Trained on Small Training Sets
We examine how the performance of a memoryless vector quantizer changes as a function of its training set size. Specifically, we study how well the training set distortion predicts test distortion when the training set is a randomly drawn subset of blocks from the test or training image(s). Using the Vapnik-Chervonenkis dimension, we derive formal bounds for the difference of test and training distortion of vector quantizer codebooks. We then describe extensive empirical simulations that test these bounds for a variety of bit rates and vector dimensions, and give practical suggestions for determining the training set size necessary to achieve good generalization from a codebook. We conclude that, by using training sets consisting of only a small fraction of the available data, one can produce results that are close to the results obtainable when all available data are used.
1 Introduction
Vector quantization (VQ) [7, 8] is a data compression technique that can be used to reduce the sto..
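The comparison of training set distortion and test distortion described in the abstract can be sketched as follows. This is a hedged illustration only: the codebook is trained with a plain generalized Lloyd iteration as a stand-in (the listing does not say which training method the paper uses), the data are synthetic 2-D Gaussian vectors rather than image blocks, and all function names and parameters (`train_codebook`, `subset_experiment`, `fraction`, `size`) are hypothetical.

```python
import random

def sq_dist(x, y):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def distortion(vectors, codebook):
    """Mean squared error when each vector maps to its nearest codeword."""
    return sum(min(sq_dist(v, c) for c in codebook) for v in vectors) / len(vectors)

def train_codebook(train, size, iters=15, seed=0):
    """Generalized Lloyd iteration (assumed stand-in for the training method)."""
    rng = random.Random(seed)
    codebook = rng.sample(train, size)
    for _ in range(iters):
        cells = [[] for _ in range(size)]
        for v in train:
            nearest = min(range(size), key=lambda k: sq_dist(v, codebook[k]))
            cells[nearest].append(v)
        # Replace each codeword by the mean of its cell; keep it if the cell is empty.
        codebook = [
            tuple(sum(col) / len(cell) for col in zip(*cell)) if cell else codebook[k]
            for k, cell in enumerate(cells)
        ]
    return codebook

def subset_experiment(data, fraction, size, seed=0):
    """Train on a random subset; return (training distortion, distortion on all data)."""
    rng = random.Random(seed)
    n_train = max(size, int(fraction * len(data)))
    train = rng.sample(data, n_train)
    codebook = train_codebook(train, size, seed=seed)
    return distortion(train, codebook), distortion(data, codebook)

if __name__ == "__main__":
    rng = random.Random(1)
    data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2000)]  # synthetic vectors
    for frac in (0.02, 0.1, 0.5):
        train_d, test_d = subset_experiment(data, frac, size=8)
        print(f"fraction={frac:<5} train={train_d:.3f} test={test_d:.3f}")
```

Running the loop for increasing values of `fraction` shows how the gap between the two distortions behaves on this toy data; it says nothing about the bounds or the image experiments reported in the paper.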