2 research outputs found

    Clustering algorithms and their effect on edge preservation in image compression

    Image compression aims to reduce the amount of data that is stored or transmitted for images. One technique that may be used to this end is vector quantization. Vectors may be used to represent images. Vector quantization reduces the number of vectors required for an image by representing a cluster of similar vectors by one typical vector that is part of a set of vectors referred to as the codebook. For compression, for each image vector, only the closest codebook vector is stored or transmitted. For reconstruction, the image vectors are again replaced by the closest codebook vectors. Hence, vector quantization is a lossy compression technique, and the quality of the reconstructed image depends strongly on the quality of the codebook. The design of the codebook is therefore an important part of the process. In this thesis we examine three clustering algorithms which can be used for codebook design in image compression: c-means (CM), fuzzy c-means (FCM) and learning vector quantization (LVQ). We give a description of these algorithms and their application to codebook design. Edges are an important part of the visual information contained in an image. It is therefore essential to use codebooks which allow an accurate representation of the edges. One of the shortcomings of vector quantization is poor edge representation. We therefore carry out experiments using these algorithms to compare their edge-preserving qualities. We also investigate the combination of these algorithms with classified vector quantization (CVQ) and the replication method (RM). Both have been suggested as methods for improving edge representation. We use a cross-validation approach to estimate the mean squared error, which measures the performance of each of the algorithms and the edge-preserving methods. The results show that the edges are less accurately represented than the non-edge areas when using CM, FCM and LVQ.
The advantage of using CVQ is that the time taken for codebook design is reduced, particularly for CM and FCM. RM is found to be effective where the codebook is trained using a set that has a larger proportion of edges than the test set.
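
The compression and reconstruction steps described in this abstract can be illustrated with a minimal sketch. It assumes plain c-means (k-means) for codebook design and squared Euclidean distance; the function names, the iteration count and the seeding strategy are illustrative choices, not details taken from the thesis.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_index(vector, codebook):
    """Index of the codebook vector closest to `vector`."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))


def train_codebook(vectors, size, iterations=20, seed=0):
    """Design a codebook with plain c-means (an assumed, simplified variant)."""
    rng = random.Random(seed)
    # Start from `size` distinct training vectors chosen at random.
    codebook = [list(v) for v in rng.sample(vectors, size)]
    for _ in range(iterations):
        clusters = [[] for _ in range(size)]
        for v in vectors:
            clusters[nearest_index(v, codebook)].append(v)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous codeword
                codebook[i] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def encode(vectors, codebook):
    """Compression: keep only the index of the closest codeword."""
    return [nearest_index(v, codebook) for v in vectors]


def decode(indices, codebook):
    """Reconstruction: replace each index by its codeword (lossy)."""
    return [codebook[i] for i in indices]


def mean_squared_error(original, reconstructed):
    """Per-component mean squared error between two vector lists."""
    total = sum(squared_distance(a, b) for a, b in zip(original, reconstructed))
    return total / (len(original) * len(original[0]))
```

In an image setting, each vector would typically be a flattened pixel block, and the mean squared error of `decode(encode(blocks, codebook), codebook)` against `blocks` gives the kind of quality measure the abstract refers to.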

    Vector Quantization Techniques for Approximate Nearest Neighbor Search on Large-Scale Datasets

    The technological developments of the last twenty years are leading the world to a new era. The invention of the internet, mobile phones and smart devices has resulted in an exponential increase in data. As the data is growing every day, finding similar patterns or matching samples to a query is no longer a simple task because of its computational costs and storage limitations. Special signal processing techniques are required in order to handle the growth in data, as simply adding more and more computers cannot keep up. Nearest neighbor search, also called similarity search, proximity search or near item search, is the problem of finding an item that is nearest or most similar to a query according to a distance or similarity measure. When the reference set is very large, or the distance or similarity calculation is complex, performing the nearest neighbor search can be computationally demanding. Considering today’s ever-growing datasets, where the cardinality of samples also keeps increasing, a growing interest in approximate methods has emerged in the research community. Vector Quantization for Approximate Nearest Neighbor Search (VQ for ANN) has proven to be one of the most efficient and successful methods targeting the aforementioned problem. It proposes to compress vectors into binary strings and approximate the distances between vectors using look-up tables. With this approach, the approximation of distances is very fast, while the storage space requirement of the dataset is minimized thanks to the extreme compression levels. The distance approximation performance of VQ for ANN has been shown to be sufficiently good for retrieval and classification tasks, demonstrating that VQ for ANN techniques can be a good replacement for exact distance calculation methods. This thesis contributes to the VQ for ANN literature by proposing five advanced techniques, which aim to provide fast and efficient approximate nearest neighbor search on very large-scale datasets.
The proposed methods can be divided into two groups. The first group consists of two techniques, which introduce subspace clustering to VQ for ANN. These methods are shown to give state-of-the-art performance according to tests on prevalent large-scale benchmarks. The second group consists of three methods, which propose improvements on residual vector quantization. These methods are also shown to outperform their predecessors. Apart from these, a sixth contribution in this thesis is a demonstration of VQ for ANN in an application of image classification on large-scale datasets. It is shown that a k-NN classifier based on VQ for ANN performs on par with the k-NN classifiers, but requires much less storage space and computation.
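
The idea of compressing vectors into short codes and approximating distances with look-up tables can be sketched as follows. This is a generic product-quantization-style illustration, not one of the methods proposed in the thesis: the equal-length subspace split, the pre-given codebooks and all function names are assumptions made for the example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split(vector, parts):
    """Split a vector into `parts` equal-length sub-vectors (assumed layout)."""
    step = len(vector) // parts
    return [vector[i * step:(i + 1) * step] for i in range(parts)]


def encode(vector, codebooks):
    """Store one codeword index per subspace instead of the raw vector."""
    subs = split(vector, len(codebooks))
    return [min(range(len(cb)), key=lambda k: squared_distance(s, cb[k]))
            for s, cb in zip(subs, codebooks)]


def build_tables(query, codebooks):
    """For each subspace, tabulate the distance from the query to every codeword."""
    subs = split(query, len(codebooks))
    return [[squared_distance(s, word) for word in cb]
            for s, cb in zip(subs, codebooks)]


def approximate_distance(code, tables):
    """Approximate query-to-vector distance using table look-ups only."""
    return sum(table[k] for table, k in zip(tables, code))


def search(query, codes, codebooks):
    """Index of the encoded database vector with the smallest approximate distance."""
    tables = build_tables(query, codebooks)
    return min(range(len(codes)),
               key=lambda i: approximate_distance(codes[i], tables))
```

The tables are built once per query, so comparing the query against each stored code costs one look-up and one addition per subspace rather than a full distance computation on the original vectors.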