163 research outputs found

    Scalable Image Retrieval by Sparse Product Quantization

    Fast Approximate Nearest Neighbor (ANN) search for high-dimensional feature indexing and retrieval is the crux of large-scale image retrieval. A recent promising technique is Product Quantization, which indexes high-dimensional image features by decomposing the feature space into a Cartesian product of low-dimensional subspaces and quantizing each of them separately. Despite the promising results reported, its quantization approach follows the typical hard assignment of traditional quantization methods, which may result in large quantization errors and thus inferior search performance. Unlike the existing approaches, in this paper we propose a novel approach called Sparse Product Quantization (SPQ) that encodes the high-dimensional feature vectors into sparse representations. We optimize the sparse representations of the feature vectors by minimizing their quantization errors, so that the resulting representation is essentially close to the original data in practice. Experiments show that the proposed SPQ technique not only compresses data but is also an effective encoding technique. We obtain state-of-the-art results for ANN search on four public image datasets, and the promising results of content-based image retrieval further validate the efficacy of our proposed method. Comment: 12 page
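The hard-assignment product quantization that the abstract contrasts SPQ against can be sketched as follows. This is a minimal illustration, not the paper's method: the two hand-written codebooks, the example vector, and the function names `pq_encode` and `pq_decode` are all assumptions made for the example, and no codebook training is shown.

```python
# Hard-assignment product quantization with fixed codebooks.
# Codebooks and the input vector are illustrative assumptions, not data from the paper.

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def pq_encode(vec, codebooks):
    """Split vec into len(codebooks) equal subvectors and return,
    for each subspace, the index of the nearest codeword."""
    sub_dim = len(vec) // len(codebooks)
    codes = []
    for i, book in enumerate(codebooks):
        sub = vec[i * sub_dim:(i + 1) * sub_dim]
        codes.append(min(range(len(book)), key=lambda k: sq_dist(sub, book[k])))
    return codes

def pq_decode(codes, codebooks):
    """Rebuild an approximation by concatenating the selected codewords."""
    out = []
    for idx, book in zip(codes, codebooks):
        out.extend(book[idx])
    return out

codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],  # codewords for subspace 0
    [[0.0, 1.0], [1.0, 0.0]],  # codewords for subspace 1
]
vec = [0.9, 0.8, 0.1, 0.7]

codes = pq_encode(vec, codebooks)     # one codeword index per subspace
approx = pq_decode(codes, codebooks)  # quantized reconstruction
error = sq_dist(vec, approx)          # quantization error of the hard assignment
```

Each subvector is mapped to exactly one codeword, so `error` is the quantization error that, according to the abstract, SPQ aims to reduce by using sparse representations instead.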

    Irregular Variable Length Coding

    In this thesis, we introduce Irregular Variable Length Coding (IrVLC) and investigate its applications, characteristics and performance in the context of digital multimedia broadcast telecommunications. During IrVLC encoding, the multimedia signal is represented using a sequence of concatenated binary codewords. These are selected from a codebook, comprising a number of codewords, which, in turn, comprise various numbers of bits. However, during IrVLC encoding, the multimedia signal is decomposed into particular fractions, each of which is represented using a different codebook. This is in contrast to regular Variable Length Coding (VLC), in which the entire multimedia signal is encoded using the same codebook. The application of IrVLCs to joint source and channel coding is investigated in the context of a video transmission scheme. Our novel video codec represents the video signal using tessellations of Variable-Dimension Vector Quantisation (VDVQ) tiles. These are selected from a codebook, comprising a number of tiles having various dimensions. The selected tessellation of VDVQ tiles is signalled using a corresponding sequence of concatenated codewords from a Variable Length Error Correction (VLEC) codebook. This VLEC codebook represents a specific joint source and channel coding case of VLCs, which facilitates both compression and error correction. However, during video encoding, only particular combinations of the VDVQ tiles will perfectly tessellate, owing to their various dimensions. As a result, only particular sub-sets of the VDVQ codebook and, hence, of the VLEC codebook may be employed to convey particular fractions of the video signal. Therefore, our novel video codec can be said to employ IrVLCs. The employment of IrVLCs to facilitate Unequal Error Protection (UEP) is also demonstrated. 
This may be applied when various fractions of the source signal have different error sensitivities, as is typical in audio, speech, image and video signals, for example. Here, different VLEC codebooks having appropriately selected error correction capabilities may be employed to encode the particular fractions of the source signal. This approach may be expected to yield a higher reconstruction quality than equal protection in cases where the various fractions of the source signal have different error sensitivities. Finally, this thesis investigates the application of IrVLCs to near-capacity operation using EXtrinsic Information Transfer (EXIT) chart analysis. Here, a number of component VLEC codebooks having different inverted EXIT functions are employed to encode particular fractions of the source symbol frame. We show that the composite inverted IrVLC EXIT function may be obtained as a weighted average of the inverted component VLC EXIT functions. Additionally, EXIT chart matching is employed to shape the inverted IrVLC EXIT function to match the EXIT function of a serially concatenated inner channel code, creating a narrow but still open EXIT chart tunnel. In this way, iterative decoding convergence to an infinitesimally low probability of error is facilitated at near-capacity channel SNRs.
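The difference between regular VLC (one codebook for the whole signal) and IrVLC (a different codebook per fraction of the signal) can be sketched as below. This is a toy illustration under stated assumptions: `book_a` and `book_b` are plain prefix-free codebooks invented for the example rather than the VLEC codebooks of the thesis, and the fraction sizes are assumed to be known to the decoder.

```python
# Toy irregular VLC: each fraction of the symbol sequence uses its own codebook.
# book_a, book_b and the fraction sizes are hypothetical, chosen for illustration only.

def vlc_encode(symbols, codebook):
    """Concatenate the binary codewords of the given symbols."""
    return "".join(codebook[s] for s in symbols)

def vlc_decode(bits, codebook, count):
    """Decode `count` symbols from a prefix-free code; return (symbols, bits consumed)."""
    inverse = {cw: s for s, cw in codebook.items()}
    out, current, pos = [], "", 0
    while len(out) < count:
        current += bits[pos]
        pos += 1
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out, pos

def irvlc_encode(symbols, fractions):
    """fractions is a list of (symbol_count, codebook) pairs applied in order."""
    bits, start = "", 0
    for count, book in fractions:
        bits += vlc_encode(symbols[start:start + count], book)
        start += count
    return bits

def irvlc_decode(bits, fractions):
    """Decode each fraction with the codebook that was used to encode it."""
    out, pos = [], 0
    for count, book in fractions:
        part, used = vlc_decode(bits[pos:], book, count)
        out.extend(part)
        pos += used
    return out

book_a = {"a": "0", "b": "10", "c": "11"}  # short codeword for 'a'
book_b = {"a": "00", "b": "01", "c": "1"}  # short codeword for 'c'
symbols = list("aabccc")
fractions = [(3, book_a), (3, book_b)]

bits = irvlc_encode(symbols, fractions)
decoded = irvlc_decode(bits, fractions)
regular_bits = vlc_encode(symbols, book_a)  # regular VLC: one codebook throughout
```

In this toy case the per-fraction codebooks give a shorter bitstream than `regular_bits` because each codebook suits the symbols in its own fraction. The thesis selects codebooks by error correction capability and EXIT function shape, which this sketch does not model.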

    Image Compression Techniques: A Survey in Lossless and Lossy algorithms

    The bandwidth of communication networks has increased continuously as a result of technological advances. However, the introduction of new services and the expansion of existing ones have resulted in even higher demand for bandwidth. This explains the many efforts currently being invested in the area of data compression. The primary goal of this work is to develop techniques for coding information sources such as speech, image and video to reduce the number of bits required to represent a source without significantly degrading its quality. With the large increase in the generation of digital image data, there has been a correspondingly large increase in research activity in the field of image compression. The goal is to represent an image in the fewest number of bits without losing the essential information content within. Images carry three main types of information: redundant, irrelevant, and useful. Redundant information is the deterministic part of the information, which can be reproduced without loss from other information contained in the image. Irrelevant information is the part of the information that contains enormous detail beyond the limit of perceptual significance (i.e., psychovisual redundancy). Useful information, on the other hand, is the part of the information that is neither redundant nor irrelevant. Decompressed images are usually observed by humans; therefore, their fidelity is subject to the capabilities and limitations of the Human Visual System. This paper provides a survey of various image compression techniques, their limitations and compression rates, and highlights current research in medical image compression.

    Large-scale interactive exploratory visual search

    Large-scale visual search has been one of the challenging issues in the era of big data. It demands techniques that are not only highly effective and efficient but also allow users to conveniently express their information needs and refine their intents. In this thesis, we focus on developing an exploratory framework for large-scale visual search. We also develop a number of enabling techniques, including compact visual content representation for scalable search, near-duplicate video shot detection, and action-based event detection. We propose a novel scheme for extremely low bit rate visual search, which sends compressed visual words consisting of a vocabulary tree histogram and descriptor orientations rather than descriptors. Compact representation of video data is achieved by identifying the keyframes of a video, which can also help users comprehend visual content efficiently. We propose a novel Bag-of-Importance model for static video summarization. Near-duplicate detection is one of the key issues for large-scale visual search, since there exist a large number of nearly identical images and videos. We propose an improved near-duplicate video shot detection approach for more effective shot representation. Event detection has been one of the solutions for bridging the semantic gap in visual search. We particularly focus on human-action-centred event detection. We propose an enhanced sparse coding scheme to model human actions. Our proposed approach is able to significantly reduce computational cost while achieving recognition accuracy highly comparable to the state-of-the-art methods. Finally, we propose an integrated solution for addressing the prime challenges raised by large-scale interactive visual search. The proposed system is also one of the first attempts at exploratory visual search. It provides users with more robust results to satisfy their exploring experiences.

    Digital image compression


    Improved Encoding for Compressed Textures

    For the past few decades, graphics hardware has supported mapping a two-dimensional image, or texture, onto a three-dimensional surface to add detail during rendering. The complexity of modern applications using interactive graphics hardware has created an explosion in the amount of data needed to represent these images. In order to reduce the amount of memory required to store and transmit textures, graphics hardware manufacturers have introduced hardware decompression units into the texturing pipeline. Textures may now be stored compressed in memory and decoded at run-time in order to access the pixel data. In order to encode images to be used with these hardware features, many compression algorithms are run offline as a preprocessing step, oftentimes the most time-consuming step in the asset preparation pipeline. This research presents several techniques to quickly serve compressed texture data. With the goal of interactive compression rates while maintaining compression quality, three algorithms are presented in the class of endpoint compression formats. The first uses intensity dilation to estimate compression parameters for low-frequency signal-modulated compressed textures and offers up to a 3X improvement in compression speed. The second, FasTC, shows that by estimating the final compression parameters, partition-based formats can choose an approximate partitioning and offer orders of magnitude faster encoding speed. The third, SegTC, shows additional improvement over selecting a partitioning by using a global segmentation to find the boundaries between image features. This segmentation offers an additional 2X improvement over FasTC while maintaining similar compressed quality. Also presented is a case study in using texture compression to benefit two-dimensional concave path rendering. Compressing pixel coverage textures used for compositing yields both an increase in rendering speed and a decrease in storage overhead. 
Additionally, an algorithm is presented that uses a single layer of indirection to adaptively select the block size compressed for each texture, giving a 2X increase in compression ratio for textures of mixed detail. Finally, a texture storage representation that is decoded at runtime on the GPU is presented. The decoded texture is still compressed for graphics hardware but uses 2X fewer bytes for storage and network bandwidth. Doctor of Philosoph

    Development of Some Efficient Lossless and Lossy Hybrid Image Compression Schemes

    Digital imaging generates a large amount of data which needs to be compressed, without loss of relevant information, to economize storage space and allow speedy data transfer. Though both storage and transmission medium capacities have been continuously increasing over the last two decades, they do not match the present requirement. Many lossless and lossy image compression schemes exist for compression of images in the space domain and the transform domain. Employing more than one traditional image compression algorithm results in a hybrid image compression technique. Based on the existing schemes, novel hybrid image compression schemes are developed in this doctoral research work to compress images effectively while maintaining quality.