Fractal image compression and the self-affinity assumption : a stochastic signal modelling perspective
Bibliography: p. 208-225. Fractal image compression is a comparatively new technique which has gained considerable attention in the popular technical press, and more recently in the research literature. The most significant advantages claimed are high reconstruction quality at low coding rates, rapid decoding, and "resolution independence" in the sense that an encoded image may be decoded at a higher resolution than the original. While many of the claims published in the popular technical press are clearly extravagant, it appears from the rapidly growing body of published research that fractal image compression is capable of performance comparable with that of other techniques enjoying the benefit of a considerably more robust theoretical foundation. So called because of the similarities between the form of image representation and a mechanism widely used in generating deterministic fractal images, fractal compression represents an image by the parameters of a set of affine transforms on image blocks under which the image is approximately invariant. Although the conditions imposed on these transforms may be shown to be sufficient to guarantee that an approximation of the original image can be reconstructed, there is no obvious theoretical reason to expect this to be an efficient representation for image coding purposes. The usual analogy with vector quantisation, in which each image is considered to be represented in terms of code vectors extracted from the image itself, is instructive, but transforms the fundamental problem into one of understanding why this construction results in an efficient codebook. The signal property required for such a codebook to be effective, termed "self-affinity", is poorly understood. An examination of this property based on stochastic signal models is the primary contribution of this dissertation.
The most significant findings (subject to some important restrictions) are that "self-affinity" is not a natural consequence of common statistical assumptions but requires particular conditions which are inadequately characterised by second order statistics, and that "natural" images are only marginally "self-affine", to the extent that fractal image compression is effective, but not more so than comparable standard vector quantisation techniques.
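The block-wise affine representation described in this abstract can be illustrated with a minimal sketch. It is not the dissertation's method: it works on a 1-D signal instead of an image, and the function names (`fit_affine`, `encode`, `decode`), the block size, and the scale clamp of 0.9 are all assumptions made for illustration. Each range block is approximated as `s * domain + o`, where the domain block is taken from the signal itself and averaged down to the range block size.

```python
def shrink(chunk, block):
    """Average neighbouring samples so a domain block matches the range block size."""
    return [(chunk[2 * i] + chunk[2 * i + 1]) / 2 for i in range(block)]


def fit_affine(domain, rng):
    """Least-squares scale s and offset o so that s * domain + o approximates rng."""
    n = len(domain)
    mean_d = sum(domain) / n
    mean_r = sum(rng) / n
    var_d = sum((d - mean_d) ** 2 for d in domain)
    if var_d == 0.0:
        s = 0.0  # flat domain block: only the offset can contribute
    else:
        cov = sum((d - mean_d) * (r - mean_r) for d, r in zip(domain, rng))
        s = cov / var_d
    # Assumption of this sketch: clamp the scale so each block map is contractive.
    s = max(-0.9, min(0.9, s))
    o = mean_r - s * mean_d
    err = sum((s * d + o - r) ** 2 for d, r in zip(domain, rng))
    return s, o, err


def encode(signal, block=4):
    """Return one (domain index, scale, offset) triple per range block.

    The signal length is assumed to be a multiple of `block`.
    """
    domains = [shrink(signal[start:start + 2 * block], block)
               for start in range(0, len(signal) - 2 * block + 1, block)]
    code = []
    for start in range(0, len(signal), block):
        rng = signal[start:start + block]
        best = min(((idx,) + fit_affine(dom, rng) for idx, dom in enumerate(domains)),
                   key=lambda t: t[3])
        code.append(best[:3])
    return code


def decode(code, length, block=4, iterations=20):
    """Iterate the stored block maps from an all-zero signal towards their fixed point."""
    signal = [0.0] * length
    for _ in range(iterations):
        new = []
        for idx, s, o in code:
            dom = shrink(signal[idx * block:idx * block + 2 * block], block)
            new.extend(s * d + o for d in dom)
        signal = new
    return signal
```

The code stores no samples, only transform parameters, which is why the question of how well a signal matches scaled copies of its own blocks ("self-affinity") decides how efficient the representation is.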
Optimal Multiresolution Quantization for Broadcast Channels with Random Index Assignment
Shannon's classical separation result holds only in the limit of infinite source code dimension and infinite channel code block length. In addition, Shannon theory does not address the design of good source codes when the probability of channel error is nonzero, which is inevitable for finite-length channel codes. Thus, for practical systems, a joint source and channel code design could improve performance for finite dimension source code and finite block length channel code, as well as complexity and delay.
Consider a multicast system over a broadcast channel, where different end users typically have different capacities. To support such user or capacity diversity, it is desirable to encode the source to be broadcast into a scalable bit stream along which multiple resolutions of the source can be reconstructed progressively from left to right. This source coding technique is called multiresolution source coding. In wireless communications, joint source channel coding (JSCC) has attracted wide attention due to its adaptivity to time-varying channels. However, there are few works on joint source channel coding for network multicast, especially on optimal source coding over broadcast channels.
In this work, we aim at designing and analyzing the optimal multiresolution vector quantization (MRVQ) in conjunction with the subsequent broadcast channel over which the coded scalable bit stream would be transmitted. By adopting random index assignment (RIA) to link MRVQ for the source with superposition coding for the broadcast channel, we establish a closed-form formula of end-to-end distortion for a tandem system of MRVQ and a broadcast channel. From this formula we analyze the intrinsic structure of end-to-end distortion (EED) in a communication system and derive two necessary conditions for optimal multiresolution vector quantization over broadcast channels with random index assignment. Based on these two necessary conditions, we propose a greedy iterative algorithm that designs the MRVQ jointly with the channel conditions and depends on the channel only through several types of average channel error probabilities rather than complete knowledge of the channel. Experiments show that MRVQ designed by the proposed algorithm significantly outperforms conventional MRVQ designed without channel information.
The closed-form formula for the weighted EED with RIA also keeps the computational cost of performance analysis feasible. In comparison with MRVQ design for a fixed index assignment, the computational complexity of quantization design is significantly reduced by using random index assignment. In addition, simulations indicate that our proposed algorithm is more robust against channel mismatch than MRVQ design with a fixed index assignment, simply because it uses only average channel information. Therefore, we conclude that our proposed algorithm is more appropriate both in wireless communications and in applications where complete knowledge of the channel is hard to obtain.
Furthermore, we propose two novel algorithms for MRVQ over broadcast channels. One optimizes the two corresponding quantizers at the two layers alternately and iteratively, and the other applies under the constraint that each encoding cell is convex and contains the reconstruction point. Finally, we analyze the asymptotic performance of weighted EED for the optimal joint MRVQ. The asymptotic result provides a theoretically achievable quantizer performance level and sheds light on the design of the optimal MRVQ over broadcast channels from a different aspect.
Vector Quantization Techniques for Approximate Nearest Neighbor Search on Large-Scale Datasets
The technological developments of the last twenty years are leading the world to a new era. The invention of the internet, mobile phones and smart devices has resulted in an exponential increase in data. As the data grows every day, finding similar patterns or matching samples to a query is no longer a simple task because of its computational costs and storage limitations. Special signal processing techniques are required in order to handle the growth in data, as simply adding more and more computers cannot keep up.
Nearest neighbor search, also called similarity search, proximity search, or near item search, is the problem of finding an item that is nearest or most similar to a query according to a distance or similarity measure. When the reference set is very large, or the distance or similarity calculation is complex, performing the nearest neighbor search can be computationally demanding. Considering today’s ever-growing datasets, where the cardinality of samples also keeps increasing, a growing interest in approximate methods has emerged in the research community.
Vector Quantization for Approximate Nearest Neighbor Search (VQ for ANN) has proven to be one of the most efficient and successful methods targeting the aforementioned problem. It proposes to compress vectors into binary strings and approximate the distances between vectors using look-up tables. With this approach, the approximation of distances is very fast, while the storage space requirement of the dataset is minimized thanks to the extreme compression levels. The distance approximation performance of VQ for ANN has been shown to be sufficiently good for retrieval and classification tasks, demonstrating that VQ for ANN techniques can be a good replacement for exact distance calculation methods.
This thesis contributes to the VQ for ANN literature by proposing five advanced techniques, which aim to provide fast and efficient approximate nearest neighbor search on very large-scale datasets.
The proposed methods can be divided into two groups. The first group consists of two techniques, which introduce subspace clustering to VQ for ANN. These methods are shown to give state-of-the-art performance according to tests on prevalent large-scale benchmarks. The second group consists of three methods, which propose improvements on residual vector quantization. These methods are also shown to outperform their predecessors. Apart from these, a sixth contribution in this thesis is a demonstration of VQ for ANN in an application of image classification on large-scale datasets. It is shown that a k-NN classifier based on VQ for ANN performs on par with k-NN classifiers that use exact distances, but requires much less storage space and computation.
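The idea of compressing vectors into short codes and approximating distances with look-up tables can be sketched with plain product quantization. This is a swapped-in, well-known baseline and not one of the thesis's five techniques. The codebooks are assumed to be given (in practice they would be trained, for example with k-means), and all function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split(vec, m):
    """Cut a vector into m equal-length subvectors (length assumed divisible by m)."""
    size = len(vec) // m
    return [vec[i * size:(i + 1) * size] for i in range(m)]


def pq_encode(vec, codebooks):
    """Replace each subvector by the index of its nearest codeword."""
    return [min(range(len(cb)), key=lambda k: sq_dist(sub, cb[k]))
            for sub, cb in zip(split(vec, len(codebooks)), codebooks)]


def build_tables(query, codebooks):
    """Per-subspace table of squared distances from the query to every codeword."""
    return [[sq_dist(sub, cw) for cw in cb]
            for sub, cb in zip(split(query, len(codebooks)), codebooks)]


def approx_dist(code, tables):
    """Approximate squared distance: one table look-up per subspace, then a sum."""
    return sum(table[k] for table, k in zip(tables, code))


# Toy codebooks for illustration only: 2 subspaces, 2 codewords each.
codebooks = [[[0.0, 0.0], [1.0, 1.0]],
             [[0.0, 0.0], [2.0, 2.0]]]
code = pq_encode([0.9, 1.1, 0.1, -0.1], codebooks)   # stored instead of the vector
tables = build_tables([1.0, 1.0, 0.0, 0.0], codebooks)  # built once per query
print(code, approx_dist(code, tables))
```

The tables are computed once per query, so scoring each database item costs only as many look-ups and additions as there are subspaces, regardless of the original dimension.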
Semantic and effective communications
Shannon and Weaver categorized communications into three levels of problems: the technical problem, which tries to answer the question "how accurately can the symbols of communication be transmitted?"; the semantic problem, which asks the question "how precisely do the transmitted symbols convey the desired meaning?"; the effectiveness problem, which strives to answer the question "how effectively does the received meaning affect conduct in the desired way?". Traditionally, communication technologies mainly addressed the technical problem, ignoring the semantics or the effectiveness problems.
Recently, there has been increasing interest in addressing the higher level semantic and effectiveness problems, with proposals ranging from semantic to goal oriented communications. In this thesis, we propose to formulate the semantic problem as a joint source-channel coding (JSCC) problem and the effectiveness problem as a multi-agent partially observable Markov decision process (MA-POMDP). As such, for the semantic problem, we propose DeepWiVe, the first-ever end-to-end JSCC video transmission scheme that leverages the power of deep neural networks (DNNs) to directly map video signals to channel symbols, combining video compression, channel coding, and modulation steps into a single neural transform. We further show that it is possible to use predefined constellation designs as well as to secure the physical layer communication against eavesdroppers for deep learning (DL) driven JSCC schemes, making such schemes much more viable for deployment in the real world.
For the effectiveness problem, we propose a novel formulation by considering multiple agents communicating over a noisy channel in order to achieve better coordination and cooperation in a multi-agent reinforcement learning (MARL) framework. Specifically, we consider a MA-POMDP, in which the agents, in addition to interacting with the environment, can also communicate with each other over a noisy communication channel. The noisy communication channel is considered explicitly as part of the dynamics of the environment, and the message each agent sends is part of the action that the agent can take. As a result, the agents learn not only to collaborate with each other but also to communicate "effectively" over a noisy channel. Moreover, we show that this framework generalizes both the semantic and technical problems. In both instances, we show that the resultant communication scheme is superior to one where the communication is considered separately from the underlying semantics or goal of the problem.
Coding and Signal Processing for Secure Wireless Communication
Wireless communication networks are widely deployed today and are used in many applications which require that the data transmitted be secure. Due to the open nature of wireless systems, it is important to have a fundamental understanding of coding schemes that allow for simultaneously secure and reliable transmission. The information theoretic approach is able to give us this fundamental insight into the nature of the coding schemes required for security. The security issue is approached by focusing on the confidentiality of message transmission and reception at the physical layer. The goal is to design coding and signal processing schemes that provide security in the information theoretic sense. In so doing, we are able to prove the simultaneously secure and reliable transmission rates for different network building blocks. The multi-receiver broadcast channel is an important network building block, where the rate region for the channel without security constraints is still unknown. In the thesis this channel is investigated with security constraints, and the secure and reliable rates are derived for the proposed coding scheme using a random coding argument. Cooperative relaying is next applied to the wiretap channel, the fundamental physical layer model for the communication security problem, and signal processing techniques are used to show that the secure rate can be improved in situations where the secure rate was small due to the eavesdropper enjoying a more favorable channel condition compared to the legitimate receiver. Finally, structured lattice codes are used in the wiretap channel instead of the unstructured random codes used in the vast majority of the work so far. We show that lattice coding and decoding can achieve the secrecy rate of the Gaussian wiretap channel; this is an important step towards realizing practical, explicit codes for the wiretap channel.
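For orientation, the classical secrecy capacity of the real-valued Gaussian wiretap channel is the gap between the legitimate receiver's capacity and the eavesdropper's capacity, floored at zero. The sketch below evaluates that textbook expression only; it does not model the relaying or lattice schemes of the thesis, and the function and parameter names are assumptions for illustration.

```python
import math


def gaussian_secrecy_capacity(power, noise_main, noise_eve):
    """Secrecy capacity in bits per channel use for a real Gaussian wiretap channel.

    power      -- transmit power constraint
    noise_main -- noise variance at the legitimate receiver
    noise_eve  -- noise variance at the eavesdropper
    """
    c_main = 0.5 * math.log2(1 + power / noise_main)
    c_eve = 0.5 * math.log2(1 + power / noise_eve)
    # The rate is zero whenever the eavesdropper's channel is at least as good.
    return max(0.0, c_main - c_eve)


# Eavesdropper noisier than the legitimate receiver: positive secure rate.
print(gaussian_secrecy_capacity(power=3.0, noise_main=1.0, noise_eve=3.0))
# Eavesdropper with the better channel: no secure rate without further techniques.
print(gaussian_secrecy_capacity(power=3.0, noise_main=3.0, noise_eve=1.0))
```

The second call shows the regime the abstract refers to, where the eavesdropper enjoys the more favorable channel and the secure rate is small or zero unless additional techniques such as cooperative relaying are applied.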
Pattern Recognition
A wealth of advanced pattern recognition algorithms is emerging at the intersection of technologies for effective visual features and the human-brain cognition process. Effective visual features are made possible by rapid developments in sensor equipment, novel filter designs, and viable information processing architectures, while the understanding of the human-brain cognition process broadens the ways in which computers can perform pattern recognition tasks. The present book collects representative research from around the globe focusing on low-level vision, filter design, features and image descriptors, data mining and analysis, and biologically inspired algorithms. The 27 chapters covered in this book disclose recent advances and new ideas in promoting the techniques, technology and applications of pattern recognition.
Non-Convex and Geometric Methods for Tomography and Label Learning
Data labeling is a fundamental problem of mathematical data analysis in which each data point is assigned exactly one single label (prototype) from a finite predefined set. In this thesis we study two challenging extensions, where either the input data cannot be observed directly or prototypes are not available beforehand.
The main application of the first setting is discrete tomography. We propose several non-convex variational as well as smooth geometric approaches to joint image label assignment and reconstruction from indirect measurements with known prototypes. In particular, we consider spatial regularization of assignments, based on the KL-divergence, which takes into account the smooth geometry of discrete probability distributions endowed with the Fisher-Rao (information) metric, i.e. the assignment manifold. Finally, the geometric point of view leads to a smooth flow evolving on a Riemannian submanifold including the tomographic projection constraints directly into the geometry of assignments. Furthermore we investigate corresponding implicit numerical schemes which amount to solving a sequence of convex problems.
Likewise, for the second setting, when the prototypes are absent, we introduce and study a smooth dynamical system for unsupervised data labeling which evolves by geometric integration on the assignment manifold. Rigorously abstracting from "data-label" to "data-data" decisions leads to interpretable low-rank data representations, which themselves are parameterized by label assignments. The resulting self-assignment flow simultaneously performs learning of latent prototypes in the very same framework while they are used for inference. Moreover, a single parameter, the scale of regularization in terms of spatial context, drives the entire process. By smooth geodesic interpolation between different normalizations of self-assignment matrices on the positive definite matrix manifold, a one-parameter family of self-assignment flows is defined. Accordingly, the proposed approach can be characterized from different viewpoints such as discrete optimal transport, normalized spectral cuts and combinatorial optimization by completely positive factorizations, each with additional built-in spatial regularization.