64 research outputs found

    Quantized Overcomplete Expansions in R^N: Analysis, Synthesis, and Algorithms

    Coefficient quantization has peculiar qualitative effects on representations of vectors in R^N with respect to overcomplete sets of vectors. These effects are investigated in two settings: frame expansions (representations obtained by forming inner products with each element of the set) and matching pursuit expansions (approximations obtained by greedily forming linear combinations). In both cases, based on the concept of consistency, it is shown that traditional linear reconstruction methods are suboptimal, and better consistent reconstruction algorithms are given. The proposed consistent reconstruction algorithms were in each case implemented, and experimental results are included. For frame expansions, results are proven to bound distortion as a function of frame redundancy r and quantization step size for linear, consistent, and optimal reconstruction methods. Taken together, these suggest that optimal reconstruction methods will yield O(1/r^2) mean-squared error (MSE), and that consistency is sufficient to ensure this asymptotic behavior. A result on the asymptotic tightness of random frames is also proven. Applicability of quantized matching pursuit to lossy vector compression is explored. Experiments demonstrate the likelihood that a linear reconstruction is inconsistent, the MSE reduction obtained with a nonlinear (consistent) reconstruction algorithm, and generally competitive performance at low bit rates.
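    To make the frame-expansion setting concrete, here is a minimal sketch of a quantized frame expansion in R^2 with plain linear reconstruction, the baseline the abstract describes as suboptimal. It is not the consistent reconstruction algorithm from the paper. The frame (unit vectors at equally spaced angles), the uniform rounding quantizer, the test vector, and all function names are assumptions chosen for illustration.

```python
import math

def make_frame(m):
    """Return m unit vectors in R^2 at angles k*pi/m (a tight frame, bound m/2)."""
    return [(math.cos(k * math.pi / m), math.sin(k * math.pi / m)) for k in range(m)]

def analyze(x, frame):
    """Frame expansion: inner product of x with each frame vector."""
    return [x[0] * p[0] + x[1] * p[1] for p in frame]

def quantize(coeffs, step):
    """Uniform quantizer: round each coefficient to the nearest multiple of step."""
    return [step * round(c / step) for c in coeffs]

def reconstruct_linear(coeffs, frame):
    """Linear reconstruction for this tight frame: (2/m) * sum_k c_k * phi_k."""
    scale = 2.0 / len(frame)
    return (scale * sum(c * p[0] for c, p in zip(coeffs, frame)),
            scale * sum(c * p[1] for c, p in zip(coeffs, frame)))

x = (0.3, -0.7)        # illustrative input vector
frame = make_frame(8)  # redundancy r = 8 / 2 = 4
step = 0.1             # quantization step size

exact = reconstruct_linear(analyze(x, frame), frame)
x_hat = reconstruct_linear(quantize(analyze(x, frame), step), frame)
error = math.hypot(x_hat[0] - x[0], x_hat[1] - x[1])
```

    Without quantization, `exact` recovers `x` up to floating-point error. With quantization, each coefficient is off by at most `step / 2`, so the linear reconstruction error in this sketch is at most `step`. A consistent reconstruction would additionally require the estimate to re-quantize to the same coefficients.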

    Efficient compression of motion compensated residuals

    EThOS - Electronic Theses Online Service (GB, United Kingdom)

    On unifying sparsity and geometry for image-based 3D scene representation

    Demand has emerged for next generation visual technologies that go beyond conventional 2D imaging. Such technologies should capture and communicate all perceptually relevant three-dimensional information about an environment to a distant observer, providing a satisfying, immersive experience. Camera networks offer a low cost solution to the acquisition of 3D visual information, by capturing multi-view images from different viewpoints. However, the camera's representation of the data is not ideal for common tasks such as data compression or 3D scene analysis, as it does not make the 3D scene geometry explicit. Image-based scene representations fundamentally require a multi-view image model that facilitates extraction of underlying geometrical relationships between the cameras and scene components. Developing new, efficient multi-view image models is thus one of the major challenges in image-based 3D scene representation methods. This dissertation focuses on defining and exploiting a new method for multi-view image representation, from which the 3D geometry information is easily extractable, and which is additionally highly compressible. The method is based on sparse image representation using an overcomplete dictionary of geometric features, where a single image is represented as a linear combination of few fundamental image structure features (edges for example). We construct the dictionary by applying a unitary operator to an analytic function, which introduces a composition of geometric transforms (translations, rotation and anisotropic scaling) to that function. The advantage of this approach is that the features across multiple views can be related with a single composition of transforms. We then establish a connection between image components and scene geometry by defining the transforms that satisfy the multi-view geometry constraint, and obtain a new geometric multi-view correlation model. 
We first address the construction of dictionaries for images acquired by omnidirectional cameras, which are particularly convenient for scene representation due to their wide field of view. Since most omnidirectional images can be uniquely mapped to spherical images, we form a dictionary by applying motions on the sphere, rotations, and anisotropic scaling to a function that lives on the sphere. We have used this dictionary and a sparse approximation algorithm, Matching Pursuit, for compression of omnidirectional images, and additionally for coding 3D objects represented as spherical signals. Both methods offer better rate-distortion performance than state of the art schemes at low bit rates. The novel multi-view representation method and the dictionary on the sphere are then exploited for the design of a distributed coding method for multi-view omnidirectional images. In a distributed scenario, cameras compress acquired images without communicating with each other. Using a reliable model of correlation between views, distributed coding can achieve higher compression ratios than independent compression of each image. However, the lack of a proper model has been an obstacle for distributed coding in camera networks for many years. We propose to use our geometric correlation model for distributed multi-view image coding with side information. The encoder employs a coset coding strategy, developed by dictionary partitioning based on atom shape similarity and multi-view geometry constraints. Our method results in significant rate savings compared to independent coding. An additional contribution of the proposed correlation model is that it gives information about the scene geometry, leading to a new camera pose estimation method using an extremely small amount of data from each camera. Finally, we develop a method for learning stereo visual dictionaries based on the new multi-view image model. 
Although dictionary learning for still images has received a lot of attention recently, dictionary learning for stereo images has been investigated only sparingly. Our method maximizes the likelihood that a set of natural stereo images is efficiently represented with selected stereo dictionaries, where the multi-view geometry constraint is included in the probabilistic modeling. Experimental results demonstrate that including the geometric constraints in learning leads to stereo dictionaries that give both better distributed stereo matching and approximation properties than randomly selected dictionaries. We show that learning dictionaries for optimal scene representation based on the novel correlation model improves the camera pose estimation and that it can be beneficial for distributed coding.

    Estimation and Modeling Problems in Parametric Audio Coding

    Visual Feature Learning

    Categorization is a fundamental problem in many computer vision applications, e.g., image classification, pedestrian detection and face recognition. The robustness of a categorization system relies heavily on the quality of the features by which data are represented. Prior work on feature extraction can be grouped into levels, which, in bottom-up order, are low-level features (e.g., pixels and gradients) and middle/high-level features (e.g., the BoW model and sparse coding). Low-level features can be directly extracted from images or videos, while middle/high-level features are constructed upon low-level features, and are designed to enhance the capability of categorization systems based on different considerations (e.g., guaranteeing the domain-invariance and improving the discriminative power). This thesis focuses on the study of visual feature learning. Challenges that remain in designing visual features lie in intra-class variation, occlusions, illumination and view-point changes and insufficient prior knowledge. To address these challenges, I present several visual feature learning methods covering the following sub-topics: (i) I start by introducing a segmentation-based object recognition system. (ii) When training data are insufficient, I seek data from other resources, which include images or videos in a different domain, actions captured from a different viewpoint and information in a different media form. In order to appropriately transfer such resources into the target categorization system, four transfer learning-based feature learning methods are presented in this section, where cross-view, cross-domain and cross-modality scenarios are addressed accordingly. (iii) Finally, I present a random-forest based feature fusion method for multi-view action recognition.

    Application and Theory of Multimedia Signal Processing Using Machine Learning or Advanced Methods

    This Special Issue is a book that collects peer-reviewed papers on advanced technologies related to applications and theories of signal processing for multimedia systems using ML or advanced methods. Multimedia signals include image, video, audio, character recognition and optimization of communication channels for networks. The specific contents included in this book are data hiding, encryption, object detection, image classification, and character recognition. Academics and colleagues who are interested in these topics will find it interesting to read.

    Strategies for enhancing DC gain and settling performance of amplifiers

    The operational amplifier (op amp) is one of the most widely used and important building blocks in analog circuit design. High gain and high speed are two important properties of op amps because they determine the settling behavior of the op amps. As supply voltages decrease, the realization of high gain amplifiers with large Gain-Bandwidth-Products (GBW) has become challenging. The major focus in this dissertation is on the negative output impedance gain enhancement technique. The negative impedance gain enhancement technique offers potential for achieving very high gain and energy-efficient fast settling and is low-voltage compatible. Misconceptions that have limited the practical adoption of this gain enhancement technique are discussed. A new negative conductance gain enhancement technique was proposed. The proposed circuit generates a negative conductance with matching requirements for achieving very high DC gain that are less stringent than those for existing -g_m gain enhancement schemes. The proposed circuit has potential for precise digital control of a very large DC gain. A prototype fully differential CMOS operational amplifier was designed and fabricated based on the proposed gain enhancement technique. Experimental results which showed a DC gain of 85dB and an output swing of 876mVp-p validated the fundamental performance characteristics of this technique. In a separate section, a new amplifier architecture with bandpass feedforward compensation is presented. It is shown that a bandpass feedforward path can be used to substantially extend the unity-gain-frequency of an operational amplifier. Simulation results predict significant improvements in rise time and settling performance and show that the bandpass compensation scheme is reasonably robust. In the final section, a new technique for asynchronous data recovery based upon using a delay line in the incoming data path is introduced. 
The proposed data recovery system is well suited for tight tolerance channels and coding systems supporting standards that limit the maximum number of consecutive 0's and 1's in a data stream. This system does not require clock recovery, suffers no loss of data during acquisition, has a reduced sensitivity to jitter in the incoming data and does not exhibit jitter enhancement associated with VCO tracking in a PLL.

    Design of large polyphase filters in the Quadratic Residue Number System
