175 research outputs found

    Low Bit-rate Color Video Compression using Multiwavelets in Three Dimensions

    Get PDF
    In recent years, wavelet-based video compressions have become a major focus of research because of the advantages that it provides. More recently, a growing thrust of studies explored the use of multiple scaling functions and multiple wavelets with desirable properties in various fields, from image de-noising to compression. In term of data compression, multiple scaling functions and wavelets offer a greater flexibility in coefficient quantization at high compression ratio than a comparable single wavelet. The purpose of this research is to investigate the possible improvement of scalable wavelet-based color video compression at low bit-rates by using three-dimensional multiwavelets. The first part of this work included the development of the spatio-temporal decomposition process for multiwavelets and the implementation of an efficient 3-D SPIHT encoder/decoder as a common platform for performance evaluation of two well-known multiwavelet systems against a comparable single wavelet in low bitrate color video compression. The second part involved the development of a motion-compensated 3-D compression codec and a modified SPIHT algorithm designed specifically for this codec by incorporating an advantage in the design of 2D SPIHT into the 3D SPIHT coder. In an experiment that compared their performances, the 3D motion-compensated codec with unmodified 3D SPIHT had gains of 0.3dB to 4.88dB over regular 2D wavelet-based motion-compensated codec using 2D SPIHT in the coding of 19 endoscopy sequences at 1/40 compression ratio. The effectiveness of the modified SPIHT algorithm was verified by the results of a second experiment in which it was used to re-encode 4 of the 19 sequences with lowest performance gains and improved them by 0.5dB to 1.0dB. The last part of the investigation examined the effect of multiwavelet packet on 3-D video compression as well as the effects of coding multiwavelet packets based on the frequency order and energy content of individual subbands

    Efficient compression of motion compensated residuals

    Get PDF
    EThOS - Electronic Theses Online ServiceGBUnited Kingdo

    Toward sparse and geometry adapted video approximations

    Get PDF
    Video signals are sequences of natural images, where images are often modeled as piecewise-smooth signals. Hence, video can be seen as a 3D piecewise-smooth signal made of piecewise-smooth regions that move through time. Based on the piecewise-smooth model and on related theoretical work on rate-distortion performance of wavelet and oracle based coding schemes, one can better analyze the appropriate coding strategies that adaptive video codecs need to implement in order to be efficient. Efficient video representations for coding purposes require the use of adaptive signal decompositions able to capture appropriately the structure and redundancy appearing in video signals. Adaptivity needs to be such that it allows for proper modeling of signals in order to represent these with the lowest possible coding cost. Video is a very structured signal with high geometric content. This includes temporal geometry (normally represented by motion information) as well as spatial geometry. Clearly, most of past and present strategies used to represent video signals do not exploit properly its spatial geometry. Similarly to the case of images, a very interesting approach seems to be the decomposition of video using large over-complete libraries of basis functions able to represent salient geometric features of the signal. In the framework of video, these features should model 2D geometric video components as well as their temporal evolution, forming spatio-temporal 3D geometric primitives. Through this PhD dissertation, different aspects on the use of adaptivity in video representation are studied looking toward exploiting both aspects of video: its piecewise nature and the geometry. The first part of this work studies the use of localized temporal adaptivity in subband video coding. This is done considering two transformation schemes used for video coding: 3D wavelet representations and motion compensated temporal filtering. A theoretical R-D analysis as well as empirical results demonstrate how temporal adaptivity improves coding performance of moving edges in 3D transform (without motion compensation) based video coding. Adaptivity allows, at the same time, to equally exploit redundancy in non-moving video areas. The analogy between motion compensated video and 1D piecewise-smooth signals is studied as well. This motivates the introduction of local length adaptivity within frame-adaptive motion compensated lifted wavelet decompositions. This allows an optimal rate-distortion performance when video motion trajectories are shorter than the transformation "Group Of Pictures", or when efficient motion compensation can not be ensured. After studying temporal adaptivity, the second part of this thesis is dedicated to understand the fundamentals of how can temporal and spatial geometry be jointly exploited. This work builds on some previous results that considered the representation of spatial geometry in video (but not temporal, i.e, without motion). In order to obtain flexible and efficient (sparse) signal representations, using redundant dictionaries, the use of highly non-linear decomposition algorithms, like Matching Pursuit, is required. General signal representation using these techniques is still quite unexplored. For this reason, previous to the study of video representation, some aspects of non-linear decomposition algorithms and the efficient decomposition of images using Matching Pursuits and a geometric dictionary are investigated. A part of this investigation concerns the study on the influence of using a priori models within approximation non-linear algorithms. Dictionaries with a high internal coherence have some problems to obtain optimally sparse signal representations when used with Matching Pursuits. It is proved, theoretically and empirically, that inserting in this algorithm a priori models allows to improve the capacity to obtain sparse signal approximations, mainly when coherent dictionaries are used. Another point discussed in this preliminary study, on the use of Matching Pursuits, concerns the approach used in this work for the decompositions of video frames and images. The technique proposed in this thesis improves a previous work, where authors had to recur to sub-optimal Matching Pursuit strategies (using Genetic Algorithms), given the size of the functions library. In this work the use of full search strategies is made possible, at the same time that approximation efficiency is significantly improved and computational complexity is reduced. Finally, a priori based Matching Pursuit geometric decompositions are investigated for geometric video representations. Regularity constraints are taken into account to recover the temporal evolution of spatial geometric signal components. The results obtained for coding and multi-modal (audio-visual) signal analysis, clarify many unknowns and show to be promising, encouraging to prosecute research on the subject

    Joint coding/decoding techniques and diversity techniques for video and HTML transmission over wireless point/multipoint: a survey

    Get PDF
    I. Introduction The concomitant developments of the Internet, which offers to its users always larger and more evolved contents (from HTML (HyperText Markup Language) files to multimedia applications), and of wireless systems and handhelds integrating them, have progressively convinced a fair share of people of the interest to always be connected. Still, constraints of heterogeneity, reliability, quality and delay over the transmission channels are generally imposed to fulfill the requirements of these new needs and their corresponding economical goals. This implies different theoretical and practical challenges for the digital communications community of the present time. This paper presents a survey of the different techniques existing in the domain of HTML and video stream transmission over erroneous or lossy channels. In particular, the existing techniques on joint source and channel coding and decoding for multimedia or HTML applications are surveyed, as well as the related problems of streaming and downloading files over an IP mobile link. Finally, various diversity techniques that can be considered for such links, from antenna diversity to coding diversity, are presented...L’engouement du grand public pour les applications multimédia sans fil ne cesse de croître depuis le développement d’Internet. Des contraintes d’hétérogénéité de canaux de transmission, de fiabilité, de qualité et de délai sont généralement exigées pour satisfaire les nouveaux besoins applicatifs entraînant ainsi des enjeux économiques importants. À l’heure actuelle, il reste encore un certain nombre de défis pratiques et théoriques lancés par les chercheurs de la communauté des communications numériques. C’est dans ce cadre que s’inscrit le panorama présenté ici. Cet article présente d’une part un état de l’art sur les principales techniques de codage et de décodage conjoint développées dans la littérature pour des applications multimédia de type téléchargement et diffusion de contenu sur lien mobile IP. Sont tout d’abord rappelées des notions fondamentales des communications numériques à savoir le codage de source, le codage de canal ainsi que les théorèmes de Shannon et leurs principales limitations. Les techniques de codage décodage conjoint présentées dans cet article concernent essentiellement celles développées pour des schémas de codage de source faisant intervenir des codes à longueur variable (CLV) notamment les codes d’Huffman, arithmétiques et les codes entropiques universels de type Lempel-Ziv (LZ). Faisant face au problème de la transmission de données (Hypertext Markup Language (HTML) et vidéo) sur un lien sans fil, cet article présente d’autre part un panorama de techniques de diversités plus ou moins complexes en vue d’introduire le nouveau système à multiples antennes d’émission et de réception

    On unifying sparsity and geometry for image-based 3D scene representation

    Get PDF
    Demand has emerged for next generation visual technologies that go beyond conventional 2D imaging. Such technologies should capture and communicate all perceptually relevant three-dimensional information about an environment to a distant observer, providing a satisfying, immersive experience. Camera networks offer a low cost solution to the acquisition of 3D visual information, by capturing multi-view images from different viewpoints. However, the camera's representation of the data is not ideal for common tasks such as data compression or 3D scene analysis, as it does not make the 3D scene geometry explicit. Image-based scene representations fundamentally require a multi-view image model that facilitates extraction of underlying geometrical relationships between the cameras and scene components. Developing new, efficient multi-view image models is thus one of the major challenges in image-based 3D scene representation methods. This dissertation focuses on defining and exploiting a new method for multi-view image representation, from which the 3D geometry information is easily extractable, and which is additionally highly compressible. The method is based on sparse image representation using an overcomplete dictionary of geometric features, where a single image is represented as a linear combination of few fundamental image structure features (edges for example). We construct the dictionary by applying a unitary operator to an analytic function, which introduces a composition of geometric transforms (translations, rotation and anisotropic scaling) to that function. The advantage of this approach is that the features across multiple views can be related with a single composition of transforms. We then establish a connection between image components and scene geometry by defining the transforms that satisfy the multi-view geometry constraint, and obtain a new geometric multi-view correlation model. We first address the construction of dictionaries for images acquired by omnidirectional cameras, which are particularly convenient for scene representation due to their wide field of view. Since most omnidirectional images can be uniquely mapped to spherical images, we form a dictionary by applying motions on the sphere, rotations, and anisotropic scaling to a function that lives on the sphere. We have used this dictionary and a sparse approximation algorithm, Matching Pursuit, for compression of omnidirectional images, and additionally for coding 3D objects represented as spherical signals. Both methods offer better rate-distortion performance than state of the art schemes at low bit rates. The novel multi-view representation method and the dictionary on the sphere are then exploited for the design of a distributed coding method for multi-view omnidirectional images. In a distributed scenario, cameras compress acquired images without communicating with each other. Using a reliable model of correlation between views, distributed coding can achieve higher compression ratios than independent compression of each image. However, the lack of a proper model has been an obstacle for distributed coding in camera networks for many years. We propose to use our geometric correlation model for distributed multi-view image coding with side information. The encoder employs a coset coding strategy, developed by dictionary partitioning based on atom shape similarity and multi-view geometry constraints. Our method results in significant rate savings compared to independent coding. An additional contribution of the proposed correlation model is that it gives information about the scene geometry, leading to a new camera pose estimation method using an extremely small amount of data from each camera. Finally, we develop a method for learning stereo visual dictionaries based on the new multi-view image model. Although dictionary learning for still images has received a lot of attention recently, dictionary learning for stereo images has been investigated only sparingly. Our method maximizes the likelihood that a set of natural stereo images is efficiently represented with selected stereo dictionaries, where the multi-view geometry constraint is included in the probabilistic modeling. Experimental results demonstrate that including the geometric constraints in learning leads to stereo dictionaries that give both better distributed stereo matching and approximation properties than randomly selected dictionaries. We show that learning dictionaries for optimal scene representation based on the novel correlation model improves the camera pose estimation and that it can be beneficial for distributed coding

    Intelligent Sensor Networks

    Get PDF
    In the last decade, wireless or wired sensor networks have attracted much attention. However, most designs target general sensor network issues including protocol stack (routing, MAC, etc.) and security issues. This book focuses on the close integration of sensing, networking, and smart signal processing via machine learning. Based on their world-class research, the authors present the fundamentals of intelligent sensor networks. They cover sensing and sampling, distributed signal processing, and intelligent signal learning. In addition, they present cutting-edge research results from leading experts

    Wavelet Theory

    Get PDF
    The wavelet is a powerful mathematical tool that plays an important role in science and technology. This book looks at some of the most creative and popular applications of wavelets including biomedical signal processing, image processing, communication signal processing, Internet of Things (IoT), acoustical signal processing, financial market data analysis, energy and power management, and COVID-19 pandemic measurements and calculations. The editor’s personal interest is the application of wavelet transform to identify time domain changes on signals and corresponding frequency components and in improving power amplifier behavior

    Proceedings of the Mobile Satellite Conference

    Get PDF
    A satellite-based mobile communications system provides voice and data communications to mobile users over a vast geographic area. The technical and service characteristics of mobile satellite systems (MSSs) are presented and form an in-depth view of the current MSS status at the system and subsystem levels. Major emphasis is placed on developments, current and future, in the following critical MSS technology areas: vehicle antennas, networking, modulation and coding, speech compression, channel characterization, space segment technology and MSS experiments. Also, the mobile satellite communications needs of government agencies are addressed, as is the MSS potential to fulfill them
    • …
    corecore