Scalable Coding of Video Objects
This paper provides a methodology to encode video objects in a scalable manner with regard to both content and quality. Content scalability and quality scalability have been identified as required features in order to support video coding across different environments. Following the object-based approach to coding video, we extend our previous work on motion-based segmentation by using a time-recursive approach to segmenting image sequences and decomposing a video "shot" into its constituent objects. Our formulation of the segmentation problem enables us to design a codec in which the information (shape, texture and motion) pertaining to each video object is encoded independently of the others. The multiresolution wavelet decomposition used in encoding texture information is shown to be helpful in providing spatial scalability. Our codec design is also shown to be temporally scalable. This report was accepted for oral presentation at the IEEE International Symposium on Circuits & Systems, Monterey, Calif., May-June 1998.
Live video streaming over packet networks and wireless channels
The transmission of live video over noisy channels requires very low end-to-end delay. Although automatic repeat request ensures lossless transmission, its usefulness for live video streaming is restricted to short connections because of the unbounded retransmission latency. An alternative is to use forward error correction (FEC). Since finding an optimal error protection strategy can be time-consuming, FEC systems are commonly designed for the worst-case condition of the channel, which limits the end-to-end performance. We study the suitability of two scalable FEC-based systems for the transmission of live video over packet networks. The first one uses Reed-Solomon codes and is appropriate for the Internet. The second one uses a product channel code and is appropriate for wireless channels. We show how fast and robust transmission can be achieved by exploiting a parametric model for the distortion-rate curve of the source coder and by using fast joint source-channel allocation algorithms. Experimental results for the 3D set partitioning in hierarchical trees (3D SPIHT) video coder show that the systems have good reconstruction quality even in severe channel conditions. Finally, we compare the performance of the systems to the state of the art for video transmission over the Internet.
Fast Random Access to Wavelet Compressed Volumetric Data Using Hashing
We present a new approach to lossy storage of the coefficients of wavelet transformed data. While it is common to store the coefficients of largest magnitude (and let all other coefficients be zero), we allow a slightly different set of coefficients to be stored. This brings into play a recently proposed hashing technique that allows space efficient storage and very efficient retrieval of coefficients. Our approach is applied to compression of volumetric data sets. For the "Visible Man" volume we obtain up to 80% improvement in compression ratio over previously suggested schemes. Further, the time for accessing a random voxel is quite competitive.
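The baseline mentioned in this abstract, keeping only the largest-magnitude wavelet coefficients and treating the rest as zero, can be illustrated with a minimal sketch. It assumes a one-dimensional orthonormal Haar transform and a plain Python dict as the sparse store; the function names are hypothetical, and the dict merely stands in for a hash table, not for the specific hashing technique the paper uses.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_forward(signal):
    """Full orthonormal Haar decomposition of a list whose length is a power of two."""
    coeffs = list(signal)
    n = len(coeffs)
    while n > 1:
        half = n // 2
        averages = [(coeffs[2 * i] + coeffs[2 * i + 1]) / SQRT2 for i in range(half)]
        details = [(coeffs[2 * i] - coeffs[2 * i + 1]) / SQRT2 for i in range(half)]
        coeffs[:n] = averages + details  # coarse part first, details after it
        n = half
    return coeffs

def haar_inverse(coeffs):
    """Inverse of haar_forward."""
    out = list(coeffs)
    n = 1
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged.append((a + d) / SQRT2)
            merged.append((a - d) / SQRT2)
        out[:2 * n] = merged
        n *= 2
    return out

def keep_largest(coeffs, k):
    """Sparse store (index -> value) holding the k largest-magnitude coefficients."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    return {i: coeffs[i] for i in order[:k]}

def reconstruct(sparse, length):
    """Rebuild the signal; coefficients missing from the store are taken as zero."""
    dense = [sparse.get(i, 0.0) for i in range(length)]
    return haar_inverse(dense)

if __name__ == "__main__":
    data = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
    store = keep_largest(haar_forward(data), 3)
    print(store)
    print(reconstruct(store, len(data)))
```

Because the transform is orthonormal, the squared reconstruction error equals the energy of the dropped coefficients, which is why keeping the largest magnitudes is the usual default.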
Video transmission over wireless networks
This work addresses the transmission of compressed video bitstreams over wireless networks. We first consider error control and power allocation for transmitting wireless video over CDMA networks in conjunction with multiuser detection. We map a layered video bitstream to several CDMA fading channels and inject multiple source/parity layers into each of these channels at the transmitter. We formulate a combined optimization problem and give the optimal joint rate and power allocation for the linear minimum mean-square error (MMSE) multiuser detector in the uplink and for two types of blind linear MMSE detectors in the downlink, namely the direct-matrix-inversion (DMI) blind detector and the subspace blind detector. We then present a multiple-channel video transmission scheme in wireless CDMA networks over multipath fading channels. For a given budget on the available bandwidth and total transmit power, the transmitter determines the optimal power allocations and the optimal transmission rates among multiple CDMA channels, as well as the optimal product channel code rate allocation. We also make use of results on the large-system CDMA performance for various multiuser receivers in multipath fading channels. We employ a fast joint source-channel coding algorithm to obtain the optimal product channel code structure. Finally, we propose an end-to-end architecture for multi-layer progressive video delivery over space-time differentially coded orthogonal frequency division multiplexing (STDC-OFDM) systems. We propose to use progressive joint source-channel coding to generate operational transmission distortion-power-rate (TD-PR) surfaces. By extending the rate-distortion function in source coding to the TD-PR surface in joint source-channel coding, our work can use the "equal slope" argument to effectively solve the transmission rate allocation problem as well as the transmission power allocation problem for multi-layer video transmission.
It is demonstrated through simulations that as the wireless channel conditions change, these proposed schemes can scale the video streams and transport the scaled video streams to receivers with a smooth change of perceptual quality.
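The "equal slope" argument referred to in this abstract can be illustrated, for rate alone, with a minimal sketch. It assumes each layer has a convex, decreasing distortion table indexed by discrete rate units, and it hands out the budget one unit at a time to whichever layer currently offers the steepest distortion drop. The function name and the tables are hypothetical; this is the classical rate-distortion version of the idea, not the TD-PR surface method of the thesis.

```python
import heapq

def allocate_rate(distortion_tables, budget):
    """Greedy marginal-gain allocation.

    distortion_tables[i][r] is the distortion of layer i when it receives
    r rate units. Each table is assumed convex and decreasing, so gains shrink
    as r grows and the greedy choice ends with (nearly) equal slopes.
    """
    alloc = [0] * len(distortion_tables)
    heap = []  # entries are (-gain of the next unit, layer index)
    for i, table in enumerate(distortion_tables):
        if len(table) > 1:
            heapq.heappush(heap, (-(table[0] - table[1]), i))
    for _ in range(budget):
        if not heap:
            break  # every layer is already at its maximum rate
        _, i = heapq.heappop(heap)
        alloc[i] += 1
        r = alloc[i]
        table = distortion_tables[i]
        if r + 1 < len(table):
            heapq.heappush(heap, (-(table[r] - table[r + 1]), i))
    return alloc

if __name__ == "__main__":
    tables = [
        [100.0, 50.0, 25.0, 12.5, 6.25],  # hypothetical layer 0
        [40.0, 20.0, 10.0, 5.0, 2.5],     # hypothetical layer 1
    ]
    print(allocate_rate(tables, 4))
```

With convex tables the greedy choice stops once no layer has a steeper remaining slope than another, which is the discrete counterpart of operating every layer at the same slope.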
On unifying sparsity and geometry for image-based 3D scene representation
Demand has emerged for next-generation visual technologies that go beyond conventional 2D imaging. Such technologies should capture and communicate all perceptually relevant three-dimensional information about an environment to a distant observer, providing a satisfying, immersive experience. Camera networks offer a low-cost solution to the acquisition of 3D visual information, by capturing multi-view images from different viewpoints. However, the camera's representation of the data is not ideal for common tasks such as data compression or 3D scene analysis, as it does not make the 3D scene geometry explicit. Image-based scene representations fundamentally require a multi-view image model that facilitates extraction of underlying geometrical relationships between the cameras and scene components. Developing new, efficient multi-view image models is thus one of the major challenges in image-based 3D scene representation methods. This dissertation focuses on defining and exploiting a new method for multi-view image representation, from which the 3D geometry information is easily extractable, and which is additionally highly compressible. The method is based on sparse image representation using an overcomplete dictionary of geometric features, where a single image is represented as a linear combination of a few fundamental image structure features (edges, for example). We construct the dictionary by applying a unitary operator to an analytic function, which introduces a composition of geometric transforms (translations, rotation and anisotropic scaling) to that function. The advantage of this approach is that the features across multiple views can be related with a single composition of transforms. We then establish a connection between image components and scene geometry by defining the transforms that satisfy the multi-view geometry constraint, and obtain a new geometric multi-view correlation model.
We first address the construction of dictionaries for images acquired by omnidirectional cameras, which are particularly convenient for scene representation due to their wide field of view. Since most omnidirectional images can be uniquely mapped to spherical images, we form a dictionary by applying motions on the sphere, rotations, and anisotropic scaling to a function that lives on the sphere. We have used this dictionary and a sparse approximation algorithm, Matching Pursuit, for compression of omnidirectional images, and additionally for coding 3D objects represented as spherical signals. Both methods offer better rate-distortion performance than state-of-the-art schemes at low bit rates. The novel multi-view representation method and the dictionary on the sphere are then exploited for the design of a distributed coding method for multi-view omnidirectional images. In a distributed scenario, cameras compress acquired images without communicating with each other. Using a reliable model of correlation between views, distributed coding can achieve higher compression ratios than independent compression of each image. However, the lack of a proper model has been an obstacle for distributed coding in camera networks for many years. We propose to use our geometric correlation model for distributed multi-view image coding with side information. The encoder employs a coset coding strategy, developed by dictionary partitioning based on atom shape similarity and multi-view geometry constraints. Our method results in significant rate savings compared to independent coding. An additional contribution of the proposed correlation model is that it gives information about the scene geometry, leading to a new camera pose estimation method using an extremely small amount of data from each camera. Finally, we develop a method for learning stereo visual dictionaries based on the new multi-view image model.
Although dictionary learning for still images has received a lot of attention recently, dictionary learning for stereo images has been investigated only sparingly. Our method maximizes the likelihood that a set of natural stereo images is efficiently represented with selected stereo dictionaries, where the multi-view geometry constraint is included in the probabilistic modeling. Experimental results demonstrate that including the geometric constraints in learning leads to stereo dictionaries that give both better distributed stereo matching and approximation properties than randomly selected dictionaries. We show that learning dictionaries for optimal scene representation based on the novel correlation model improves the camera pose estimation and that it can be beneficial for distributed coding.
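The sparse approximation step named in this abstract, Matching Pursuit over an overcomplete dictionary, can be sketched in a few lines. The sketch assumes unit-norm atoms stored as plain Python lists; the toy dictionary in four dimensions and all function names are hypothetical and only stand in for the geometric dictionaries on the sphere described in the dissertation.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def toy_dictionary():
    """Hypothetical overcomplete dictionary: 7 unit-norm atoms in 4 dimensions."""
    atoms = [[1.0 if i == j else 0.0 for i in range(4)] for j in range(4)]
    atoms.append(normalize([1.0, 1.0, 0.0, 0.0]))
    atoms.append(normalize([0.0, 1.0, 1.0, 0.0]))
    atoms.append(normalize([1.0, -1.0, 1.0, -1.0]))
    return atoms

def matching_pursuit(signal, dictionary, n_atoms):
    """Greedy sparse approximation.

    At each step, pick the atom with the largest absolute inner product with
    the residual, record its coefficient, and subtract its contribution.
    Returns the list of (atom index, coefficient) pairs and the final residual.
    """
    residual = list(signal)
    selected = []
    for _ in range(n_atoms):
        scores = [dot(residual, atom) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda j: abs(scores[j]))
        coeff = scores[best]
        selected.append((best, coeff))
        residual = [r - coeff * a for r, a in zip(residual, dictionary[best])]
    return selected, residual

if __name__ == "__main__":
    atoms = toy_dictionary()
    picks, res = matching_pursuit([3.0, 1.0, -2.0, 0.5], atoms, 3)
    print(picks)
    print(math.sqrt(dot(res, res)))
```

Since the atoms have unit norm, each step removes exactly the squared coefficient from the residual energy, so the residual never grows as more atoms are selected.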
Attractor image coding with low blocking effects.
by Ho, Hau Lai.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1997.
Includes bibliographical references (leaves 97-103).
Chapter 1 --- Introduction
Chapter 1.1 --- Overview of Attractor Image Coding
Chapter 1.2 --- Scope of Thesis
Chapter 2 --- Fundamentals of Attractor Coding
Chapter 2.1 --- Notations
Chapter 2.2 --- Mathematical Preliminaries
Chapter 2.3 --- Partitioned Iterated Function Systems
Chapter 2.3.1 --- Mathematical Formulation of the PIFS
Chapter 2.4 --- Attractor Coding using the PIFS
Chapter 2.4.1 --- Quadtree Partitioning
Chapter 2.4.2 --- Inclusion of an Orthogonalization Operator
Chapter 2.5 --- Coding Examples
Chapter 2.5.1 --- Evaluation Criterion
Chapter 2.5.2 --- Experimental Settings
Chapter 2.5.3 --- Results and Discussions
Chapter 2.6 --- Summary
Chapter 3 --- Attractor Coding with Adjacent Block Parameter Estimations
Chapter 3.1 --- δ-Minimum Edge Difference
Chapter 3.1.1 --- Definition
Chapter 3.1.2 --- Theoretical Analysis
Chapter 3.2 --- Adjacent Block Parameter Estimation Scheme
Chapter 3.2.1 --- Joint Optimization
Chapter 3.2.2 --- Predictive Coding
Chapter 3.3 --- Algorithmic Descriptions of the Proposed Scheme
Chapter 3.4 --- Experimental Results
Chapter 3.5 --- Summary
Chapter 4 --- Attractor Coding using Lapped Partitioned Iterated Function Systems
Chapter 4.1 --- Lapped Partitioned Iterated Function Systems
Chapter 4.1.1 --- Weighting Operator
Chapter 4.1.2 --- Mathematical Formulation of the LPIFS
Chapter 4.2 --- Attractor Coding using the LPIFS
Chapter 4.2.1 --- Choice of Weighting Operator
Chapter 4.2.2 --- Range Block Preprocessing
Chapter 4.2.3 --- Decoder Convergence Analysis
Chapter 4.3 --- Local Domain Block Searching
Chapter 4.3.1 --- Theoretical Foundation
Chapter 4.3.2 --- Local Block Searching Algorithm
Chapter 4.4 --- Experimental Results
Chapter 4.5 --- Summary
Chapter 5 --- Conclusion
Chapter 5.1 --- Original Contributions
Chapter 5.2 --- Subjects for Future Research
Chapter A --- Fundamental Definitions
Chapter B --- Appendix B
Bibliography
Image coding using wavelets, interval wavelets and multi-layered wedgelets
Ph.D., Doctor of Philosophy