1,227 research outputs found
Variable Block Size Motion Compensation In The Redundant Wavelet Domain
Video is one of the most powerful forms of multimedia because of the extensive information it delivers. Video sequences are highly correlated both temporally and spatially, which is what makes video compression possible. Modern video systems employ motion estimation and motion compensation (ME/MC) to de-correlate a video sequence temporally. ME/MC forms a prediction of the current frame from frames that have already been encoded. Consequently, the encoder transmits the corresponding residual image instead of the original frame, together with a set of motion vectors that describe the scene motion as observed at the encoder. The redundant wavelet transform (RDWT) provides several advantages over the conventional discrete wavelet transform (DWT). The RDWT overcomes the shift-variance problem of the DWT. Moreover, the RDWT retains all the phase information of the wavelet coefficients and provides multiple prediction possibilities for ME/MC in the wavelet domain. The general idea of variable size block motion compensation (VSBMC) is to partition a frame so that regions with uniform translational motion are divided into larger blocks and regions containing complicated motion into smaller blocks, leading to an adaptive distribution of motion vectors (MV) across the frame. The research proposed new adaptive partitioning schemes and decision criteria in the RDWT domain that exploit the motion content of a frame more effectively through various block sizes. The research also proposed a selective subpixel accuracy algorithm for the motion vector using a multiband approach. The selective subpixel accuracy reduces the computation required by the conventional subpixel algorithm while maintaining the same accuracy. In addition, overlapped block motion compensation (OBMC) is used to reduce blocking artifacts. Finally, the research extends the proposed VSBMC to 3D video sequences.
The experimental results have shown that VSBMC in the RDWT domain can be a powerful tool for video compression.
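The block-based ME/MC step described in the abstract above can be illustrated with a minimal sketch. It uses a fixed block size, an exhaustive integer-pel search and a sum-of-absolute-differences (SAD) cost in the pixel domain; these are illustrative assumptions, not the variable-block RDWT scheme of the thesis, and the names `sad` and `full_search` are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of `cur` at (bx, by) and the block of
    `ref` displaced by the candidate motion vector (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, radius):
    """Exhaustive search over all integer displacements within `radius`.
    Returns (dx, dy, cost) of the best match that lies inside `ref`."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Skip candidates whose reference block falls outside the frame.
            if bx + dx < 0 or by + dy < 0 or bx + dx + n > w or by + dy + n > h:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


# Toy frames: the reference is a ramp, the current frame is the reference
# displaced by (2, 1) with wrap-around at the borders.
ref = [[10 * y + x for x in range(8)] for y in range(8)]
cur = [[ref[(y + 1) % 8][(x + 2) % 8] for x in range(8)] for y in range(8)]
print(full_search(cur, ref, 2, 2, 4, 2))
```

The residual that would be coded is the difference between the current block and the matched reference block; a variable-block scheme would additionally decide, per region, whether to split a block into smaller ones.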
Multiple Vector-Based MEMC and Deep CNN for Video Frame Interpolation
Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, 2019. 2. Hyuk-Jae Lee.
Block-based hierarchical motion estimation is widely used and succeeds in generating high-quality interpolation. However, it still fails to estimate the motion of small objects when the background region moves in a different direction. This is because the motion of small objects is neglected by the down-sampling and over-smoothing operations at the top level of the image pyramid in the maximum a posteriori (MAP) method. Consequently, the motion vector of small objects cannot be detected at the bottom level, and the small objects often appear deformed in the interpolated frame. This thesis proposes a novel algorithm that preserves the motion vector of small objects by adding a secondary motion vector candidate that represents their movement. This additional candidate is always propagated from the top to the bottom layer of the image pyramid. Experimental results demonstrate that the intermediate frame interpolated by the proposed algorithm has significantly better visual quality than conventional MAP-based frame interpolation.
In motion-compensated frame interpolation, a repetition pattern in an image makes it difficult to derive an accurate motion vector because multiple similar local minima exist in the search space of the matching cost for motion estimation. To improve the accuracy of motion estimation in a repetition region, this thesis attempts a semi-global approach that exploits both local and global characteristics of the region. A histogram of the motion vector candidates is built with a voter-based voting system, which is more reliable than an elector-based voting system. Experimental results demonstrate that the proposed method significantly outperforms the previous local approach in terms of both objective peak signal-to-noise ratio (PSNR) and subjective visual quality.
In video frame interpolation, or motion-compensated frame rate up-conversion (MC-FRUC), motion compensation along unidirectional motion trajectories directly causes overlap and hole issues. To solve these issues, this research presents a new algorithm for bidirectional motion-compensated frame interpolation. First, the proposed method generates bidirectional motion vectors from two unidirectional motion vector fields (forward and backward) obtained from unidirectional motion estimation. This is done by projecting the forward and backward motion vectors into the interpolated frame. A comprehensive metric, an extension of the distance between a projected block and an interpolated block, is proposed to compute weighting coefficients when the interpolated block has multiple projected blocks. Holes are filled with a vector median filter over the available non-hole neighbor blocks. The proposed method outperforms existing MC-FRUC methods and significantly reduces block artifacts.
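The hole-filling step mentioned above can be sketched as follows. This is a minimal illustration that assumes an 8-connected neighborhood and Euclidean distance between vectors; the function names and the `None` convention for hole blocks are hypothetical, and the thesis may define the filter differently.

```python
import math


def vector_median(vectors):
    """Return the input vector with the smallest summed Euclidean distance
    to all the other input vectors (the classical vector median)."""
    def total_distance(v):
        return sum(math.hypot(v[0] - u[0], v[1] - u[1]) for u in vectors)
    return min(vectors, key=total_distance)


def fill_holes(field):
    """field: 2D list of (dx, dy) motion vectors, with None marking holes.
    Each hole receives the vector median of its non-hole neighbors."""
    rows, cols = len(field), len(field[0])
    filled = [row[:] for row in field]
    for r in range(rows):
        for c in range(cols):
            if field[r][c] is not None:
                continue
            neighbors = [
                field[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if field[rr][cc] is not None
            ]
            if neighbors:  # leave the hole untouched if nothing is available
                filled[r][c] = vector_median(neighbors)
    return filled


field = [
    [(1, 0), (1, 0), (1, 0)],
    [(1, 0), None, (1, 0)],
    [(1, 0), (1, 0), (7, 7)],
]
print(fill_holes(field)[1][1])
```

Unlike a component-wise mean, the vector median always returns one of the neighboring vectors, so a single outlier such as `(7, 7)` does not pull the filled vector away from the dominant motion.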
Video frame interpolation with a deep convolutional neural network (CNN) is also investigated in this thesis. Optical flow estimation and video frame interpolation form a chicken-and-egg problem in which each problem affects the other. This thesis presents a stack of networks trained to estimate intermediate optical flows from a first synthesized intermediate frame; the final interpolated frame is then generated by a second synthesis network whose input stacks that first frame with the two frames warped by the learned intermediate optical flows. The primary benefit is that the two problems are combined into one comprehensive framework that is learned jointly, using an analysis-by-synthesis technique for optical flow estimation and, conversely, CNN-kernel-based synthesis-by-analysis. The proposed network is the first attempt to bridge the two branches of previous approaches, optical-flow-based synthesis and CNN-kernel-based synthesis, in a single network. Experiments on various challenging datasets all show that the proposed network outperforms state-of-the-art methods by significant margins for video frame interpolation and that the estimated optical flows are accurate for challenging movements. The proposed deep video frame interpolation network is also applied as post-processing to improve the coding efficiency of the state-of-the-art video compression standard, HEVC/H.265, and experimental results demonstrate its effectiveness.
Abstract
Table of Contents
List of Tables
List of Figures
Chapter 1. Introduction
1.1. Hierarchical Motion Estimation of Small Objects
1.2. Motion Estimation of a Repetition Pattern Region
1.3. Motion-Compensated Frame Interpolation
1.4. Video Frame Interpolation with Deep CNN
1.5. Outline of the Thesis
Chapter 2. Previous Works
2.1. Previous Works on Hierarchical Block-Based Motion Estimation
2.1.1. Maximum a Posteriori (MAP) Framework
2.1.2. Hierarchical Motion Estimation
2.2. Previous Works on Motion Estimation for a Repetition Pattern Region
2.3. Previous Works on Motion Compensation
2.4. Previous Works on Video Frame Interpolation with Deep CNN
Chapter 3. Hierarchical Motion Estimation for Small Objects
3.1. Problem Statement
3.2. The Alternative Motion Vector of High Cost Pixels
3.3. Modified Hierarchical Motion Estimation
3.4. Framework of the Proposed Algorithm
3.5. Experimental Results
3.5.1. Performance Analysis
3.5.2. Performance Evaluation
Chapter 4. Semi-Global Accurate Motion Estimation for a Repetition Pattern Region
4.1. Problem Statement
4.2. Objective Function and Constraints
4.3. Elector based Voting System
4.4. Voter based Voting System
4.5. Experimental Results
Chapter 5. Multiple Motion Vectors based Motion Compensation
5.1. Problem Statement
5.2. Adaptive Weighted Multiple Motion Vectors based Motion Compensation
5.2.1. One-to-Multiple Motion Vector Projection
5.2.2. A Comprehensive Metric as the Extension of Distance
5.3. Handling Hole Blocks
5.4. Framework of the Proposed Motion Compensated Frame Interpolation
5.5. Experimental Results
Chapter 6. Video Frame Interpolation with a Stack of Deep CNN
6.1. Problem Statement
6.2. The Proposed Network for Video Frame Interpolation
6.2.1. A Stack of Synthesis Networks
6.2.2. Intermediate Optical Flow Derivation Module
6.2.3. Warping Operations
6.2.4. Training and Loss Function
6.2.5. Network Architecture
6.2.6. Experimental Results
6.2.6.1. Frame Interpolation Evaluation
6.2.6.2. Ablation Experiments
6.3. Extension for Quality Enhancement for Compressed Videos Task
6.4. Extension for Improving the Coding Efficiency of HEVC based Low Bitrate Encoder
Chapter 7. Conclusion
References
Automatic aerial target detection and tracking system in airborne FLIR images based on efficient target trajectory filtering
Common strategies for detection and tracking of aerial moving targets in airborne Forward-Looking Infrared
(FLIR) images offer accurate results in images composed of a non-textured sky. However, when cloud and
earth regions appear in the image sequence, those strategies over-detect, which significantly increases the
false alarm rate. In addition, the airborne camera induces a global motion in the image sequence
that further complicates the detection and tracking tasks. In this work, an automatic detection and tracking
system with an innovative and efficient target trajectory filtering is presented. It robustly compensates the
global motion to accurately detect and track potential aerial targets. Their trajectories are analyzed by a curve
fitting technique to reliably validate real targets. This strategy makes it possible to filter out false targets with
stationary or erratic trajectories. The proposed system places special emphasis on low-complexity video analysis
techniques to achieve real-time operation. Experimental results using real FLIR sequences show a dramatic
reduction of the false alarm rate while maintaining the detection rate.
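The trajectory filtering idea can be sketched as follows, assuming a straight-line least-squares fit per coordinate and two hypothetical thresholds (`min_speed`, `max_rms`). The abstract only states that a curve fitting technique is used, so the model order, the thresholds and the function names here are illustrative assumptions.

```python
def fit_line(ts, vs):
    """Least-squares line v = slope * t + intercept.
    Returns the slope and the RMS residual of the fit."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_v = sum(vs) / n
    stt = sum((t - mean_t) ** 2 for t in ts)
    slope = sum((t - mean_t) * (v - mean_v) for t, v in zip(ts, vs)) / stt
    intercept = mean_v - slope * mean_t
    sq_err = sum((v - (slope * t + intercept)) ** 2 for t, v in zip(ts, vs))
    return slope, (sq_err / n) ** 0.5


def is_valid_target(track, min_speed=0.5, max_rms=1.0):
    """track: list of (x, y) positions, one per frame, after global motion
    compensation. Rejects stationary tracks (fitted speed too low) and
    erratic tracks (fit residual too high)."""
    ts = list(range(len(track)))
    slope_x, rms_x = fit_line(ts, [p[0] for p in track])
    slope_y, rms_y = fit_line(ts, [p[1] for p in track])
    speed = (slope_x ** 2 + slope_y ** 2) ** 0.5
    return speed >= min_speed and max(rms_x, rms_y) <= max_rms


moving = [(2 * t, t) for t in range(6)]
stationary = [(5, 5)] * 6
erratic = [(0, 0), (10, -8), (1, 9), (12, 0), (-3, 7), (9, -6)]
print(is_valid_target(moving), is_valid_target(stationary), is_valid_target(erratic))
```

A closed-form fit of this kind costs a handful of sums per track, which fits the stated emphasis on low-complexity analysis.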
DCT-based video frame-skipping transcoder
Refereed conference paper (academic research: refereed, 2002-2003). Version of Record.
Motion Estimation and Compensation in the Redundant Wavelet Domain
Despite being the preferred approach for still-image compression for nearly a decade, wavelet-based coding for video has been slow to emerge, due primarily to the fact that the shift variance of the discrete wavelet transform hinders the motion estimation and compensation crucial to modern video coders. Recently it has been recognized that a redundant, or overcomplete, wavelet transform is shift invariant and thus permits motion prediction in the wavelet domain. In this dissertation, other uses for the redundancy of overcomplete wavelet transforms in video coding are explored. First, it is demonstrated that the redundant-wavelet domain facilitates the placement of an irregular triangular mesh on video images, thereby exploiting transform redundancy to implement geometries for motion estimation and compensation more general than the widely employed traditional block structure. As the second contribution of this dissertation, a new form of multihypothesis prediction, redundant wavelet multihypothesis, is presented. This new approach to motion estimation and compensation produces motion predictions that are diverse in transform phase to increase prediction accuracy. Finally, it is demonstrated that the proposed redundant-wavelet strategies complement existing advanced video-coding techniques and produce significant performance improvements in a battery of experimental results.
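The shift variance that the abstract refers to can be seen in a small one-level example. The sketch below assumes an unnormalized Haar high-pass filter and circular signal extension; it compares the critically sampled detail band with its undecimated (redundant) counterpart. The function names are hypothetical.

```python
def haar_detail_decimated(x):
    """One-level Haar detail band with downsampling by 2 (critically sampled)."""
    return [(x[2 * k] - x[2 * k + 1]) / 2 for k in range(len(x) // 2)]


def haar_detail_redundant(x):
    """Same filter without downsampling, using circular extension."""
    n = len(x)
    return [(x[i] - x[(i + 1) % n]) / 2 for i in range(n)]


def circular_shift(x, s):
    """Shift the sequence right by s samples with wrap-around."""
    s %= len(x)
    return x[-s:] + x[:-s] if s else x[:]


x = [0, 0, 4, 4, 0, 0, 0, 0]
x_shifted = circular_shift(x, 1)

# Decimated band: a one-sample shift changes the coefficients entirely.
print(haar_detail_decimated(x), haar_detail_decimated(x_shifted))

# Redundant band: the coefficients of the shifted signal are simply the
# shifted coefficients of the original signal.
print(haar_detail_redundant(x_shifted) == circular_shift(haar_detail_redundant(x), 1))
```

In the decimated case the edge energy appears or disappears depending on its alignment with the sampling grid, which is why block matching on critically sampled coefficients is unreliable, whereas the redundant coefficients move with the signal.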
Improved quality block-based low bit rate video coding.
The aim of this research is to develop algorithms for enhancing the subjective quality and coding efficiency of standard block-based video coders. In the past few years, numerous video coding standards based on a motion-compensated block-transform structure have been established, in which block-based motion estimation reduces the correlation between consecutive images and a block transform codes the resulting motion-compensated residual images. Due to the use of predictive differential coding and variable length coding techniques, the output data rate exhibits extreme fluctuations. A rate control algorithm is devised for achieving a stable output data rate. This rate control algorithm, which is essentially a bit-rate estimation algorithm, is then employed in a bit-allocation algorithm for improving the visual quality of the coded images, based on some prior knowledge of the images. Block-based hybrid coders achieve a high compression ratio mainly because of the motion estimation and compensation stage in the coding process. The conventional bit-allocation strategy for these coders simply assigns the bits required by the motion vectors and gives the rest to the residual image. However, at very low bit rates this strategy is inadequate, as the motion vector bits take up a considerable portion of the total bit rate. A rate-constrained selection algorithm is presented in which an analysis-by-synthesis approach is used to choose the best motion vectors in terms of resulting bit rate and image quality. This selection algorithm is then implemented for mode selection. A simple algorithm based on the above-mentioned bit-rate estimation algorithm is developed for the latter to reduce the computational complexity. For very low bit-rate applications, it is well known that block-based coders suffer from blocking artifacts.
A coding mode is presented for reducing these annoying artifacts by coding a down-sampled version of the residual image with a smaller quantisation step size. Its applications for adaptive source/channel coding and for coding fast-changing sequences are examined.
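Rate-constrained motion vector selection is commonly expressed as minimizing a Lagrangian cost J = D + lambda * R over the candidate vectors. The sketch below assumes that formulation, which the abstract does not spell out; the candidate tuples, the multiplier values and the function name are hypothetical.

```python
def select_motion_vector(candidates, lam):
    """candidates: list of (motion_vector, distortion, bits) tuples, where
    distortion is measured after reconstruction and bits is the cost of
    coding the vector plus its residual. Returns the vector that minimizes
    J = distortion + lam * bits."""
    best = min(candidates, key=lambda c: c[1] + lam * c[2])
    return best[0]


candidates = [
    ((0, 0), 120.0, 2),    # cheap to code, poor prediction
    ((3, -1), 100.0, 12),  # best prediction, expensive to code
    ((2, -1), 104.0, 6),   # compromise
]

for lam in (0, 1, 10):
    print(lam, select_motion_vector(candidates, lam))
```

With `lam = 0` the choice is driven purely by distortion; as `lam` grows, cheaper vectors win, which matches the observation that motion vector bits matter most at very low bit rates.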
A family of stereoscopic image compression algorithms using wavelet transforms
With the standardization of JPEG-2000, wavelet-based image and video
compression technologies are gradually replacing the popular DCT-based methods. In
parallel to this, recent developments in autostereoscopic display technology are now
threatening to revolutionize the way in which consumers are used to enjoying the
traditional 2D display based electronic media such as television, computer and
movies. However, due to the two-fold bandwidth/storage space requirement of
stereoscopic imaging, an essential requirement of a stereo imaging system is efficient
data compression.
In this thesis, seven wavelet-based stereo image compression algorithms are
proposed, to take advantage of the higher data compaction capability and better
flexibility of wavelets. In the proposed CODEC I, block-based disparity
estimation/compensation (DE/DC) is performed in pixel domain. However, this
results in an inefficiency when DWT is applied on the whole predictive error image
that results from the DE process. This is because of the existence of artificial block
boundaries between error blocks in the predictive error image. To overcome this
problem, in the remaining proposed CODECs, DE/DC is performed in the wavelet
domain. Due to the multiresolution nature of the wavelet domain, two methods of
disparity estimation and compensation have been proposed. The first method is
performing DE/DC in each subband of the lowest/coarsest resolution level and then
propagating the disparity vectors obtained to the corresponding subbands of
higher/finer resolution. Note that DE is not performed in every subband due to the
high overhead bits that could be required for the coding of disparity vectors of all
subbands. This method is used in CODEC II. In the second method, DE/DC is
performed in the wavelet-block domain. This enables disparity estimation to be
performed in all subbands simultaneously without increasing the overhead bits
required for the coding disparity vectors. This method is used by CODEC III.
However, performing disparity estimation/compensation in all subbands would result
in a significant improvement of CODEC III. To further improve the performance of
CODEC III, a pioneering wavelet-block search technique is implemented in CODEC
IV. The pioneering wavelet-block search technique enables the right/predicted image
to be reconstructed at the decoder end without the need of transmitting the disparity
vectors. In proposed CODEC V, pioneering block search is performed in all subbands
of DWT decomposition which results in an improvement of its performance. Further,
CODEC IV and V are able to operate at very low bit rates (< 0.15 bpp). In
CODEC VI and CODEC VII, Overlapped Block Disparity Compensation (OBDC) is
used with and without the need to code disparity vectors. Our experimental results
showed that no significant coding gains could be obtained for these CODECs over
CODEC IV & V.
All proposed CODECs in this thesis are wavelet-based stereo image coding
algorithms that maximise the flexibility and benefits offered by wavelet transform
technology when applied to stereo imaging. In addition, the use of a baseline-JPEG
coding architecture would enable the easy adaptation of the proposed algorithms
within systems originally built for DCT-based coding. This is an important feature
that would be useful during an era where DCT-based technology is only slowly being
phased out to give way for DWT based compression technology.
In addition, this thesis proposed a stereo image coding algorithm that uses JPEG-2000
technology as the basic compression engine. The proposed CODEC, named RASTER,
is a rate scalable stereo image CODEC that has a unique ability to preserve the image
quality at binocular depth boundaries, which is an important requirement in the design
of stereo image CODEC. The experimental results have shown that the proposed
CODEC is able to achieve PSNR gains of up to 3.7 dB as compared to directly
transmitting the right frame using JPEG-2000.
Intelligent Side Information Generation in Distributed Video Coding
Distributed video coding (DVC) reverses the traditional coding paradigm of complex encoders allied with basic decoding to one where the computational cost is largely incurred by the decoder. This is attractive because the proven theoretical work of Wyner-Ziv (WZ) and Slepian-Wolf (SW) shows that the performance of such a system should be exactly the same as that of a conventional coder. Despite the solid theoretical foundations, current DVC qualitative and quantitative performance falls short of existing conventional coders, and crucial limitations remain. A key constraint governing DVC performance is the quality of side information (SI), a coarse representation of original video frames which are not available at the decoder. Techniques to generate SI have usually been based on linear motion compensated temporal interpolation (LMCTI), though these do not always produce satisfactory SI quality, especially in sequences exhibiting non-linear motion.
This thesis presents an intelligent higher order piecewise trajectory temporal interpolation (HOPTTI) framework for SI generation with original contributions that afford better SI quality in comparison to existing LMCTI-based approaches. The major elements in this framework are: (i) a cubic trajectory interpolation algorithm model that significantly improves the accuracy of motion vector estimations; (ii) an adaptive overlapped block motion compensation (AOBMC) model which reduces both blocking and overlapping artefacts in the SI emanating from the block matching algorithm; (iii) the development of an empirical mode switching algorithm; and (iv) an intelligent switching mechanism to construct SI by automatically selecting the best macroblock from the intermediate SI generated by the HOPTTI and AOBMC algorithms. Rigorous analysis and evaluation confirm that significant quantitative and perceptual improvements in SI quality are achieved with the new framework.
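The difference between linear and higher-order trajectory interpolation can be illustrated with a one-dimensional example. The sketch assumes that a block position is known at four key frames and uses a cubic Lagrange polynomial through those positions; this is an illustrative stand-in, not the HOPTTI algorithm itself, and the function names are hypothetical.

```python
def lagrange_interpolate(ts, ps, t):
    """Evaluate at time t the polynomial through the points (ts[i], ps[i]).
    With four points this is a cubic trajectory."""
    total = 0.0
    for i, (ti, pi) in enumerate(zip(ts, ps)):
        weight = 1.0
        for j, tj in enumerate(ts):
            if j != i:
                weight *= (t - tj) / (ti - tj)
        total += weight * pi
    return total


def linear_interpolate(t0, p0, t1, p1, t):
    """Position at time t on the straight line between two key frames."""
    return p0 + (p1 - p0) * (t - t0) / (t1 - t0)


# Accelerating motion p(t) = t^2 sampled at four key frames.
key_times = [-1, 0, 1, 2]
key_positions = [1, 0, 1, 4]

true_position = 0.5 ** 2
linear_estimate = linear_interpolate(0, 0, 1, 1, 0.5)
cubic_estimate = lagrange_interpolate(key_times, key_positions, 0.5)
print(true_position, linear_estimate, cubic_estimate)
```

For this accelerating motion the linear estimate at the midpoint is off by 0.25 position units, while the cubic trajectory recovers the true position, which is the kind of non-linear motion where LMCTI-based side information degrades.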