4,260 research outputs found
Recommended from our members
Wyner-Ziv side information generation using a higher order piecewise trajectory temporal interpolation algorithm
Distributed video coding (DVC) reverses the traditional coding paradigm of complex encoders allied with basic decoding, to one where the computational cost is largely incurred by the decoder. This enables low-cost, resource-poor sensors to be used at the transmitter in various applications including multi-sensor surveillance. A key constraint governing DVC performance is the quality of side information (SI), a coarse representation of original video frames which are not available at the decoder. Techniques to generate SI have generally been based on linear temporal interpolation, though these do not always produce satisfactory SI quality especially in sequences exhibiting asymmetric (non-linear) motion. This paper presents a higher-order piecewise trajectory temporal interpolation (HOPTTI) algorithm for SI generation that quantitatively and perceptually affords better SI quality in comparison to existing temporal interpolation-based approaches
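To make the contrast between linear and higher-order temporal interpolation concrete, here is a minimal sketch. It is not the HOPTTI algorithm itself. It assumes a block's position is known in four key frames (times 0, 2, 4, 6) and estimates its position at time 3 twice: once as the linear midpoint of the two nearest key frames, and once from a cubic (Lagrange) trajectory through all four. The function names and the sample trajectory are hypothetical.

```python
def linear_midpoint(p_prev, p_next):
    """Position halfway between two key-frame positions (linear motion model)."""
    return tuple((a + b) / 2.0 for a, b in zip(p_prev, p_next))


def lagrange_position(times, points, t):
    """Evaluate the polynomial trajectory through (times[i], points[i]) at time t."""
    dims = len(points[0])
    result = [0.0] * dims
    for i, (ti, pi) in enumerate(zip(times, points)):
        # Lagrange basis weight of key frame i at time t.
        weight = 1.0
        for j, tj in enumerate(times):
            if j != i:
                weight *= (t - tj) / (ti - tj)
        for d in range(dims):
            result[d] += weight * pi[d]
    return tuple(result)


# Hypothetical accelerating block: x = t**2, y = t (non-linear motion in x).
key_times = [0, 2, 4, 6]
key_points = [(float(t * t), float(t)) for t in key_times]

linear_est = linear_midpoint(key_points[1], key_points[2])  # uses frames 2 and 4 only
cubic_est = lagrange_position(key_times, key_points, 3)     # uses all four key frames
true_pos = (9.0, 3.0)

print("linear:", linear_est, "cubic:", cubic_est, "true:", true_pos)
```

On this toy trajectory the linear estimate overshoots in x because it ignores acceleration, while the higher-order fit recovers the true position. Real sequences would need motion estimation to supply the key-frame positions.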
Detection of dirt impairments from archived film sequences : survey and evaluations
Film dirt is the most commonly encountered artifact in archive restoration applications. Since dirt usually appears as a temporally impulsive event, motion-compensated interframe processing is widely applied for its detection. However, motion-compensated prediction requires a high degree of complexity and can be unreliable when motion estimation fails. Consequently, many techniques using spatial or spatiotemporal filtering without motion have also been proposed as alternatives. A comprehensive survey and evaluation of existing methods is presented, in which both qualitative and quantitative performances are compared in terms of accuracy, robustness, and complexity. After analyzing these algorithms and identifying their limitations, we conclude with guidance on choosing among these algorithms and with promising directions for future research
Multiple Vector-Based MEMC and Deep CNN for Video Frame Interpolation
Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, February 2019. Advisor: Hyuk-Jae Lee.
Block-based hierarchical motion estimation is widely used and is successful in generating high-quality interpolation. However, it still fails to estimate the motion of small objects when the background region moves in a different direction. This is because the motion of small objects is neglected by the down-sampling and over-smoothing operations at the top level of the image pyramid in the maximum a posteriori (MAP) method. Consequently, the motion vector of a small object cannot be detected at the bottom level, and small objects therefore often appear deformed in the interpolated frame. This thesis proposes a novel algorithm that preserves the motion vector of small objects by adding a secondary motion vector candidate that represents their movement. This additional candidate is always propagated from the top to the bottom layer of the image pyramid. Experimental results demonstrate that the intermediate frame interpolated by the proposed algorithm has significantly better visual quality than that produced by conventional MAP-based frame interpolation.
In motion-compensated frame interpolation, a repetition pattern in an image makes it difficult to derive an accurate motion vector because multiple similar local minima exist in the search space of the matching cost for motion estimation. To improve the accuracy of motion estimation in a repetition region, this thesis attempts a semi-global approach that exploits both the local and global characteristics of the region. A histogram of the motion vector candidates is built using a voter-based voting system, which is more reliable than an elector-based voting system. Experimental results demonstrate that the proposed method significantly outperforms the previous local approach in terms of both objective peak signal-to-noise ratio (PSNR) and subjective visual quality.
In video frame interpolation, or motion-compensated frame rate up-conversion (MC-FRUC), motion compensation along unidirectional motion trajectories directly causes overlap and hole problems. To solve these problems, this research presents a new algorithm for bidirectional motion-compensated frame interpolation. First, the proposed method generates bidirectional motion vectors from two unidirectional motion vector fields (forward and backward) obtained from unidirectional motion estimation. This is done by projecting the forward and backward motion vectors into the interpolated frame. A comprehensive metric, an extension of the distance between a projected block and an interpolated block, is proposed to compute the weighting coefficients when an interpolated block has multiple projected blocks. Holes are filled using a vector median filter over the available non-hole neighboring blocks. The proposed method outperforms existing MC-FRUC methods and significantly reduces block artifacts.
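The hole-filling step can be illustrated with a small sketch of a vector median filter. This is an assumption-laden illustration rather than the thesis's implementation: holes are represented as `None` in a 2-D block motion-vector field, the neighborhood is taken to be the 8 surrounding blocks, and the names `vector_median` and `fill_holes` are hypothetical.

```python
import math


def vector_median(vectors):
    """Return the vector whose summed Euclidean distance to all the others is smallest."""
    if not vectors:
        raise ValueError("need at least one motion vector")

    def total_distance(v):
        return sum(math.dist(v, other) for other in vectors)

    return min(vectors, key=total_distance)


def fill_holes(mv_field):
    """Fill None entries of a block motion-vector field from non-hole 8-neighbours."""
    rows, cols = len(mv_field), len(mv_field[0])
    filled = [row[:] for row in mv_field]
    for r in range(rows):
        for c in range(cols):
            if mv_field[r][c] is not None:
                continue
            neighbours = [
                mv_field[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c) and mv_field[rr][cc] is not None
            ]
            if neighbours:  # a hole with no usable neighbour is left as None
                filled[r][c] = vector_median(neighbours)
    return filled


# Hypothetical field: the centre block is a hole and (9, -7) is an outlier vector.
field = [
    [(2, 0), (2, 0), (2, 1)],
    [(2, 0), None, (9, -7)],
    [(1, 0), (2, 0), (2, 0)],
]
result = fill_holes(field)
print(result[1][1])
```

Unlike a component-wise mean, the vector median always returns one of the neighboring vectors, so a single outlier such as `(9, -7)` does not pull the filled vector away from the dominant motion.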
Video frame interpolation with a deep convolutional neural network (CNN) is also investigated in this thesis. Optical flow estimation and video frame interpolation form a chicken-and-egg problem in which each affects the other. This thesis presents a stack of networks: intermediate optical flows are estimated from a first synthesized intermediate frame, and the final interpolated frame is then generated by a second synthesis network whose input stacks that first frame with the two frames warped by the learned intermediate optical flows. The primary benefit is that the two problems are joined in one comprehensive framework that is trained jointly, using an analysis-by-synthesis technique for optical flow estimation and, conversely, CNN-kernel-based synthesis-by-analysis. The proposed network is the first attempt to bridge the two branches of previous approaches, optical-flow-based synthesis and CNN-kernel-based synthesis, in a single network. Experiments on various challenging datasets all show that the proposed network outperforms state-of-the-art methods by significant margins for video frame interpolation and that the estimated optical flows are accurate for challenging movements. The proposed deep video frame interpolation network is also applied as post-processing to improve the coding efficiency of the state-of-the-art video compression standard, HEVC/H.265, and experimental results demonstrate the efficiency of the proposed network.
Abstract
Table of Contents
List of Tables
List of Figures
Chapter 1. Introduction
1.1. Hierarchical Motion Estimation of Small Objects
1.2. Motion Estimation of a Repetition Pattern Region
1.3. Motion-Compensated Frame Interpolation
1.4. Video Frame Interpolation with Deep CNN
1.5. Outline of the Thesis
Chapter 2. Previous Works
2.1. Previous Works on Hierarchical Block-Based Motion Estimation
2.1.1. Maximum a Posterior (MAP) Framework
2.1.2. Hierarchical Motion Estimation
2.2. Previous Works on Motion Estimation for a Repetition Pattern Region
2.3. Previous Works on Motion Compensation
2.4. Previous Works on Video Frame Interpolation with Deep CNN
Chapter 3. Hierarchical Motion Estimation for Small Objects
3.1. Problem Statement
3.2. The Alternative Motion Vector of High Cost Pixels
3.3. Modified Hierarchical Motion Estimation
3.4. Framework of the Proposed Algorithm
3.5. Experimental Results
3.5.1. Performance Analysis
3.5.2. Performance Evaluation
Chapter 4. Semi-Global Accurate Motion Estimation for a Repetition Pattern Region
4.1. Problem Statement
4.2. Objective Function and Constrains
4.3. Elector based Voting System
4.4. Voter based Voting System
4.5. Experimental Results
Chapter 5. Multiple Motion Vectors based Motion Compensation
5.1. Problem Statement
5.2. Adaptive Weighted Multiple Motion Vectors based Motion Compensation
5.2.1. One-to-Multiple Motion Vector Projection
5.2.2. A Comprehensive Metric as the Extension of Distance
5.3. Handling Hole Blocks
5.4. Framework of the Proposed Motion Compensated Frame Interpolation
5.5. Experimental Results
Chapter 6. Video Frame Interpolation with a Stack of Deep CNN
6.1. Problem Statement
6.2. The Proposed Network for Video Frame Interpolation
6.2.1. A Stack of Synthesis Networks
6.2.2. Intermediate Optical Flow Derivation Module
6.2.3. Warping Operations
6.2.4. Training and Loss Function
6.2.5. Network Architecture
6.2.6. Experimental Results
6.2.6.1. Frame Interpolation Evaluation
6.2.6.2. Ablation Experiments
6.3. Extension for Quality Enhancement for Compressed Videos Task
6.4. Extension for Improving the Coding Efficiency of HEVC based Low Bitrate Encoder
Chapter 7. Conclusion
References
Fusion of Global and Local Motion Estimation Using Foreground Objects for Distributed Video Coding
The side information in distributed video coding is estimated using the available decoded frames, and exploited for the decoding and reconstruction of other frames. The quality of the side information has a strong impact on the performance of distributed video coding. Here we propose a new approach that combines both global and local side information to improve coding performance. Since the background pixels in a frame are assigned to global estimation and the foreground objects to local estimation, one needs to estimate foreground objects in the side information using the backward and forward foreground objects. The background pixels are taken directly from the global side information. Specifically, elastic curves and local motion compensation are used to generate the foreground object masks in the side information. Experimental results show that, as far as the rate-distortion performance is concerned, the proposed approach can achieve a PSNR improvement of up to 1.39 dB for a GOP size of 2, and up to 4.73 dB for larger GOP sizes, with respect to the reference DISCOVER codec. A. ABOU-ELAILAH, F. DUFAUX, M. CAGNAZZO, and B. PESQUET-POPESCU are with the Signal and Image Processin
Towards Hybrid-Optimization Video Coding
Video coding is essentially a mathematical optimization problem over rate and distortion. To solve this complex optimization problem, two popular video coding frameworks have been developed: block-based hybrid video coding and end-to-end learned video coding. If we rethink video coding from the perspective of optimization, we find that the two existing frameworks represent two directions of optimization. Block-based hybrid coding represents the discrete optimization solution because its mutually unrelated coding modes are mathematically discrete. It searches for the best among multiple starting points (i.e., modes). However, the search is not efficient enough. End-to-end learned coding, on the other hand, represents the continuous optimization solution because gradient descent operates on a continuous function. It optimizes a group of model parameters efficiently with a numerical algorithm. However, limited to a single starting point, it easily falls into a local optimum. To better solve the optimization problem, we propose to regard video coding as a hybrid of the discrete and continuous optimization problems, and to use both search and a numerical algorithm to solve it. Our idea is to provide multiple discrete starting points in the global space and to efficiently find the local optimum around each point with a numerical algorithm. Finally, we search for the global optimum among those local optima. Guided by this hybrid optimization idea, we design a hybrid-optimization video coding framework, which is built entirely on continuous deep networks and also contains some discrete modes. We conduct a comprehensive set of experiments. Compared to the continuous optimization framework, our method outperforms pure learned video coding methods. Meanwhile, compared to the discrete optimization framework, our method achieves performance comparable to the HEVC reference software HM16.10 in PSNR.
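The hybrid idea (several discrete starting points, a numerical local optimization from each, then a search over the resulting local optima) can be sketched on a toy problem. This is only an illustration under stated assumptions: the one-dimensional function `cost` is a hypothetical stand-in for a rate-distortion objective, the starting points stand in for discrete coding modes, and plain gradient descent stands in for training a deep network.

```python
def cost(x):
    """Toy objective with two local minima; the one near x = -1 is the global minimum."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def cost_gradient(x):
    """Analytic derivative of cost."""
    return 4.0 * x * (x * x - 1.0) + 0.3


def gradient_descent(x0, learning_rate=0.01, steps=2000):
    """Continuous optimization: follow the gradient from a single starting point."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * cost_gradient(x)
    return x


def hybrid_optimize(starting_points):
    """Discrete search over the local optima reached from each starting point."""
    local_optima = [gradient_descent(x0) for x0 in starting_points]
    return min(local_optima, key=cost)


single_start = gradient_descent(2.0)           # gets stuck in the local minimum near x = +1
best_x = hybrid_optimize([-2.0, 0.5, 2.0])     # multiple "modes", then pick the best

print("single start:", single_start, cost(single_start))
print("hybrid:      ", best_x, cost(best_x))
```

A single starting point on the wrong side of the barrier converges to the worse minimum, while the multi-start search reaches the better one at the price of running the local optimizer once per starting point.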
Intelligent Side Information Generation in Distributed Video Coding
Distributed video coding (DVC) reverses the traditional coding paradigm of complex encoders allied with basic decoding to one where the computational cost is largely incurred by the decoder. This is attractive as the proven theoretical work of Wyner-Ziv (WZ) and Slepian-Wolf (SW) shows that the performance by such a system should be exactly the same as a conventional coder. Despite the solid theoretical foundations, current DVC qualitative and quantitative performance falls short of existing conventional coders and there remain crucial limitations. A key constraint governing DVC performance is the quality of side information (SI), a coarse representation of original video frames which are not available at the decoder. Techniques to generate SI have usually been based on linear motion compensated temporal interpolation (LMCTI), though these do not always produce satisfactory SI quality, especially in sequences exhibiting non-linear motion.
This thesis presents an intelligent higher order piecewise trajectory temporal interpolation (HOPTTI) framework for SI generation with original contributions that afford better SI quality in comparison to existing LMCTI-based approaches. The major elements in this framework are: (i) a cubic trajectory interpolation algorithm model that significantly improves the accuracy of motion vector estimations; (ii) an adaptive overlapped block motion compensation (AOBMC) model which reduces both blocking and overlapping artefacts in the SI emanating from the block matching algorithm; (iii) the development of an empirical mode switching algorithm; and (iv) an intelligent switching mechanism to construct SI by automatically selecting the best macroblock from the intermediate SI generated by HOPTTI and AOBMC algorithms. Rigorous analysis and evaluation confirm that significant quantitative and perceptual improvements in SI quality are achieved with the new framework.
A Review Paper on Video De-Interlacing Multiple Techniques
This paper presents video interlacing, de-interlacing, and a range of de-interlacing techniques. It focuses on the following techniques: Intra Field, Inter Field, Motion Adaptive, and Motion Compensated de-interlacing, and Spatio-Temporal Interpolation. De-interlaced video uses the full resolution of each scan, so it produces a high-quality image and removes the flicker problem. The techniques operate on the scan lines of an object: Intra Field techniques use pixels of the moving object, Inter Field works on stationary regions of the object, Motion Adaptive works on the edges of the object, and Motion Compensation focuses on the video sequence and brightness variation. The advantages of de-interlacing are better images of moving objects, no flicker, and high vertical resolution
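As a rough illustration of three of the technique families named above, the sketch below implements inter-field weaving, intra-field line averaging, and a simple motion-adaptive switch between them. It is a simplified sketch under assumptions, not the methods surveyed in the paper: fields are lists of pixel rows, the top field holds the even lines, motion is detected by a per-pixel difference against the previous frame's bottom field, and the function names and threshold are hypothetical.

```python
def weave(top_field, bottom_field):
    """Inter-field: interleave the lines of two fields (suits stationary regions)."""
    frame = []
    for top_line, bottom_line in zip(top_field, bottom_field):
        frame.append(list(top_line))
        frame.append(list(bottom_line))
    return frame


def line_average(field):
    """Intra-field: rebuild each missing line by averaging its vertical neighbours."""
    frame = []
    for i, line in enumerate(field):
        frame.append(list(line))
        below = field[i + 1] if i + 1 < len(field) else line  # repeat last line at the border
        frame.append([(a + b) / 2.0 for a, b in zip(line, below)])
    return frame


def motion_adaptive(top_field, bottom_field, prev_bottom_field, threshold=10):
    """Per pixel: weave where the bottom field is static, otherwise keep the interpolation."""
    woven = weave(top_field, bottom_field)
    frame = line_average(top_field)
    for i, (cur, prev) in enumerate(zip(bottom_field, prev_bottom_field)):
        for x, (c, p) in enumerate(zip(cur, prev)):
            if abs(c - p) <= threshold:  # static pixel: keep full vertical detail
                frame[2 * i + 1][x] = woven[2 * i + 1][x]
    return frame


# Hypothetical 2-line fields; one bottom-field pixel changed since the previous frame.
top = [[10, 10], [30, 30]]
bottom = [[25, 25], [40, 40]]
prev_bottom = [[25, 90], [40, 40]]

frame = motion_adaptive(top, bottom, prev_bottom)
print(frame)
```

Weaving preserves full vertical resolution but produces combing on moving content, while line averaging avoids combing at the cost of vertical detail, which is why the adaptive variant switches per pixel.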