2 research outputs found
A Study on Frame Prediction Method based on Operation Probability Map
Frame prediction is a technique for reconstructing frames lost to damage or for generating new consecutive frames in a video. It is attracting attention as a key technology for autonomous vehicles and artificial-intelligence-based security systems, which require motion prediction of objects. Combined with deep learning, the technique has recently become more accurate, but practical application remains difficult because it requires large amounts of training data and computation. Existing deep-learning-based prediction models feed the frames they predict back into the generation of subsequent frames, so errors accumulate and prediction accuracy decreases over time. Therefore, in this paper, we propose a new frame prediction model based on an operation probability map, using a convolutional neural network (CNN), long short-term memory (LSTM), and a deconvolution neural network (DNN), to minimize unnecessary computation regions in the frame and to reduce prediction error.
The proposed model extracts the operation (motion) features of the frames through the CNN and LSTM and learns their patterns to generate the operation probability map. The map identifies the region of a frame in which motion occurs, and a new frame is generated through the DNN only in this region. Because the DNN is difficult to train, the generative adversarial nets (GAN) technique is applied to train it efficiently. To train and validate the proposed model, we used robot motion videos from which some frames were randomly removed, and compared the generated frames with the original frames using PSNR. As a result, the PSNR of the proposed frame prediction model is 35.16, up to 14.06 higher than the three other models compared. In addition, the decrease in PSNR as successive frames are generated is improved by 2 before the fourth frame and by 7 thereafter, an average improvement of 5.
Chapter 1 Introduction
Chapter 2 Related Works
2.1 Convolution Neural Network
2.2 Long Short-Term Memory
2.3 Generative Adversarial Nets
Chapter 3 The Proposed Prediction Model
3.1 Structure of the proposed model
3.2 Model for feature extraction and operation probability estimation
3.3 Model for generating and combining images
3.4 Model for learning of generative model
Chapter 4 Experiment and Result
4.1 Dataset for learning and testing
4.2 Analysis of experimental results
Chapter 5 Conclusion
Reference
Maste
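The abstract above describes generating new pixels only where the operation probability map indicates motion, then scoring the result against the original frame with PSNR. The following is a minimal sketch of those two steps, not the thesis's implementation. The function names, the 0.5 threshold, the 8-bit peak value of 255, and the tiny 2x2 grayscale frames are all assumptions made for illustration; the "generated" frame stands in for the output of the generative network.

```python
import math


def composite_frame(previous, generated, prob_map, threshold=0.5):
    """Take generated pixels where the probability map flags motion,
    and previous-frame pixels everywhere else (threshold is an assumption)."""
    return [
        [gen if prob >= threshold else prev
         for prev, gen, prob in zip(prev_row, gen_row, prob_row)]
        for prev_row, gen_row, prob_row in zip(previous, generated, prob_map)
    ]


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    squared_errors = [
        (r - t) ** 2
        for ref_row, test_row in zip(reference, test)
        for r, t in zip(ref_row, test_row)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 grayscale frames, for illustration only.
previous = [[10, 10], [10, 10]]
generated = [[0, 200], [0, 0]]          # stand-in for generator output
prob_map = [[0.1, 0.9], [0.2, 0.3]]     # motion expected in the top-right pixel
original = [[10, 190], [10, 10]]        # ground-truth frame

predicted = composite_frame(previous, generated, prob_map)
score = psnr(original, predicted)
```

Only the top-right pixel is taken from the generated frame here, so the static pixels contribute no error to the PSNR.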
Novel Motion Anchoring Strategies for Wavelet-based Highly Scalable Video Compression
This thesis investigates new motion anchoring strategies that are targeted at wavelet-based highly scalable video compression (WSVC). We depart from two practices that are deeply ingrained in existing video compression systems. Instead of the commonly used block motion, which has poor scalability attributes, we employ piecewise-smooth motion together with a highly scalable motion boundary description. The combination of this more “physical” motion description together with motion discontinuity information allows us to change the conventional strategy of anchoring motion at target frames to anchoring motion at reference frames, which improves motion inference across time.
In the proposed reference-based motion anchoring strategies, motion fields are mapped from reference to target frames, where they serve as prediction references; during this mapping process, disoccluded regions are readily discovered. Observing that motion discontinuities displace with foreground objects, we propose motion-discontinuity driven motion mapping operations that handle traditionally challenging regions around moving objects. The reference-based motion anchoring exposes an intricate connection between temporal frame interpolation (TFI) and video compression. When employed in a compression system, all anchoring strategies explored in this thesis perform TFI once all residual information is quantized to zero at a given temporal level. The interpolation performance is evaluated on both natural and synthetic sequences, where we show favourable comparisons with state-of-the-art TFI schemes.
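To make the mapping step concrete, here is a simplified 1-D sketch of carrying a motion field anchored at a reference frame over to the target frame. It assumes integer displacements and a hypothetical per-sample `is_foreground` flag that stands in for the thesis's motion-discontinuity reasoning; the actual mapping operations in the thesis are not reproduced here. Target positions that no reference sample lands on are reported as disoccluded.

```python
def map_motion_to_target(ref_motion, is_foreground):
    """Map integer 1-D motion from reference positions to target positions.

    Returns (target_motion, disoccluded). target_motion[x] is the displacement
    of the reference sample that landed on target position x, or None if no
    sample landed there. disoccluded[x] is True for those empty positions.
    """
    n = len(ref_motion)
    target_motion = [None] * n
    winner_is_foreground = [False] * n

    for x, displacement in enumerate(ref_motion):
        tx = x + displacement
        if not 0 <= tx < n:
            continue  # the sample moves out of the frame
        # When two samples collide, let the foreground sample win
        # (an assumption standing in for discontinuity-driven reasoning).
        if target_motion[tx] is None or (
            is_foreground[x] and not winner_is_foreground[tx]
        ):
            target_motion[tx] = displacement
            winner_is_foreground[tx] = is_foreground[x]

    disoccluded = [m is None for m in target_motion]
    return target_motion, disoccluded


# Hypothetical example: a two-sample foreground object at positions 2-3
# moves right by 2 over a static background.
ref_motion = [0, 0, 2, 2, 0, 0, 0, 0]
is_foreground = [False, False, True, True, False, False, False, False]

target_motion, disoccluded = map_motion_to_target(ref_motion, is_foreground)
```

In this example the positions the object vacates (2 and 3) receive no motion and are flagged as disoccluded, while the positions it moves onto keep the foreground displacement.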
We explore three reference-based motion anchoring strategies. In the first one, the motion anchoring is “flipped” with respect to a hierarchical B-frame structure. We develop an analytical model to determine the weights of the different spatio-temporal subbands, and assess the suitability and benefits of this reference-based WSVC for (highly scalable) video compression. Reduced motion coding cost and improved frame prediction, especially around moving objects, result in improved rate-distortion performance compared to a target-based WSVC. As the thesis evolves, the motion anchoring is progressively simplified to one where all motion is anchored at one base frame; this central motion organization facilitates the incorporation of higher-order motion models, which improve the prediction performance in regions following motion with non-constant velocity
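As a rough illustration of why a higher-order motion model helps under non-constant velocity, the sketch below compares a constant-velocity prediction with a second-order (velocity plus acceleration) prediction, both anchored at a base frame at time 0. The quadratic trajectory model, the function names, and the sample positions are assumptions for illustration; the abstract does not specify the thesis's actual motion models.

```python
def fit_second_order(p_prev, p_base, p_next):
    """Estimate velocity and acceleration at the base frame (t = 0) from
    positions observed at t = -1, 0, and +1 using central differences."""
    velocity = (p_next - p_prev) / 2.0
    acceleration = p_next - 2.0 * p_base + p_prev
    return velocity, acceleration


def predict_position(p_base, velocity, t, acceleration=0.0):
    """Position at time t relative to the base frame.
    With acceleration=0 this reduces to a constant-velocity model."""
    return p_base + velocity * t + 0.5 * acceleration * t * t


# Hypothetical accelerating object: p(t) = 4 + t + t^2,
# observed at t = -1, 0, +1.
p_prev, p_base, p_next = 4.0, 4.0, 6.0
true_midpoint = 4.0 + 0.5 + 0.5 ** 2  # true position at t = 0.5

# Constant-velocity model using only the base and next frames.
linear_velocity = p_next - p_base
linear_estimate = predict_position(p_base, linear_velocity, 0.5)

# Second-order model anchored at the base frame.
velocity, acceleration = fit_second_order(p_prev, p_base, p_next)
quadratic_estimate = predict_position(p_base, velocity, 0.5, acceleration)
```

For this trajectory the constant-velocity estimate at t = 0.5 overshoots the true position, while the second-order estimate matches it, because the trajectory is exactly quadratic.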