701 research outputs found

    Towards Hybrid-Optimization Video Coding

    Video coding is essentially a mathematical optimization problem over rate and distortion. Two popular frameworks have been developed to solve it: block-based hybrid video coding and end-to-end learned video coding. Rethinking video coding from the perspective of optimization, we find that these two frameworks represent two directions of optimization. Block-based hybrid coding is the discrete optimization solution, because its mutually unrelated coding modes are mathematically discrete. It searches for the best among multiple starting points (i.e., modes), but the search is not efficient enough. End-to-end learned coding, on the other hand, is the continuous optimization solution, because gradient descent operates on a continuous function. It optimizes a group of model parameters efficiently with a numerical algorithm, but with only one starting point it easily falls into a local optimum. To solve the optimization problem better, we propose to regard video coding as a hybrid of discrete and continuous optimization and to use both search and a numerical algorithm to solve it. Our idea is to provide multiple discrete starting points in the global space and to find the local optimum around each point efficiently with the numerical algorithm. Finally, we search for the global optimum among those local optima. Guided by this hybrid optimization idea, we design a hybrid-optimization video coding framework that is built entirely on continuous deep networks and also contains some discrete modes. We conduct a comprehensive set of experiments. Compared to the continuous optimization framework, our method outperforms pure learned video coding methods. Compared to the discrete optimization framework, our method achieves performance comparable to the HEVC reference software HM16.10 in PSNR.
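    The hybrid idea in this abstract (several discrete starting points, a numerical descent from each, then a search over the resulting local optima) can be illustrated with a toy one-dimensional sketch. Everything here is an assumption made for illustration: the cost function, the step size, and the names `objective`, `descend`, and `hybrid_search` are hypothetical and are not taken from the paper, which applies the idea to deep video coding networks.

```python
import math


def objective(x):
    # Toy non-convex cost standing in for a rate-distortion cost (illustrative only).
    return math.sin(3.0 * x) + 0.1 * x * x


def gradient(x):
    # Analytic derivative of the toy cost above.
    return 3.0 * math.cos(3.0 * x) + 0.2 * x


def descend(x0, lr=0.01, steps=2000):
    # Continuous part: plain gradient descent from a single starting point.
    x = x0
    for _ in range(steps):
        x -= lr * gradient(x)
    return x


def hybrid_search(starts):
    # Discrete part: refine every starting point, then keep the best local optimum.
    candidates = [descend(x0) for x0 in starts]
    return min(candidates, key=objective)


# One starting point can get stuck in a poor local optimum ...
single = descend(4.0)
# ... while several starting points let the search pick a better one.
best = hybrid_search([-4.0, -2.0, 0.0, 2.0, 4.0])
```

    In this toy setting, `objective(best)` is never worse than `objective(single)`, because the single starting point is also one of the candidates.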

    Multiple Vector-Based MEMC and Deep CNN for Video Frame Interpolation

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, 2019. 2. Hyuk-Jae Lee. Block-based hierarchical motion estimation is widely used and succeeds in generating high-quality interpolation. However, it still fails to estimate the motion of small objects when the background region moves in a different direction. This is because the motion of small objects is neglected by the down-sampling and over-smoothing operations at the top level of the image pyramid in the maximum a posteriori (MAP) method. Consequently, the motion vector of small objects cannot be detected at the bottom level, and the small objects often appear deformed in the interpolated frame. This thesis proposes a novel algorithm that preserves the motion vector of small objects by adding a secondary motion vector candidate that represents their movement. This additional candidate is always propagated from the top to the bottom layer of the image pyramid. Experimental results demonstrate that the intermediate frame interpolated by the proposed algorithm has significantly better visual quality than conventional MAP-based frame interpolation. In motion-compensated frame interpolation, a repetition pattern in an image makes it difficult to derive an accurate motion vector, because multiple similar local minima exist in the search space of the matching cost for motion estimation. To improve the accuracy of motion estimation in a repetition region, this thesis attempts a semi-global approach that exploits both the local and the global characteristics of the region. A histogram of the motion vector candidates is built using a voter-based voting system, which is more reliable than an elector-based voting system. Experimental results demonstrate that the proposed method significantly outperforms the previous local approach in terms of both objective peak signal-to-noise ratio (PSNR) and subjective visual quality.
In video frame interpolation, or motion-compensated frame rate up-conversion (MC-FRUC), motion compensation along unidirectional motion trajectories directly causes overlap and hole issues. To solve these issues, this research presents a new algorithm for bidirectional motion-compensated frame interpolation. First, the proposed method generates bidirectional motion vectors from two unidirectional motion vector fields (forward and backward) obtained from unidirectional motion estimation. This is done by projecting the forward and backward motion vectors into the interpolated frame. A comprehensive metric, an extension of the distance between a projected block and an interpolated block, is proposed to compute the weighting coefficients when an interpolated block has multiple projected blocks. Holes are filled using a vector median filter over the available non-hole neighboring blocks. The proposed method outperforms existing MC-FRUC methods and removes block artifacts significantly. Video frame interpolation with a deep convolutional neural network (CNN) is also investigated in this thesis. Optical flow estimation and video frame interpolation form a chicken-and-egg problem in which each affects the other. This thesis presents a stack of networks trained to estimate intermediate optical flows from a first synthesized intermediate frame; the final interpolated frame is then generated by a second synthesis network whose input stacks that first frame with two frames warped by the learned intermediate optical flows. The primary benefit is that this joins the two problems in one comprehensive framework that is trained jointly, using both analysis-by-synthesis for optical flow estimation and, conversely, CNN-kernel-based synthesis-by-analysis.
The proposed network is the first attempt to bridge two branches of previous approaches, optical-flow-based synthesis and CNN-kernel-based synthesis, in one comprehensive network. Experiments on various challenging datasets all show that the proposed network outperforms state-of-the-art methods by significant margins for video frame interpolation, and that the estimated optical flows are accurate for challenging movements. The proposed deep video frame interpolation network is also applied as post-processing to improve the coding efficiency of the state-of-the-art video compression standard, HEVC/H.265, and experimental results confirm the efficiency of the proposed network.

    Table of contents:
    Chapter 1. Introduction
        1.1. Hierarchical Motion Estimation of Small Objects
        1.2. Motion Estimation of a Repetition Pattern Region
        1.3. Motion-Compensated Frame Interpolation
        1.4. Video Frame Interpolation with Deep CNN
        1.5. Outline of the Thesis
    Chapter 2. Previous Works
        2.1. Previous Works on Hierarchical Block-Based Motion Estimation
            2.1.1. Maximum a Posteriori (MAP) Framework
            2.1.2. Hierarchical Motion Estimation
        2.2. Previous Works on Motion Estimation for a Repetition Pattern Region
        2.3. Previous Works on Motion Compensation
        2.4. Previous Works on Video Frame Interpolation with Deep CNN
    Chapter 3. Hierarchical Motion Estimation for Small Objects
        3.1. Problem Statement
        3.2. The Alternative Motion Vector of High Cost Pixels
        3.3. Modified Hierarchical Motion Estimation
        3.4. Framework of the Proposed Algorithm
        3.5. Experimental Results
            3.5.1. Performance Analysis
            3.5.2. Performance Evaluation
    Chapter 4. Semi-Global Accurate Motion Estimation for a Repetition Pattern Region
        4.1. Problem Statement
        4.2. Objective Function and Constraints
        4.3. Elector-Based Voting System
        4.4. Voter-Based Voting System
        4.5. Experimental Results
    Chapter 5. Multiple Motion Vectors Based Motion Compensation
        5.1. Problem Statement
        5.2. Adaptive Weighted Multiple Motion Vectors Based Motion Compensation
            5.2.1. One-to-Multiple Motion Vector Projection
            5.2.2. A Comprehensive Metric as the Extension of Distance
        5.3. Handling Hole Blocks
        5.4. Framework of the Proposed Motion Compensated Frame Interpolation
        5.5. Experimental Results
    Chapter 6. Video Frame Interpolation with a Stack of Deep CNN
        6.1. Problem Statement
        6.2. The Proposed Network for Video Frame Interpolation
            6.2.1. A Stack of Synthesis Networks
            6.2.2. Intermediate Optical Flow Derivation Module
            6.2.3. Warping Operations
            6.2.4. Training and Loss Function
            6.2.5. Network Architecture
            6.2.6. Experimental Results
                6.2.6.1. Frame Interpolation Evaluation
                6.2.6.2. Ablation Experiments
        6.3. Extension for Quality Enhancement for Compressed Videos Task
        6.4. Extension for Improving the Coding Efficiency of HEVC Based Low Bitrate Encoder
    Chapter 7. Conclusion
    References
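
    The thesis abstract says that hole blocks are filled with a vector median filter over the available non-hole neighbors. The sketch below shows a generic vector median filter, not the thesis implementation. The 3x3 neighborhood, the Euclidean distance, the use of `None` to mark a hole, and the names `vector_median` and `fill_holes` are all assumptions made for illustration.

```python
import math


def vector_median(vectors):
    # The vector median is the input vector whose summed Euclidean distance
    # to all the other input vectors is smallest.
    if not vectors:
        raise ValueError("vector_median needs at least one vector")

    def total_distance(v):
        return sum(math.hypot(v[0] - w[0], v[1] - w[1]) for w in vectors)

    return min(vectors, key=total_distance)


def fill_holes(mv_field):
    # mv_field: 2D list of (dx, dy) motion vectors, with None marking a hole block.
    rows, cols = len(mv_field), len(mv_field[0])
    filled = [row[:] for row in mv_field]
    for r in range(rows):
        for c in range(cols):
            if mv_field[r][c] is not None:
                continue
            # Collect the non-hole blocks in the assumed 3x3 neighborhood.
            neighbors = [
                mv_field[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if mv_field[rr][cc] is not None
            ]
            if neighbors:  # a hole with no usable neighbor stays a hole
                filled[r][c] = vector_median(neighbors)
    return filled


field = [
    [(1, 0), (1, 0), (1, 0)],
    [(1, 0), None, (8, 8)],
    [(1, 0), (1, 0), (1, 1)],
]
result = fill_holes(field)  # the center hole takes the dominant vector (1, 0)
```

    Unlike a component-wise mean, the vector median always returns one of the neighboring vectors, so an outlier such as `(8, 8)` does not drag the filled value away from the dominant motion.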

    Light Field Denoising via Anisotropic Parallax Analysis in a CNN Framework

    Light field (LF) cameras provide perspective information of scenes by taking directional measurements of the focusing light rays. The raw outputs are usually dark with additive camera noise, which impedes subsequent processing and applications. We propose a novel LF denoising framework based on anisotropic parallax analysis (APA). Two convolutional neural networks are jointly designed for the task. First, the structural parallax synthesis network predicts the parallax details for the entire LF based on a set of anisotropic parallax features. These novel features can efficiently capture the high-frequency perspective components of an LF from noisy observations. Second, the view-dependent detail compensation network restores non-Lambertian variation to each LF view by involving view-specific spatial energies. Extensive experiments show that the proposed APA LF denoiser provides much better denoising performance than state-of-the-art methods in terms of visual quality and preservation of parallax details.

    Multi-Frame Quality Enhancement for Compressed Video

    The past few years have witnessed great success in applying deep learning to enhance the quality of compressed images and video. Existing approaches mainly focus on enhancing the quality of a single frame, ignoring the similarity between consecutive frames. In this paper, we observe that heavy quality fluctuation exists across compressed video frames, so low-quality frames can be enhanced using neighboring high-quality frames; we call this Multi-Frame Quality Enhancement (MFQE). Accordingly, this paper proposes an MFQE approach for compressed video, as a first attempt in this direction. In our approach, we first develop a Support Vector Machine (SVM) based detector to locate Peak Quality Frames (PQFs) in compressed video. Then, a novel Multi-Frame Convolutional Neural Network (MF-CNN) is designed to enhance the quality of compressed video, taking a non-PQF and its nearest two PQFs as input. The MF-CNN compensates motion between the non-PQF and the PQFs through the Motion Compensation subnet (MC-subnet). Subsequently, the Quality Enhancement subnet (QE-subnet) reduces compression artifacts of the non-PQF with the help of its nearest PQFs. Finally, the experiments validate the effectiveness and generality of our MFQE approach in advancing the state of the art in quality enhancement of compressed video. The code of our MFQE approach is available at https://github.com/ryangBUAA/MFQE.git
    Comment: to appear in CVPR 201
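
    The pairing step described in this abstract, where each non-PQF is matched with its nearest PQFs, can be sketched as follows. This is only a sketch under stated assumptions: it takes the PQF labels as already given (the paper obtains them from an SVM-based detector) and reads "nearest two PQFs" as the closest PQF before and the closest PQF after the frame. The function name `nearest_pqfs` is hypothetical and does not come from the MFQE code.

```python
def nearest_pqfs(is_pqf):
    # is_pqf: list of booleans, one per frame, True where the frame is a PQF.
    # Returns {non_pqf_index: (previous_pqf_index, next_pqf_index)};
    # an entry is None when no PQF exists on that side.
    n = len(is_pqf)

    prev_pqf = [None] * n
    last_seen = None
    for i in range(n):
        prev_pqf[i] = last_seen  # recorded before updating, so a PQF is not its own neighbor
        if is_pqf[i]:
            last_seen = i

    next_pqf = [None] * n
    upcoming = None
    for i in range(n - 1, -1, -1):
        next_pqf[i] = upcoming
        if is_pqf[i]:
            upcoming = i

    return {i: (prev_pqf[i], next_pqf[i]) for i in range(n) if not is_pqf[i]}


labels = [True, False, False, True, False, True, False]
pairs = nearest_pqfs(labels)
# pairs == {1: (0, 3), 2: (0, 3), 4: (3, 5), 6: (5, None)}
```

    The last frame in this example has no following PQF, so a real pipeline would need a fallback for such boundary frames; the abstract does not say how that case is handled.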