A Unified Pyramid Recurrent Network for Video Frame Interpolation
Flow-guided synthesis provides a common framework for frame interpolation,
where optical flow is typically estimated by a pyramid network, and then
leveraged to guide a synthesis network to generate intermediate frames between
input frames. In this paper, we present UPR-Net, a novel Unified Pyramid
Recurrent Network for frame interpolation. Cast in a flexible pyramid
framework, UPR-Net exploits lightweight recurrent modules for both
bi-directional flow estimation and intermediate frame synthesis. At each
pyramid level, it leverages estimated bi-directional flow to generate
forward-warped representations for frame synthesis; across pyramid levels, it
enables iterative refinement of both the optical flow and the intermediate frame. In
particular, we show that our iterative synthesis can significantly improve the
robustness of frame interpolation on large motion cases. Despite being
extremely lightweight (1.7M parameters), UPR-Net achieves excellent performance
on a wide range of benchmarks. Code will be available soon.
Comment: arXiv admin note: text overlap with arXiv:2206.08572 by other authors.
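The coarse-to-fine recurrence described in this abstract can be sketched compactly. The following is a minimal illustration (not the authors' code) of a unified pyramid recurrence: a single flow module and a single synthesis module are shared across all pyramid levels, and both the bi-directional flow and the intermediate frame are iteratively refined from coarse to fine. The module names, channel widths, and the simple averaging initialization are assumptions made for this sketch, and the forward-warping (splatting) step of UPR-Net is omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FlowEstimator(nn.Module):
    """Predicts a residual update to the bi-directional flow at one pyramid level."""
    def __init__(self, ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + 3 + 4, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 4, 3, padding=1))        # 4 channels = two 2-channel flows

    def forward(self, im0, im1, flow):
        return flow + self.net(torch.cat([im0, im1, flow], dim=1))

class FrameSynthesizer(nn.Module):
    """Refines the intermediate frame from the inputs and the current estimate."""
    def __init__(self, ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 * 3, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 3, 3, padding=1))

    def forward(self, im0, im1, mid):
        return mid + self.net(torch.cat([im0, im1, mid], dim=1))

def interpolate(im0, im1, levels=3):
    flow_net, synth_net = FlowEstimator(), FrameSynthesizer()  # shared by all levels
    b, _, h, w = im0.shape
    flow, mid = torch.zeros(b, 4, h >> (levels - 1), w >> (levels - 1)), None
    for lvl in reversed(range(levels)):                        # coarse -> fine
        scale = 1.0 / 2 ** lvl
        i0 = F.interpolate(im0, scale_factor=scale, mode="bilinear", align_corners=False)
        i1 = F.interpolate(im1, scale_factor=scale, mode="bilinear", align_corners=False)
        if mid is None:
            mid = 0.5 * (i0 + i1)                              # crude coarsest-level guess
        else:                                                  # upsample previous estimates
            flow = 2.0 * F.interpolate(flow, size=i0.shape[-2:], mode="bilinear", align_corners=False)
            mid = F.interpolate(mid, size=i0.shape[-2:], mode="bilinear", align_corners=False)
        flow = flow_net(i0, i1, flow)                          # iterative flow refinement
        mid = synth_net(i0, i1, mid)                           # iterative frame refinement
    return mid

if __name__ == "__main__":
    im0, im1 = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
    print(interpolate(im0, im1).shape)                         # torch.Size([1, 3, 64, 64])
```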
Performance of Wavelet-based Multiresolution Motion Estimation for Inbetweening in Old Animated Films
This paper investigates the performance of wavelet-based multiresolution motion estimation (MRME) for inbetweening in old animated films using three different MRME schemes. The three schemes are: a coarse-to-fine method with a wavelet-based MRME, one of Zhang's MRMEs, and an MRME in the spatial domain. To compare the performance of these MRME schemes, two video sequences were used in a simulation. The experimental results show that the coarse-to-fine method performed better than Zhang's MRME and the MRME in the spatial domain. The evaluation results for a block size of 9x9 indicate that the coarse-to-fine method achieved an average peak signal-to-noise ratio (PSNR) of 23.48 dB for the first sequence and 29.84 dB for the second sequence.
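The coarse-to-fine scheme evaluated above is essentially hierarchical block matching: motion is estimated on a heavily downsampled version of the frames, then propagated and refined at each finer level. Below is a minimal spatial-domain sketch of that idea together with the PSNR metric used for evaluation; the wavelet-subband variant that performed best in the paper would run the same loop on subband coefficients. The block size, search range, and synthetic test data are illustrative choices, not the paper's settings (the paper reports results for 9x9 blocks).

```python
import numpy as np

def psnr(ref, est, peak=255.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((np.asarray(ref, float) - np.asarray(est, float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

def block_match(prev, curr, block=8, search=4, init=None):
    """Full-search block matching; `init` carries motion propagated from a coarser level."""
    h, w = curr.shape
    mv = np.zeros((h // block, w // block, 2), dtype=int)
    for by in range(h // block):
        for bx in range(w // block):
            y0, x0 = by * block, bx * block
            tgt = curr[y0:y0 + block, x0:x0 + block]
            py, px = (init[by, bx] if init is not None else (0, 0))
            best, best_mv = np.inf, (0, 0)
            for dy in range(py - search, py + search + 1):
                for dx in range(px - search, px + search + 1):
                    ys, xs = y0 + dy, x0 + dx
                    if ys < 0 or xs < 0 or ys + block > h or xs + block > w:
                        continue
                    sad = np.abs(tgt - prev[ys:ys + block, xs:xs + block]).sum()
                    if sad < best:
                        best, best_mv = sad, (dy, dx)
            mv[by, bx] = best_mv
    return mv

def coarse_to_fine(prev, curr, levels=3, block=8, search=4):
    """Estimate motion on a dyadic pyramid, refining level by level (coarse to fine)."""
    pyramid = [(prev, curr)]
    for _ in range(levels - 1):
        prev, curr = prev[::2, ::2], curr[::2, ::2]             # simple decimation
        pyramid.append((prev, curr))
    mv = None
    for p, c in reversed(pyramid):
        if mv is not None:                                      # propagate and rescale motion
            mv = 2 * np.repeat(np.repeat(mv, 2, axis=0), 2, axis=1)
        mv = block_match(p, c, block, search, mv)
    return mv

if __name__ == "__main__":
    yy, xx = np.mgrid[0:64, 0:64]
    prev = 127.0 + 100.0 * np.sin(xx / 5.0) * np.cos(yy / 7.0)  # smooth synthetic frame
    curr = np.roll(prev, (3, -2), axis=(0, 1))                  # known global motion
    mv = coarse_to_fine(prev, curr)
    print("motion of a central block:", mv[4, 4])               # should recover (-3, 2)
    print("PSNR of frame repetition:", round(psnr(curr, prev), 2), "dB")
```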
The curvelet transform for image denoising
We describe approximate digital implementations of two new mathematical transforms, namely, the ridgelet transform and the curvelet transform. Our implementations offer exact reconstruction, stability against perturbations, ease of implementation, and low computational complexity. A central tool is Fourier-domain computation of an approximate digital Radon transform. We introduce a very simple interpolation in the Fourier space which takes Cartesian samples and yields samples on a rectopolar grid, which is a pseudo-polar sampling set based on a concentric squares geometry. Despite the crudeness of our interpolation, the visual performance is surprisingly good. Our ridgelet transform applies to the Radon transform a special overcomplete wavelet pyramid whose wavelets have compact support in the frequency domain. Our curvelet transform uses our ridgelet transform as a component step, and implements curvelet subbands using a filter bank of à trous wavelet filters. Our philosophy throughout is that transforms should be overcomplete, rather than critically sampled. We apply these digital transforms to the denoising of some standard images embedded in white noise. In the tests reported here, simple thresholding of the curvelet coefficients is very competitive with "state of the art" techniques based on wavelets, including thresholding of decimated or undecimated wavelet transforms and also including tree-based Bayesian posterior mean methods. Moreover, the curvelet reconstructions exhibit higher perceptual quality than wavelet-based reconstructions, offering visually sharper images and, in particular, higher quality recovery of edges and of faint linear and curvilinear features. Existing theory for curvelet and ridgelet transforms suggests that these new approaches can outperform wavelet methods in certain image reconstruction problems. The empirical results reported here are in encouraging agreement with this theory.
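The denoising recipe in this abstract is: transform, threshold the coefficients, invert. A faithful digital curvelet transform (the Radon/ridgelet pyramid with à trous filter banks) is far too long to show here, so the sketch below substitutes an ordinary decimated wavelet transform from PyWavelets purely to illustrate the coefficient-thresholding step; the wavelet family, decomposition depth, and k*sigma threshold are illustrative assumptions, not the paper's settings.

```python
import numpy as np
import pywt

def denoise_by_thresholding(noisy, sigma, k=3.0, wavelet="db4", level=3):
    """Transform, hard-threshold the detail coefficients at k*sigma, invert."""
    coeffs = pywt.wavedec2(noisy, wavelet, level=level)
    out = [coeffs[0]]                                  # keep the coarse approximation
    for detail in coeffs[1:]:                          # threshold each detail subband
        out.append(tuple(pywt.threshold(c, k * sigma, mode="hard") for c in detail))
    rec = pywt.waverec2(out, wavelet)
    return rec[:noisy.shape[0], :noisy.shape[1]]       # crop any reconstruction padding

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = np.zeros((128, 128))
    clean[32:96, 32:96] = 1.0                          # toy piecewise-constant image
    sigma = 0.1
    noisy = clean + sigma * rng.normal(size=clean.shape)
    denoised = denoise_by_thresholding(noisy, sigma)
    mse = lambda a, b: float(np.mean((a - b) ** 2))
    print(f"MSE noisy: {mse(clean, noisy):.4f}  denoised: {mse(clean, denoised):.4f}")
```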
H-VFI: Hierarchical Frame Interpolation for Videos with Large Motions
Capitalizing on the rapid development of neural networks, recent video frame
interpolation (VFI) methods have achieved notable improvements. However, they
still fall short for real-world videos containing large motions. Complex
deformation and/or occlusion caused by large motions make it an extremely
difficult problem in video frame interpolation. In this paper, we propose a
simple yet effective solution, H-VFI, to deal with large motions in video frame
interpolation. H-VFI contributes a hierarchical video interpolation transformer
(HVIT) to learn a deformable kernel in a coarse-to-fine strategy over multiple
scales. The learnt deformable kernel is then used to convolve the input frames
and predict the interpolated frame. Starting from the smallest scale, H-VFI
successively updates the deformable kernel with a residual, based on previously
predicted kernels, intermediate interpolated results, and hierarchical features
from the transformer. A bias and masks used to refine the final output are then
predicted by a transformer block from the interpolated results. The advantage of such a
progressive approximation is that the large motion frame interpolation problem
can be decomposed into several relatively simpler sub-tasks, which enables
very accurate final predictions. Another noteworthy contribution of our paper
is a large-scale, high-quality dataset, YouTube200K, which
contains videos depicting a great variety of scenarios captured at high
resolution and high frame rate. Extensive experiments on multiple frame
interpolation benchmarks validate that H-VFI outperforms existing
state-of-the-art methods, especially for videos with large motions.
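The progressive approximation described above can be illustrated with a toy coarse-to-fine loop: a per-pixel interpolation kernel is predicted at the smallest scale and updated by residuals at each finer scale before being applied to the input frames. The sketch below uses an ordinary adaptive (non-deformable) kernel and a tiny CNN in place of the hierarchical transformer, and it omits the bias/mask refinement; all names and sizes are illustrative assumptions, not the H-VFI architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

K = 5  # size of the per-pixel interpolation kernel

class KernelPredictor(nn.Module):
    """Predicts a residual update to the per-pixel kernels of both input frames."""
    def __init__(self, ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6 + 2 * K * K, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 2 * K * K, 3, padding=1))

    def forward(self, im0, im1, kernels):
        return kernels + self.net(torch.cat([im0, im1, kernels], dim=1))

def apply_kernels(frame, kernels):
    """Convolve every pixel of `frame` with its own K x K kernel."""
    b, c, h, w = frame.shape
    patches = F.unfold(frame, K, padding=K // 2).view(b, c, K * K, h, w)
    weights = F.softmax(kernels, dim=1).unsqueeze(1)           # (b, 1, K*K, h, w)
    return (patches * weights).sum(dim=2)

def interpolate(im0, im1, scales=(0.25, 0.5, 1.0)):
    predictor = KernelPredictor()                              # shared across scales
    kernels = None
    for s in scales:                                           # coarse -> fine
        i0 = F.interpolate(im0, scale_factor=s, mode="bilinear", align_corners=False)
        i1 = F.interpolate(im1, scale_factor=s, mode="bilinear", align_corners=False)
        if kernels is None:                                    # start at the smallest scale
            kernels = torch.zeros(i0.shape[0], 2 * K * K, *i0.shape[-2:])
        else:                                                  # upsample previous estimate
            kernels = F.interpolate(kernels, size=i0.shape[-2:], mode="bilinear", align_corners=False)
        kernels = predictor(i0, i1, kernels)                   # residual update at this scale
    k0, k1 = kernels.chunk(2, dim=1)
    return 0.5 * (apply_kernels(im0, k0) + apply_kernels(im1, k1))

if __name__ == "__main__":
    im0, im1 = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
    print(interpolate(im0, im1).shape)                         # torch.Size([1, 3, 64, 64])
```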
RAI-Net: Range-Adaptive LiDAR Point Cloud Frame Interpolation Network
LiDAR point cloud frame interpolation, which synthesizes the intermediate
frame between the captured frames, has emerged as an important issue for many
applications. In particular, to reduce the amount of point cloud data to be
transmitted, the intermediate frame can be predicted from the reference frames
to upsample the data to a higher frame rate. However, due to the high-dimensional and
sparse characteristics of point clouds, it is more difficult to predict the
intermediate frame for LiDAR point clouds than videos. In this paper, we
propose a novel LiDAR point cloud frame interpolation method, which exploits
range images (RIs) as an intermediate representation with CNNs to conduct the
frame interpolation process. Since the inherent characteristics of RIs
differ from those of color images, we introduce spatially adaptive convolutions
to extract range features adaptively, while a highly efficient flow estimation
method is presented to generate optical flows. The proposed model then warps
the input frames and range features based on the optical flows to synthesize
the interpolated frame. Extensive experiments on the KITTI dataset have clearly
demonstrated that our method consistently achieves superior frame interpolation
results with better perceptual quality than state-of-the-art video
frame interpolation methods. The proposed method could be integrated into any
LiDAR point cloud compression system for inter prediction.
Comment: Accepted by the IEEE International Symposium on Broadband Multimedia Systems and Broadcasting 202
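A key ingredient above is the range image (RI) representation, which turns a sparse 3-D LiDAR frame into a dense 2-D image that CNNs and optical flow can operate on. The sketch below shows a standard spherical projection from a point cloud to a range image; the field-of-view limits and image size are typical 64-beam values chosen for illustration and are not taken from the paper.

```python
import numpy as np

def pointcloud_to_range_image(points, h=64, w=1024,
                              fov_up_deg=3.0, fov_down_deg=-25.0):
    """points: (N, 3) array of x, y, z coordinates in the sensor frame."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)
    yaw = np.arctan2(y, x)                              # azimuth in [-pi, pi]
    pitch = np.arcsin(np.clip(z / np.maximum(r, 1e-8), -1.0, 1.0))

    fov_up, fov_down = np.radians(fov_up_deg), np.radians(fov_down_deg)
    u = 0.5 * (1.0 - yaw / np.pi) * w                   # column index from azimuth
    v = (fov_up - pitch) / (fov_up - fov_down) * h      # row index from elevation

    u = np.clip(np.floor(u), 0, w - 1).astype(int)
    v = np.clip(np.floor(v), 0, h - 1).astype(int)

    ri = np.zeros((h, w), dtype=np.float32)
    order = np.argsort(-r)                              # write far points first,
    ri[v[order], u[order]] = r[order]                   # so nearer points win collisions
    return ri

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.uniform(-1, 1, (20000, 3)) * np.array([50.0, 50.0, 3.0])
    print(pointcloud_to_range_image(pts).shape)         # (64, 1024)
```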