Reimagining Reality: A Comprehensive Survey of Video Inpainting Techniques
This paper offers a comprehensive analysis of recent advancements in video
inpainting techniques, a critical subset of computer vision and artificial
intelligence. As a process that restores or fills in missing or corrupted
portions of video sequences with plausible content, video inpainting has
evolved significantly with the advent of deep learning methodologies. Despite
the plethora of existing methods and their swift development, the landscape
remains complex, posing challenges to both novices and established researchers.
Our study deconstructs major techniques, their underpinning theories, and their
effective applications. Moreover, we conduct an exhaustive comparative study,
centering on two often-overlooked dimensions: visual quality and computational
efficiency. We adopt a human-centric approach to assess visual quality,
enlisting a panel of annotators to evaluate the output of different video
inpainting techniques. This provides a nuanced qualitative understanding that
complements traditional quantitative metrics. Concurrently, we delve into the
computational aspects, comparing inference times and memory demands across a
standardized hardware setup. This analysis underscores the balance between
quality and efficiency: a critical consideration for practical applications
where resources may be constrained. By integrating human validation and
computational resource comparison, this survey not only clarifies the present
landscape of video inpainting techniques but also charts a course for future
explorations in this vibrant and evolving field.
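The survey's efficiency comparison reduces to two measurements per method on fixed hardware: mean inference time and peak memory. A minimal sketch of such a measurement harness in PyTorch is shown below; the `model(frames, masks)` call signature and CUDA availability are assumptions for illustration, since real inpainting methods expose different interfaces.

```python
import time
import torch

@torch.no_grad()
def benchmark(model, frames, masks, warmup=3, runs=10):
    """Measure mean inference time (s) and peak GPU memory (MiB) for one
    video-inpainting forward pass. The model(frames, masks) interface is
    a placeholder assumption, not any specific method's API."""
    for _ in range(warmup):                      # warm up kernels/caches
        model(frames, masks)
    torch.cuda.synchronize()
    torch.cuda.reset_peak_memory_stats()
    start = time.perf_counter()
    for _ in range(runs):
        model(frames, masks)
    torch.cuda.synchronize()                     # wait for async GPU work
    mean_time = (time.perf_counter() - start) / runs
    peak_mb = torch.cuda.max_memory_allocated() / 2**20
    return mean_time, peak_mb
```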
Large-Scale Light Field Capture and Reconstruction
This thesis discusses approaches and techniques for converting Sparsely-Sampled Light Fields (SSLFs) into Densely-Sampled Light Fields (DSLFs), which can be used for visualization on 3DTV and Virtual Reality (VR) devices. As an example, a movable 1D large-scale light field acquisition system for capturing SSLFs in real-world environments is evaluated. This system consists of 24 sparsely placed RGB cameras and two Kinect V2 sensors. The real-world SSLF data captured with this setup can be leveraged to reconstruct real-world DSLFs. To this end, three challenging problems must be solved for this system: (i) how to estimate the rigid transformation from the coordinate system of a Kinect V2 to the coordinate system of an RGB camera; (ii) how to register the two Kinect V2 sensors with a large displacement; (iii) how to reconstruct a DSLF from an SSLF with moderate and large disparity ranges. To overcome these three challenges, we propose: (i) a novel self-calibration method, which takes advantage of the geometric constraints from the scene and the cameras, for estimating the rigid transformations from the camera coordinate frame of one Kinect V2 to the camera coordinate frames of the 12 nearest RGB cameras; (ii) a novel coarse-to-fine approach for recovering the rigid transformation from the coordinate system of one Kinect to the coordinate system of the other by means of local color and geometry information; (iii) several novel algorithms, falling into two groups, for reconstructing a DSLF from an input SSLF: novel view synthesis methods, inspired by state-of-the-art video frame interpolation algorithms, and Epipolar-Plane Image (EPI) inpainting methods, inspired by Shearlet Transform (ST)-based DSLF reconstruction approaches.
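Challenges (i) and (ii) both reduce to estimating a rigid transformation between two camera coordinate frames. The thesis contributes novel calibration methods for this; purely as background, the classical closed-form least-squares solution given matched 3D point pairs (the Kabsch/Umeyama algorithm) is sketched below.

```python
import numpy as np

def rigid_transform(P, Q):
    """Closed-form least-squares rigid transform (Kabsch/Umeyama):
    find R, t such that Q ~= R @ P + t, given corresponding 3D points.
    P, Q: (N, 3) arrays of matched points in the two coordinate frames."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)                # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cq - R @ cp
    return R, t
```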
A Variational Model for Joint Motion Estimation and Image Reconstruction
The aim of this paper is to derive and analyze a variational model for the joint estimation of motion and reconstruction of image sequences, which is based on a time-continuous Eulerian motion model. The model can be set up in terms of the continuity equation or the brightness constancy equation. The analysis in this paper focuses on the latter for robust motion estimation on sequences of two-dimensional images. We rigorously prove the existence of a minimizer in a suitable function space setting. Moreover, we discuss the numerical solution of the model based on primal-dual algorithms and investigate several examples. Finally, the benefits of our model compared to existing techniques, such as sequential image reconstruction and motion estimation, are shown. The work of the first author was also supported by the German Research Foundation (DFG) via the EXC 1003 Cells in Motion Cluster of Excellence, Münster, Germany.
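For reference, the brightness constancy constraint the paper builds on, together with a generic form of the resulting joint energy, can be written as follows. This is an illustrative formulation only: the precise data term and the regularizers $\mathcal{R}$, $\mathcal{S}$ are design choices, not necessarily those of the paper.

```latex
% Brightness constancy: intensities are transported along the motion field v
\partial_t u + v \cdot \nabla u = 0
% Generic joint model: reconstruct the sequence u from measurements f
% (forward operator K) while estimating v, with weights \alpha, \beta
\min_{u,\,v}\ \tfrac{1}{2}\,\|Ku - f\|_2^2
  \;+\; \alpha\,\mathcal{R}(u) \;+\; \beta\,\mathcal{S}(v)
\quad \text{s.t.}\quad \partial_t u + v \cdot \nabla u = 0
```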
S-NeRF++: Autonomous Driving Simulation via Neural Reconstruction and Generation
Autonomous driving simulation systems play a crucial role in augmenting self-driving data and simulating complex and rare traffic scenarios, ensuring navigation safety. However, traditional simulation systems, which often rely heavily on manual modeling and 2D image editing, struggle to scale to extensive scenes and to generate realistic simulation data. In this study, we
present S-NeRF++, an innovative autonomous driving simulation system based on
neural reconstruction. Trained on widely-used self-driving datasets such as
nuScenes and Waymo, S-NeRF++ can generate a large number of realistic street
scenes and foreground objects with high rendering quality as well as offering
considerable flexibility in manipulation and simulation. Specifically, S-NeRF++
is an enhanced neural radiance field for synthesizing large-scale scenes and
moving vehicles, with improved scene parameterization and camera pose learning.
The system effectively utilizes noisy and sparse LiDAR data to refine training
and address depth outliers, ensuring high quality reconstruction and novel-view
rendering. It also provides a diverse foreground asset bank through
reconstructing and generating different foreground vehicles to support
comprehensive scenario creation. Moreover, we have developed an advanced
foreground-background fusion pipeline that skillfully integrates illumination
and shadow effects, further enhancing the realism of our simulations. With the
high-quality simulated data provided by our S-NeRF++, we find that perception methods enjoy a performance boost on several autonomous driving downstream tasks, which further demonstrates the effectiveness of our proposed simulator.
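At the core of any NeRF-style system such as S-NeRF++ is volumetric rendering: densities and colors sampled along each camera ray are alpha-composited into a pixel color. Below is a minimal sketch of the standard discrete compositing step in PyTorch; it shows the generic NeRF quadrature, not S-NeRF++'s enhanced scene parameterization.

```python
import torch

def composite_rays(rgb, sigma, deltas):
    """Standard NeRF quadrature: alpha-composite per-sample colors.
    rgb:    (n_rays, n_samples, 3) colors along each ray
    sigma:  (n_rays, n_samples)    volume densities
    deltas: (n_rays, n_samples)    distances between consecutive samples"""
    alpha = 1.0 - torch.exp(-sigma * deltas)            # per-sample opacity
    # Transmittance: probability the ray reaches each sample unoccluded
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alpha[:, :1]),
                   1.0 - alpha + 1e-10], dim=1), dim=1)[:, :-1]
    weights = alpha * trans                             # (n_rays, n_samples)
    color = (weights.unsqueeze(-1) * rgb).sum(dim=1)    # (n_rays, 3)
    return color, weights
```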
Learning Joint Spatial-Temporal Transformations for Video Inpainting
High-quality video inpainting that completes missing regions in video frames
is a promising yet challenging task. State-of-the-art approaches adopt
attention models to complete a frame by searching missing contents from
reference frames, and further complete whole videos frame by frame. However,
these approaches can suffer from inconsistent attention results along spatial
and temporal dimensions, which often leads to blurriness and temporal artifacts
in videos. In this paper, we propose to learn a joint Spatial-Temporal
Transformer Network (STTN) for video inpainting. Specifically, we
simultaneously fill missing regions in all input frames by self-attention, and
propose to optimize STTN by a spatial-temporal adversarial loss. To show the
superiority of the proposed model, we conduct both quantitative and qualitative
evaluations by using standard stationary masks and more realistic moving object
masks. Demo videos are available at https://github.com/researchmm/STTN. (Accepted by ECCV 2020.)
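The key idea of joint spatial-temporal attention is that tokens from all frames attend to one another in a single sequence, so missing content in one frame can be borrowed from any location in any other frame. The PyTorch sketch below is a conceptual simplification under that assumption; STTN itself attends over multi-scale patches and handles masked regions explicitly.

```python
import torch
import torch.nn as nn

class JointSpaceTimeAttention(nn.Module):
    """Conceptual sketch: one attention pass over all frames' tokens,
    so every spatial location can copy content from any frame."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, feats):
        # feats: (batch, frames, height, width, channels) feature maps
        b, t, h, w, c = feats.shape
        tokens = feats.reshape(b, t * h * w, c)   # flatten space-time
        out, _ = self.attn(tokens, tokens, tokens)
        return out.reshape(b, t, h, w, c)
```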