1,418 research outputs found
Spherical panorama compositing through depth estimation
In this paper, we propose to work in the 2.5D space of the scene to facilitate composition of new spherical panoramas. To
add depth to spherical panoramas, we extend an existing method that was designed to estimate relative depths from a
single perspective image through user interaction. We analyze the difficulties of interactively providing such depth information
for spherical panoramas, through three different types of presentation. Then, we propose a set of basic tools to interactively
manage the relative depths of the panoramas in order to obtain a composition in a very simple way. We conclude that the
relative depths obtained by the extended depth estimation method are sufficient for compositing new photorealistic
panoramas through a few elementary editing tools.
Funding for open access charge: CRUE-Universitat Jaume
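As a rough illustration of what working in the 2.5D space of a spherical panorama can mean, the sketch below lifts an equirectangular pixel and a depth value to a 3D point. The function name and the axis convention (longitude across the width, latitude down the height, y up) are assumptions made for this example and are not taken from the paper.

```python
import math

def panorama_pixel_to_point(u, v, width, height, depth):
    """Map an equirectangular pixel (u, v) plus a depth value to a 3D point.

    Assumed convention (hypothetical, not from the paper): longitude spans
    [-pi, pi) across the image width, latitude runs from +pi/2 at the top
    row to -pi/2 at the bottom, and the y axis points up.
    """
    lon = (u / width) * 2.0 * math.pi - math.pi
    lat = math.pi / 2.0 - (v / height) * math.pi
    # Unit viewing direction on the sphere.
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    # Scale the direction by the (relative) depth to get the 2.5D point.
    return (depth * x, depth * y, depth * z)

# The image center looks straight down the +z axis under this convention.
center = panorama_pixel_to_point(1024, 512, 2048, 1024, depth=2.0)
```

Because only relative depths are needed for compositing, the scale of `depth` is arbitrary here; what matters is the ordering of points along each viewing direction.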
A robust patch-based synthesis framework for combining inconsistent images
Current methods for combining different images produce visible artifacts when the sources have very different textures and structures, come from distant viewpoints, or capture dynamic scenes with motion. In this thesis, we propose a patch-based synthesis algorithm to plausibly combine different images that have color, texture, structural, and geometric inconsistencies. For some applications such as cloning and stitching where a gradual blend is required, we present a new method for synthesizing a transition region between two source images, such that inconsistent properties change gradually from one source to the other. We call this process image melding. For gradual blending, we extend the patch-based optimization foundation with three key generalizations: First, we enrich the patch search space with additional geometric and photometric transformations. Second, we integrate image gradients into the patch representation and replace the usual color averaging with a screened Poisson equation solver. Third, we propose a new energy based on mixed L2/L0 norms for colors and gradients that produces a gradual transition between sources without sacrificing texture sharpness. Together, all three generalizations enable patch-based solutions to a broad class of image melding problems involving inconsistent sources: object cloning, stitching challenging panoramas, hole filling from multiple photos, and image harmonization. We also demonstrate another application that requires us to address inconsistencies across the images: high dynamic range (HDR) reconstruction using sequential exposures. In this application, the results will suffer from objectionable artifacts for dynamic scenes if the inconsistencies caused by significant scene motion are not handled properly. In this thesis, we propose a new approach to HDR reconstruction that uses information in all exposures while being more robust to motion than previous techniques.
Our algorithm is based on a novel patch-based energy-minimization formulation that integrates alignment and reconstruction in a joint optimization through an equation we call the HDR image synthesis equation. This allows us to produce an HDR result that is aligned to one of the exposures yet contains information from all of them. These two applications (image melding and high dynamic range reconstruction) show that patch-based methods like the one proposed in this dissertation can address inconsistent images and could open the door to many new image editing applications in the future.
A General Implicit Framework for Fast NeRF Composition and Rendering
A variety of Neural Radiance Fields (NeRF) methods have recently achieved
remarkable gains in rendering speed. However, current acceleration methods
are specialized and incompatible with various implicit methods, preventing
real-time composition over various types of NeRF works. Because NeRF relies on
sampling along rays, it is possible to provide general guidance for
acceleration. To that end, we propose a general implicit pipeline for composing
NeRF objects quickly. Our method enables the casting of dynamic shadows within
or between objects using analytical light sources while allowing multiple NeRF
objects to be seamlessly placed and rendered together under arbitrary rigid
transformations. Chiefly, our work introduces a new surface representation known
as Neural Depth Fields (NeDF) that quickly determines the spatial relationship
between objects by allowing direct intersection computation between rays and
implicit surfaces. It leverages an intersection neural network to query NeRF
for acceleration instead of depending on an explicit spatial structure. Our
proposed method is the first to enable both the progressive and interactive
composition of NeRF objects. It also serves as a previewing
plugin for a range of existing NeRF works.
Comment: 7 pages for main conten
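The idea of resolving the spatial relationship between objects from per-ray depths can be sketched as follows. This is only an illustration under strong assumptions: an analytic sphere stands in for the learned depth query, each object carries a translation only (no rotation), and all function names are hypothetical rather than part of the paper's method.

```python
import math

def sphere_depth(origin, direction, radius):
    """Distance along a unit-length ray to a sphere centered at the local
    origin, or None on a miss. Stand-in for a learned per-ray depth query."""
    b = sum(o * d for o, d in zip(origin, direction))
    c = sum(o * o for o in origin) - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return None
    t = -b - math.sqrt(disc)
    return t if t > 0.0 else None

def compose(ray_origin, ray_dir, objects):
    """Return (name, depth) of the nearest object hit by the ray, or None.

    `objects` is a list of (name, translation, radius) tuples. The ray is
    moved into each object's local frame before the depth query, so objects
    can be repositioned without touching their internal representation.
    """
    nearest = None
    for name, translation, radius in objects:
        local_origin = tuple(o - t for o, t in zip(ray_origin, translation))
        depth = sphere_depth(local_origin, ray_dir, radius)
        if depth is not None and (nearest is None or depth < nearest[1]):
            nearest = (name, depth)
    return nearest

scene = [("A", (0.0, 0.0, 5.0), 1.0), ("B", (0.0, 0.0, 3.0), 0.5)]
hit = compose((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), scene)
```

A full rigid transformation would also rotate the ray origin and direction into the local frame; that step is omitted here to keep the sketch short.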
A network transparent, retained mode multimedia processing framework for the Linux operating system environment
This thesis presents a multimedia framework for Linux that, unlike earlier work, is based on the ideas of "retained-mode processing" and "lazy evaluation": instead of executing transformations immediately, it builds an abstract representation of all media elements. "Renderer" drivers act as translators that turn this representation into concrete operations at run time, and the data model permits numerous optimizations that reduce the number of steps or minimize communication. This allows a greatly simplified programming model together with increased efficiency. "Renderer" drivers can use the local processor to execute transformations, or they can delegate the operations. The thesis presents an extension of the X Window System with mechanisms for media processing, as well as a "renderer" driver that uses these mechanisms to delegate the processing.
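A minimal sketch of retained-mode processing with lazy evaluation, assuming a toy audio-like pipeline: operations only build a graph, an optimizer pass fuses adjacent steps, and a renderer executes the result on demand. The class and function names are hypothetical and do not reflect the framework's actual API.

```python
class Node:
    """One element of the retained description of a media pipeline."""
    def __init__(self, op, *inputs, value=None):
        self.op = op
        self.inputs = inputs
        self.value = value

def source(samples):
    # Records the data; nothing is processed yet.
    return Node("source", value=list(samples))

def gain(node, factor):
    # Records a transformation instead of applying it immediately.
    return Node("gain", node, value=factor)

def optimize(node):
    """Fuse chains of gain nodes into one to reduce the number of steps."""
    if node.op == "gain":
        child = optimize(node.inputs[0])
        if child.op == "gain":
            return Node("gain", child.inputs[0], value=node.value * child.value)
        return Node("gain", child, value=node.value)
    return node

def count_steps(node):
    return 1 + sum(count_steps(i) for i in node.inputs)

def render_local(node):
    """A renderer that executes the graph on the local processor."""
    if node.op == "source":
        return list(node.value)
    if node.op == "gain":
        return [s * node.value for s in render_local(node.inputs[0])]
    raise ValueError(f"unknown op: {node.op}")

graph = gain(gain(gain(source([1, 2, 3]), 2), 3), 0.5)
fused = optimize(graph)            # three gain steps collapse into one
samples = render_local(fused)      # work happens only at render time
```

A delegating renderer would walk the same graph but emit requests to a remote server instead of computing the samples itself, which is why the graph is kept separate from its execution.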
The Video Mesh: A Data Structure for Image-based Video Editing
This paper introduces the video mesh, a data structure for representing video as 2.5D "paper cutouts." The video mesh allows interactive editing of moving objects and modeling of depth, which enables 3D effects and post-exposure camera control. The video mesh sparsely encodes optical flow as well as depth, and handles occlusion using local layering and alpha mattes. Motion is described by a sparse set of points tracked over time. Each point also stores a depth value. The video mesh is a triangulation over this point set, and per-pixel information is obtained by interpolation. The user rotoscopes occluding contours and we introduce an algorithm to cut the video mesh along them. Object boundaries are refined with per-pixel alpha values. Because the video mesh is at its core a set of texture-mapped triangles, we leverage graphics hardware to enable interactive editing and rendering of a variety of effects. We demonstrate the effectiveness of our representation with a number of special effects including 3D viewpoint changes, object insertion, and depth-of-field manipulation.
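Obtaining per-pixel values by interpolation over a triangulation can be sketched with barycentric weights. The abstract does not state which interpolation scheme is used, so linear (barycentric) interpolation is an assumption here, and the function names are hypothetical.

```python
def barycentric_weights(p, a, b, c):
    """Barycentric weights of 2D point p with respect to triangle (a, b, c)."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if det == 0:
        raise ValueError("degenerate triangle")
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return wa, wb, 1.0 - wa - wb

def interpolate_depth(p, triangle, depths):
    """Linearly interpolate the per-vertex depths at pixel position p."""
    wa, wb, wc = barycentric_weights(p, *triangle)
    return wa * depths[0] + wb * depths[1] + wc * depths[2]

triangle = ((0.0, 0.0), (4.0, 0.0), (0.0, 4.0))   # tracked points
depths = (1.0, 2.0, 3.0)                          # depth stored per point
d = interpolate_depth((1.0, 1.0), triangle, depths)
```

The same weights can interpolate any other per-vertex quantity, such as a motion vector, which is what lets graphics hardware do this work when the triangles are rasterized.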
Static scene illumination estimation from video with applications
We present a system that automatically recovers scene geometry and illumination from a video, providing a basis for various applications. Previous image-based illumination estimation methods require either user interaction or external information in the form of a database. We adopt structure-from-motion and multi-view stereo for initial scene reconstruction, and then estimate an environment map represented by spherical harmonics (as these perform better than other bases). We also demonstrate several video editing applications that exploit the recovered geometry and illumination, including object insertion (e.g., for augmented reality), shadow detection, and video relighting.
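To make the spherical-harmonics representation concrete, the sketch below evaluates an environment map stored as nine real spherical-harmonic coefficients (bands 0 to 2) in a given direction. The abstract does not state the order used, so the nine-coefficient choice, the coefficient ordering, and the function names are assumptions for illustration.

```python
def sh_basis(direction):
    """Real spherical-harmonic basis values for bands 0..2.

    `direction` must be a unit vector (x, y, z). The ordering used here is
    Y00, Y1-1, Y10, Y11, Y2-2, Y2-1, Y20, Y21, Y22.
    """
    x, y, z = direction
    return [
        0.282095,
        0.488603 * y,
        0.488603 * z,
        0.488603 * x,
        1.092548 * x * y,
        1.092548 * y * z,
        0.315392 * (3.0 * z * z - 1.0),
        1.092548 * x * z,
        0.546274 * (x * x - y * y),
    ]

def eval_environment(coeffs, direction):
    """Lighting value in `direction` for one color channel, given nine
    spherical-harmonic coefficients."""
    return sum(c * b for c, b in zip(coeffs, sh_basis(direction)))

# A constant (ambient-only) environment uses just the first coefficient.
ambient = [1.0] + [0.0] * 8
value = eval_environment(ambient, (0.0, 0.0, 1.0))
```

Estimating the environment map then amounts to solving for `coeffs`, typically one set per color channel, so that rendered shading matches the observed video frames.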