    A system for image-based modeling and photo editing

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Architecture, 2002. Includes bibliographical references (p. 169-178). By Byong Mok Oh.

    Traditionally in computer graphics, a scene is represented by geometric primitives composed of various materials and a collection of lights. Recently, techniques for modeling and rendering scenes from a set of pre-acquired images have emerged as an alternative approach, known as image-based modeling and rendering. Much of the research in this field has focused on reconstructing and re-rendering from a set of photographs, while little work has addressed the problem of editing and modifying these scenes. Photo-editing systems such as Adobe Photoshop, on the other hand, provide a powerful, intuitive, and practical means to edit images, but they are limited by their two-dimensional nature.

    In this thesis, we present a system that extends photo editing to 3D. Starting from a single input image, the system enables the user to reconstruct a 3D representation of the captured scene and edit it with the ease and versatility of 2D photo editing. The scene is represented as layers of images with depth, where each layer is an image that encodes both color and depth. A suite of user-assisted tools, based on a painting metaphor, is employed to extract layers and assign depths. The system enables editing from different viewpoints, extracting and grouping image-based objects, and modifying the shape, color, and illumination of these objects.

    As part of the system, we introduce three powerful new editing tools. The first two are clone brushes: the non-distorted clone brush and the structure-preserving clone brush. They permit copying parts of an image to another via a brush interface while alleviating distortions due to perspective foreshortening and object geometry. The non-distorted clone brush works on arbitrary 3D geometry, while the structure-preserving clone brush, a 2D version, assumes a planar surface but has the added advantage of working directly in 2D photo-editing systems that lack depth information. The third tool, a texture-illuminance decoupling filter, discounts the effect of illumination on uniformly textured areas by separating large- and small-scale features via bilateral filtering. This tool is crucial for relighting and changing the materials of the scene.

    Such a system has many applications, for example architectural, lighting, and landscape design; entertainment and special effects; games; and virtual TV sets. The system allows the user to superimpose scaled architectural models into real environments, or to quickly paint a desired lighting scheme of an interior, while navigating within the scene for a fully immersive 3D experience. We present examples and results on complex architectural scenes, 360-degree panoramas, and even paintings, where the user can change viewpoints, edit the geometry and materials, and relight the environment.
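
    The texture-illuminance decoupling step lends itself to a short illustration. Below is a minimal sketch of the idea, assuming OpenCV's bilateral filter as the edge-preserving smoother: the filtered result serves as the large-scale (illuminance) layer, and dividing it out leaves the small-scale (texture) layer. Function names and parameter values are illustrative, not the thesis implementation.

```python
# A minimal sketch, assuming OpenCV's bilateral filter as the edge-preserving
# smoother. Function and parameter names are illustrative, not the thesis code.
import cv2
import numpy as np

def decouple_texture_illuminance(image, d=15, sigma_color=40.0, sigma_space=15.0):
    """Split a uniformly textured image region into large-scale (illuminance)
    and small-scale (texture) layers."""
    img = image.astype(np.float32) + 1e-3            # avoid division by zero
    # Bilateral filtering smooths slow illumination variation while keeping
    # strong edges, giving the large-scale layer.
    large_scale = cv2.bilateralFilter(img, d, sigma_color, sigma_space)
    small_scale = img / large_scale                  # multiplicative texture residual
    return large_scale, small_scale

def relight(image, new_illuminance):
    """Combine a new illumination layer with the recovered texture layer."""
    _, texture = decouple_texture_illuminance(image)
    return np.clip(texture * new_illuminance, 0, 255).astype(np.uint8)
```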

    SINE: Semantic-driven Image-based NeRF Editing with Prior-guided Editing Field

    Despite the great success of 2D editing using user-friendly tools such as Photoshop, semantic strokes, or even text prompts, similar capabilities in 3D remain limited, either relying on 3D modeling skills or allowing editing within only a few categories. In this paper, we present a novel semantic-driven NeRF editing approach that enables users to edit a neural radiance field with a single image and faithfully delivers edited novel views with high fidelity and multi-view consistency. To achieve this goal, we propose a prior-guided editing field to encode fine-grained geometric and texture editing in 3D space, and develop a series of techniques to aid the editing process, including cyclic constraints with a proxy mesh to facilitate geometric supervision, a color compositing mechanism to stabilize semantic-driven texture editing, and a feature-cluster-based regularization to keep content irrelevant to the edit unchanged. Extensive experiments and editing examples on both real-world and synthetic data demonstrate that our method achieves photo-realistic 3D editing using only a single edited image, pushing the boundary of semantic-driven editing in 3D real-world scenes. Project page: https://zju3dv.github.io/sine/. Accepted to CVPR 2023.
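
    As a rough illustration of the editing-field idea, the PyTorch sketch below layers a small warp network (geometric edit) and a color-compositing network (texture edit) on top of a frozen template NeRF. The `base_nerf.query` interface is a hypothetical stand-in, and the paper's priors, cyclic constraints, and regularizers are omitted.

```python
# A rough PyTorch sketch; `base_nerf.query` is a hypothetical frozen-template
# API, and the paper's priors and regularizers are left out.
import torch
import torch.nn as nn

class EditingField(nn.Module):
    """Small MLPs encoding a geometric warp and a color edit in 3D space."""
    def __init__(self, hidden=128):
        super().__init__()
        self.warp = nn.Sequential(nn.Linear(3, hidden), nn.ReLU(),
                                  nn.Linear(hidden, 3))
        self.color = nn.Sequential(nn.Linear(3, hidden), nn.ReLU(),
                                   nn.Linear(hidden, 4))

    def forward(self, x, base_nerf):
        # Geometric edit: bend sample points before querying the frozen NeRF.
        sigma, rgb = base_nerf.query(x + self.warp(x))
        # Texture edit: blend a predicted color over the template color with a
        # per-point weight (a simple stand-in for the compositing mechanism).
        out = self.color(x)
        edit_rgb, alpha = torch.sigmoid(out[..., :3]), torch.sigmoid(out[..., 3:])
        return sigma, (1 - alpha) * rgb + alpha * edit_rgb
```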

    Calipso: Physics-based Image and Video Editing through CAD Model Proxies

    We present Calipso, an interactive method for editing images and videos in a physically-coherent manner. Our main idea is to realize physics-based manipulations by running a full physics simulation on proxy geometries given by non-rigidly aligned CAD models. Running these simulations allows us to apply new, unseen forces to move or deform selected objects, change physical parameters such as mass or elasticity, or even add entire new objects that interact with the rest of the underlying scene. In Calipso, the user makes edits directly in 3D; these edits are processed by the simulation and then transferred to the target 2D content using shape-to-image correspondences in a photo-realistic rendering process. To align the CAD models, we introduce an efficient CAD-to-image alignment procedure that jointly solves for rigid and non-rigid alignment while preserving the high-level structure of the input shape. Moreover, the user can choose to exploit image flow to estimate scene motion, producing coherent physical behavior with ambient dynamics. We demonstrate Calipso's physics-based editing on a wide range of examples, producing myriad physical behaviors while preserving geometric and visual consistency. (11 pages.)
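
    The joint rigid and non-rigid alignment can be pictured as a single optimization over a global pose and small per-vertex offsets, fit to 2D correspondences. The sketch below is a simplified illustration under that reading; the `project` camera function is assumed given, and the plain offset penalty stands in for the paper's structure-preserving prior.

```python
# A simplified sketch; `project` (a differentiable 3D-to-2D camera projection)
# is assumed given, and the plain offset penalty is a stand-in for the paper's
# structure-preserving prior.
import torch

def skew(v):
    """3x3 skew-symmetric matrix of an axis-angle vector."""
    z = torch.zeros((), dtype=v.dtype)
    return torch.stack([torch.stack([z, -v[2], v[1]]),
                        torch.stack([v[2], z, -v[0]]),
                        torch.stack([-v[1], v[0], z])])

def align(vertices, targets_2d, project, iters=200, w_reg=10.0):
    omega = torch.zeros(3, requires_grad=True)            # axis-angle rotation
    t = torch.zeros(3, requires_grad=True)                # translation
    d = torch.zeros_like(vertices, requires_grad=True)    # per-vertex offsets
    opt = torch.optim.Adam([omega, t, d], lr=1e-2)
    for _ in range(iters):
        R = torch.linalg.matrix_exp(skew(omega))          # rigid rotation
        deformed = (vertices + d) @ R.T + t               # rigid + non-rigid
        data = ((project(deformed) - targets_2d) ** 2).mean()
        reg = (d ** 2).mean()                             # keep offsets small
        opt.zero_grad(); (data + w_reg * reg).backward(); opt.step()
    return omega.detach(), t.detach(), d.detach()
```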

    Creating virtual models from uncalibrated camera views

    The reconstruction of photorealistic 3D models from camera views is becoming a ubiquitous element in many applications that simulate physical interaction with the real world. In this paper, we present a low-cost, interactive pipeline, aimed at non-expert users, that achieves 3D reconstruction from multiple views acquired with a standard digital camera. 3D models are amenable to access through diverse representation modalities that typically imply trade-offs between level of detail, interaction, and computational cost. Our approach allows users to selectively control the complexity of different surface regions while requiring only simple 2D image editing operations. An initial reconstruction at coarse resolution is followed by iterative refinement of the surface areas corresponding to the selected regions.
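
    The selective-refinement loop might look like the sketch below: faces whose centroids project into a user-painted 2D mask are subdivided each round, while the rest of the mesh stays coarse. trimesh's midpoint subdivision and the `project` function are stand-ins for the paper's actual refinement operator and camera calibration.

```python
# A minimal sketch using trimesh's midpoint subdivision; `project` (mapping 3D
# points to integer pixel coordinates) and the boolean `mask` come from the
# interactive 2D editing step and are assumed given.
import numpy as np
import trimesh

def refine_selected(vertices, faces, project, mask, rounds=2):
    for _ in range(rounds):
        centers = vertices[faces].mean(axis=1)        # face centroids, (F, 3)
        px = project(centers)                         # pixel coords, (F, 2) int
        inside = mask[px[:, 1], px[:, 0]]             # faces under the mask
        vertices, faces = trimesh.remesh.subdivide(
            vertices, faces, face_index=np.where(inside)[0])
    return vertices, faces
```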

    HeadOn: Real-time Reenactment of Human Portrait Videos

    We propose HeadOn, the first real-time source-to-target reenactment approach for complete human portrait videos that enables transfer of torso and head motion, face expression, and eye gaze. Given a short RGB-D video of the target actor, we automatically construct a personalized geometry proxy that embeds a parametric head, eye, and kinematic torso model. A novel real-time reenactment algorithm employs this proxy to photo-realistically map the captured motion from the source actor to the target actor. On top of the coarse geometric proxy, we propose a video-based rendering technique that composites the modified target portrait video via view- and pose-dependent texturing, and creates photo-realistic imagery of the target actor under novel torso and head poses, facial expressions, and gaze directions. To this end, we propose robust tracking of the face and torso of the source actor. We extensively evaluate our approach and show that it enables much greater flexibility in creating realistic reenacted output videos. Video: https://www.youtube.com/watch?v=7Dg49wv2c_g. Presented at SIGGRAPH 2018.
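
    View- and pose-dependent texturing can be illustrated with a simple blending rule: textures from captured frames are composited with weights that favor source views closest to the novel target view. The cosine-falloff weighting below is an illustrative assumption, not the paper's scheme.

```python
# A simple illustrative blending rule; the cosine-falloff weights are an
# assumption, not the paper's scheme.
import numpy as np

def blend_views(textures, source_dirs, target_dir, sharpness=8.0):
    """textures: (K, H, W, 3) per-frame textures; source_dirs: (K, 3) and
    target_dir: (3,) unit viewing directions."""
    cos = np.clip(source_dirs @ target_dir, 0.0, 1.0)   # view similarity
    w = cos ** sharpness                                # favor nearby views
    w = w / (w.sum() + 1e-8)                            # normalize weights
    return np.tensordot(w, textures, axes=1)            # weighted composite
```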

    High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs

    We present a new method for synthesizing high-resolution photo-realistic images from semantic label maps using conditional generative adversarial networks (conditional GANs). Conditional GANs have enabled a variety of applications, but the results are often limited to low resolution and still far from realistic. In this work, we generate 2048x1024 visually appealing results with a novel adversarial loss, as well as new multi-scale generator and discriminator architectures. Furthermore, we extend our framework to interactive visual manipulation with two additional features. First, we incorporate object instance segmentation information, which enables object manipulations such as removing/adding objects and changing the object category. Second, we propose a method to generate diverse results given the same input, allowing users to edit the object appearance interactively. Human opinion studies demonstrate that our method significantly outperforms existing methods, advancing both the quality and the resolution of deep image synthesis and editing. (CVPR camera-ready version, with additional edge-to-photo examples.)
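
    The multi-scale discriminator idea admits a compact sketch: identical PatchGAN-style discriminators look at the image at several downsampled resolutions, so the coarsest one covers the largest receptive field. Layer sizes below are illustrative, and the paper's feature-matching loss and instance-map conditioning are omitted.

```python
# A rough PyTorch sketch with illustrative layer sizes; the feature-matching
# loss and instance-map conditioning from the paper are omitted.
import torch.nn as nn

def patch_discriminator(in_ch=3):
    """PatchGAN-style discriminator producing a per-patch real/fake map."""
    return nn.Sequential(
        nn.Conv2d(in_ch, 64, 4, stride=2, padding=2), nn.LeakyReLU(0.2),
        nn.Conv2d(64, 128, 4, stride=2, padding=2), nn.LeakyReLU(0.2),
        nn.Conv2d(128, 256, 4, stride=2, padding=2), nn.LeakyReLU(0.2),
        nn.Conv2d(256, 1, 4, stride=1, padding=2))

class MultiScaleDiscriminator(nn.Module):
    def __init__(self, num_scales=3):
        super().__init__()
        self.nets = nn.ModuleList(patch_discriminator() for _ in range(num_scales))
        self.down = nn.AvgPool2d(3, stride=2, padding=1)

    def forward(self, x):
        outs = []
        for net in self.nets:
            outs.append(net(x))     # patch predictions at this scale
            x = self.down(x)        # halve resolution: larger receptive field
        return outs
```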