
    Transferring Image-based Edits for Multi-Channel Compositing

    A common way to generate high-quality product images is to start with a physically-based render of a 3D scene, apply image-based edits on individual render channels, and then composite the edited channels together (in some cases, on top of a background photograph). This workflow requires users to manually select the right render channels, prescribe channel-specific masks, and set appropriate edit parameters. Unfortunately, such edits cannot be easily reused for global variations of the original scene, such as a rigid-body transformation of the 3D objects or a modified viewpoint, which discourages iterative refinement of both global scene changes and image-based edits. We propose a method to automatically transfer such user edits across variations of object geometry, illumination, and viewpoint. This transfer problem is challenging since many edits may be visually plausible but non-physical, with a successful transfer dependent on an unknown set of scene attributes that may include both photometric and non-photometric features. To address this challenge, we present a transfer algorithm that extends the image analogies formulation to include an augmented set of photometric and non-photometric guidance channels and, more importantly, adaptively estimate weights for the various candidate channels in a way that matches the characteristics of each individual edit. We demonstrate our algorithm on a variety of complex edit-transfer scenarios for creating high-quality product images.
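
    The adaptive weighting scheme is not spelled out in the abstract; as a rough, hypothetical sketch (the correlation-based scoring and the function names below are our own assumptions, not the paper's formulation), one could score each candidate guidance channel by how well it predicts the observed edit and use the normalized scores in a weighted, image-analogies-style patch distance:

    import numpy as np

    def estimate_channel_weights(channels, edit, eps=1e-8):
        """Score each candidate guidance channel by its absolute normalized
        correlation with the per-pixel edit magnitude, then normalize the
        scores so they sum to one.

        channels : list of (H, W) arrays -- photometric / non-photometric guides
        edit     : (H, W) array          -- e.g. |edited channel - original channel|
        """
        e = edit - edit.mean()
        scores = []
        for c in channels:
            c0 = c - c.mean()
            corr = abs(float((c0 * e).sum())) / (np.linalg.norm(c0) * np.linalg.norm(e) + eps)
            scores.append(corr)
        w = np.asarray(scores)
        return w / (w.sum() + eps)

    def weighted_patch_distance(patch_a, patch_b, weights):
        """Weighted SSD between two (num_channels, ph, pw) guidance-patch stacks,
        as used when searching for analogous patches in the varied scene."""
        return float((weights[:, None, None] * (patch_a - patch_b) ** 2).sum())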

    Assistive visual content creation tools via multimodal correlation analysis

    Visual imagery is ubiquitous in society and can take various formats: from 2D sketches and photographs to photorealistic 3D renderings and animations. The creation processes for each of these media have their own unique challenges and methodologies that artists need to overcome and master. For example, for an artist to depict a 3D scene in a 2D drawing, they need to understand foreshortening effects to position and scale objects accurately on the page; or, when modeling 3D scenes, artists need to understand how light interacts with objects and materials to achieve a desired appearance. Many of these tasks can be complex, time-consuming, and repetitive for content creators. The goal of this thesis is to develop tools that relieve artists of some of these issues and assist them in the creation process. The key hypothesis is that understanding the relationships between multiple signals present in the scene being created enables such assistive tools. This thesis proposes three assistive tools. First, we present an image degradation model for depth-augmented image editing to help evaluate the quality of the image manipulation. Second, we address the problem of teaching novices to draw objects accurately by automatically generating easy-to-follow sketching tutorials for arbitrary 3D objects. Finally, we propose a method to automatically transfer 2D parametric user edits made to rendered 3D scenes to global variations of the original scene.

    Rewriting a Deep Generative Model

    A deep generative model such as a GAN learns to model a rich set of semantic and physical rules about the target distribution, but up to now, it has been obscure how such rules are encoded in the network, or how a rule could be changed. In this paper, we introduce a new problem setting: manipulation of specific rules encoded by a deep generative model. To address the problem, we propose a formulation in which the desired rule is changed by manipulating a layer of a deep network as a linear associative memory. We derive an algorithm for modifying one entry of the associative memory, and we demonstrate that several interesting structural rules can be located and modified within the layers of state-of-the-art generative models. We present a user interface to enable users to interactively change the rules of a generative model to achieve desired effects, and we show several proof-of-concept applications. Finally, results on multiple datasets demonstrate the advantage of our method against standard fine-tuning methods and edit transfer algorithms. (ECCV 2020, oral. Code at https://github.com/davidbau/rewriting; videos and demos at https://rewriting.csail.mit.edu.)
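
    Viewing a layer's weights W as a linear associative memory that maps keys k to values v (v ≈ W k), a rule edit asks for new weights that send a chosen key k* to a target value v* while disturbing the response to typical keys as little as possible. Below is a minimal numeric sketch of that constrained rank-one update; it illustrates the associative-memory idea only and is not the released implementation, which operates on full convolutional tensors (see the linked repository).

    import numpy as np

    def rank_one_rule_edit(W, K, k_star, v_star, ridge=1e-4):
        """Rank-one update so that W_new @ k_star == v_star, while minimizing
        the change as measured against the second-moment matrix of sample keys.

        W      : (out_dim, key_dim) layer weights viewed as associative memory
        K      : (key_dim, n)       sample keys observed at this layer
        k_star : (key_dim,)         key selecting the context to rewrite
        v_star : (out_dim,)         desired output for that context
        """
        C = K @ K.T + ridge * np.eye(K.shape[0])   # key second-moment matrix
        d = np.linalg.solve(C, k_star)             # update direction C^{-1} k*
        residual = v_star - W @ k_star             # gap between current and desired output
        return W + np.outer(residual, d) / (k_star @ d)

    # sanity check on random data
    rng = np.random.default_rng(0)
    W = rng.normal(size=(8, 16))
    K = rng.normal(size=(16, 100))
    k_star, v_star = rng.normal(size=16), rng.normal(size=8)
    assert np.allclose(rank_one_rule_edit(W, K, k_star, v_star) @ k_star, v_star)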

    RAID: A relation-augmented image descriptor

    As humans, we regularly interpret scenes based on how objects are related, rather than based on the objects themselves. For example, we see a person riding an object X or a plank bridging two objects. Current methods provide limited support to search for content based on such relations. We present RAID, a relation-augmented image descriptor that supports queries based on inter-region relations. The key idea of our descriptor is to encode region-to-region relations as the spatial distribution of point-to-region relationships between two image regions. RAID allows sketch-based retrieval and requires minimal training data, thus making it suited even for querying uncommon relations. We evaluate the proposed descriptor by querying large image databases and successfully extract nontrivial images demonstrating complex inter-region relations, which are easily missed or erroneously classified by existing methods. We assess the robustness of RAID on multiple datasets, even when the region segmentation is computed automatically or is very noisy.
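
    As a toy illustration of "the spatial distribution of point-to-region relationships" (not the actual RAID construction, whose binning and normalization differ), one can accumulate, for every point of one region, the relative polar positions of the other region into a histogram:

    import numpy as np

    def relation_histogram(mask_a, mask_b, n_radius=4, n_angle=8):
        """Toy relation descriptor between two binary region masks: a 2D
        histogram over (distance, direction) of region-B pixels as seen from
        region-A pixels.  Radii are normalized by the largest observed
        distance; for large regions, subsample the point sets first.
        """
        pts_a = np.argwhere(mask_a)                              # (Na, 2) row/col
        pts_b = np.argwhere(mask_b)                              # (Nb, 2)
        diff = pts_b[None, :, :] - pts_a[:, None, :]             # (Na, Nb, 2) offsets
        radius = np.linalg.norm(diff, axis=-1).ravel()
        angle = np.arctan2(diff[..., 0], diff[..., 1]).ravel()   # in [-pi, pi]
        r_edges = np.linspace(0.0, radius.max() + 1e-8, n_radius + 1)
        a_edges = np.linspace(-np.pi, np.pi, n_angle + 1)
        hist, _, _ = np.histogram2d(radius, angle, bins=[r_edges, a_edges])
        return hist / hist.sum()                                 # normalized distribution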

    Consistent Video Filtering for Camera Arrays

    Visual formats have advanced beyond single-view images and videos: 3D movies are commonplace, researchers have developed multi-view navigation systems, and VR is helping to push light field cameras to the mass market. However, editing tools for these media are still nascent, and even simple filtering operations like color correction or stylization are problematic: naively applying image filters per frame or per view rarely produces satisfying results due to time and space inconsistencies. Our method preserves and stabilizes filter effects while being agnostic to the inner workings of the filter. It captures filter effects in the gradient domain, then uses input frame gradients as a reference to impose temporal and spatial consistency. Our least-squares formulation adds minimal overhead compared to naive data processing. Further, when filter cost is high, we introduce a filter transfer strategy that reduces the number of per-frame filtering computations by an order of magnitude, with only a small reduction in visual quality. We demonstrate our algorithm on several camera array formats including stereo videos, light fields, and wide baselines.
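
    A much-simplified sketch of the kind of least-squares solve this describes (grayscale, a single frame, no view warping; the weighting constants are our own placeholders, and the real method also enforces consistency across views and reuses filter computations):

    import numpy as np
    import scipy.sparse as sp
    import scipy.sparse.linalg as spla

    def diff_op(n):
        """(n-1) x n sparse forward-difference operator."""
        return sp.diags([-np.ones(n - 1), np.ones(n - 1)], [0, 1], shape=(n - 1, n))

    def consistent_frame(filtered, prev_output, prev_input, cur_input, lam=5.0):
        """Solve for a frame whose spatial gradients match the filtered frame
        while staying close to the previous output wherever the *input* frames
        indicate the content is static (all images: (H, W) grayscale floats)."""
        H, W = filtered.shape
        Dx = sp.kron(sp.eye(H), diff_op(W))        # horizontal gradients
        Dy = sp.kron(diff_op(H), sp.eye(W))        # vertical gradients

        # temporal weight: trust the previous output only where the inputs agree
        w_t = lam * np.exp(-50.0 * (cur_input - prev_input) ** 2).ravel()
        Wt = sp.diags(w_t)

        A = (Dx.T @ Dx + Dy.T @ Dy + Wt).tocsc()
        b = (Dx.T @ (Dx @ filtered.ravel()) + Dy.T @ (Dy @ filtered.ravel())
             + w_t * prev_output.ravel())
        return spla.spsolve(A, b).reshape(H, W)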

    A generic tool for interactive complex image editing

    Many complex image editing techniques require a certain per-pixel property or magnitude to be known, e.g., simulating depth-of-field effects requires a depth map. This work presents an efficient interaction paradigm that approximates any per-pixel magnitude from a few user strokes by propagating the sparse user input to each pixel of the image. The propagation scheme is based on a linear least-squares system of equations that encodes local and neighborhood constraints over superpixels. After each user input, the system responds immediately, propagating the values and applying the corresponding filter. Our interaction paradigm is generic, enabling image editing applications to run at interactive rates by changing just the image processing algorithm while keeping our proposed propagation scheme. We illustrate this through three interactive applications: depth-of-field simulation, dehazing, and tone mapping.
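
    A minimal sketch of such a propagation solve (the color-similarity weighting and the lam and sigma constants are illustrative choices, not the paper's exact constraints) could look like this:

    import numpy as np
    import scipy.sparse as sp
    import scipy.sparse.linalg as spla

    def propagate_strokes(mean_colors, adjacency, stroke_values, lam=10.0, sigma=0.1):
        """Propagate sparse user strokes to every superpixel via a sparse
        linear least-squares system.

        mean_colors   : (N, 3) mean color of each superpixel
        adjacency     : iterable of (i, j) neighboring-superpixel index pairs
        stroke_values : dict {superpixel index: user-specified scalar value}
        """
        N = len(mean_colors)
        rows, cols, data, rhs = [], [], [], []
        r = 0

        # smoothness: similar-colored neighbors should receive similar values
        for i, j in adjacency:
            w = np.exp(-np.sum((mean_colors[i] - mean_colors[j]) ** 2) / (2 * sigma ** 2))
            rows += [r, r]; cols += [i, j]; data += [w, -w]; rhs.append(0.0)
            r += 1

        # data term: stroked superpixels should keep the user's value
        for i, v in stroke_values.items():
            rows.append(r); cols.append(i); data.append(lam); rhs.append(lam * v)
            r += 1

        A = sp.csr_matrix((data, (rows, cols)), shape=(r, N))
        return spla.lsqr(A, np.asarray(rhs))[0]   # one propagated value per superpixel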

    Partial shape matching using transformation parameter similarity

    In this paper, we present a method for non-rigid, partial shape matching in vector graphics. Given a user-specified query region in a 2D shape, similar regions are found, even if they are non-linearly distorted. Furthermore, a non-linear mapping is established between the query regions and these matches, which allows the automatic transfer of editing operations such as texturing. This is achieved by a two-step approach. First, pointwise correspondences between the query region and the whole shape are established. The transformation parameters of these correspondences are registered in an appropriate transformation space. For transformations between similar regions, these parameters form surfaces in transformation space, which are extracted in the second step of our method. The extracted regions may be related to the query region by a non-rigid transform, enabling non-rigid shape matching.
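
    As a simplified, hypothetical illustration of the first step only (registering per-correspondence transformation parameters and grouping nearby ones; the actual method extracts surfaces in transformation space to handle non-rigid deformation, and the parameters would need consistent scaling before clustering):

    import numpy as np
    from scipy.cluster.hierarchy import fcluster, linkage

    def transform_params(matches):
        """Per-correspondence similarity-transform parameters.

        matches : list of ((x, y, scale, angle), (x2, y2, scale2, angle2)) pairs,
                  e.g. from a local feature matcher (hypothetical format).
        Returns an (N, 4) array of (rotation, log-scale, tx, ty).
        """
        params = []
        for (x, y, s, a), (x2, y2, s2, a2) in matches:
            rot = (a2 - a + np.pi) % (2 * np.pi) - np.pi       # wrap to [-pi, pi)
            k = s2 / s
            c, si = np.cos(rot), np.sin(rot)
            # translation that carries the query point onto its match
            tx = x2 - k * (c * x - si * y)
            ty = y2 - k * (si * x + c * y)
            params.append([rot, np.log(k), tx, ty])
        return np.asarray(params)

    def group_matches(params, threshold=1.0):
        """Group correspondences whose transformation parameters lie close
        together in transformation space (single-linkage clustering)."""
        return fcluster(linkage(params, method="single"), t=threshold,
                        criterion="distance")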