    StyleGAN Salon: Multi-View Latent Optimization for Pose-Invariant Hairstyle Transfer

    Our paper seeks to transfer the hairstyle of a reference image to an input photo for virtual hair try-on. We target a variety of challenging scenarios, such as transforming a long hairstyle with bangs into a pixie cut, which requires removing the existing hair and inferring how the forehead would look, or transferring partially visible hair from a hat-wearing person in a different pose. Past solutions leverage StyleGAN to hallucinate any missing parts and produce a seamless face-hair composite through so-called GAN inversion or projection. However, it remains challenging to control the hallucinations so that they accurately transfer the hairstyle while preserving the face shape and identity of the input. To overcome this, we propose a multi-view optimization framework that uses "two different views" of reference composites to semantically guide occluded or ambiguous regions. Our optimization shares information between two poses, which allows us to produce high-fidelity and realistic results from incomplete references. Our framework produces high-quality results and outperforms prior work in a user study that consists of significantly more challenging hair transfer scenarios than previously studied. Project page: https://stylegan-salon.github.io/. Comment: Accepted to CVPR 2023.
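
    The paper's pipeline builds on a pretrained StyleGAN, but the core idea of jointly optimizing latents against two reference views can be pictured in a few lines. Below is a minimal, hypothetical PyTorch sketch with a toy stand-in generator; the views, masks, coupling weight, and loss choices are placeholders, not the authors' settings.

```python
# Hypothetical two-view latent optimization; a toy module stands in
# for the pretrained StyleGAN used in the paper.
import torch
import torch.nn.functional as F

class ToyGenerator(torch.nn.Module):
    """Stand-in generator: maps a latent vector to an image."""
    def __init__(self, dim=64, res=32):
        super().__init__()
        self.fc = torch.nn.Linear(dim, 3 * res * res)
        self.res = res

    def forward(self, w):
        return self.fc(w).view(-1, 3, self.res, self.res)

G = ToyGenerator()
G.requires_grad_(False)  # the generator stays frozen; only latents move
# Two "views" of the reference composite plus per-view visibility masks
# (assumed precomputed; random placeholders here).
views = [torch.rand(1, 3, 32, 32) for _ in range(2)]
masks = [torch.rand(1, 1, 32, 32) for _ in range(2)]
# One latent per view, softly tied so the two poses share information.
w = [torch.randn(1, 64, requires_grad=True) for _ in range(2)]
opt = torch.optim.Adam(w, lr=0.01)

for step in range(200):
    opt.zero_grad()
    loss = torch.tensor(0.0)
    for i in range(2):
        # Only regions visible in view i guide its reconstruction.
        img = G(w[i])
        loss = loss + F.l1_loss(img * masks[i], views[i] * masks[i])
    # Consistency term couples the two poses' latents.
    loss = loss + 0.1 * F.mse_loss(w[0], w[1])
    loss.backward()
    opt.step()
```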

    Informative Features for Model Comparison

    Given two candidate models and a set of target observations, we address the problem of measuring the relative goodness of fit of the two models. We propose two new statistical tests which are nonparametric, computationally efficient (runtime complexity is linear in the sample size), and interpretable. As a unique advantage, our tests can produce a set of examples (informative features) indicating the regions in the data domain where one model fits significantly better than the other. In a real-world problem of comparing GAN models, the test power of our new test matches that of the state-of-the-art test of relative goodness of fit, while being one order of magnitude faster. Comment: Accepted to NIPS 2018.
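
    The test compares the two models' kernel mean embeddings to the data at a small set of test locations. The NumPy sketch below illustrates that general idea under simplifying assumptions (fixed Gaussian kernel, random locations); the paper's location optimization and null-distribution calibration are omitted.

```python
# Hedged sketch of a relative goodness-of-fit statistic: compare models
# P and Q against data R via mean embeddings at test locations V.
import numpy as np

def gauss_kernel(A, B, sigma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def ume2(X, Z, V, sigma=1.0):
    """Squared distance between mean embeddings of X and Z at V."""
    wx = gauss_kernel(X, V, sigma).mean(0)  # embedding of X at each v
    wz = gauss_kernel(Z, V, sigma).mean(0)  # embedding of Z at each v
    return ((wx - wz) ** 2).mean(), (wx - wz) ** 2

rng = np.random.default_rng(0)
R = rng.normal(0.0, 1.0, (500, 2))   # observed data
P = rng.normal(0.2, 1.0, (500, 2))   # samples from model P
Q = rng.normal(1.0, 1.0, (500, 2))   # samples from model Q
V = rng.normal(0.0, 2.0, (20, 2))    # test locations (random here)

up, wit_p = ume2(P, R, V)
uq, wit_q = ume2(Q, R, V)
print("relative statistic (positive favors P):", uq - up)
# "Informative feature": the location where Q fits worst relative to P.
print("most informative location:", V[np.argmax(wit_q - wit_p)])
```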

    TextureGAN: Controlling Deep Image Synthesis with Texture Patches

    In this paper, we investigate deep image synthesis guided by sketch, color, and texture. Previous image synthesis methods can be controlled by sketch and color strokes, but we are the first to examine texture control. We allow a user to place a texture patch on a sketch at arbitrary locations and scales to control the desired output texture. Our generative network learns to synthesize objects consistent with these texture suggestions. To achieve this, we develop a local texture loss, in addition to adversarial and content losses, to train the generative network. We conduct experiments using sketches generated from real images and textures sampled from a separate texture database; the results show that our proposed algorithm is able to generate plausible images that are faithful to user controls. Ablation studies show that our proposed pipeline can generate more realistic images than adapting existing methods directly. Comment: CVPR 2018 spotlight.
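
    A local texture loss of this kind can be illustrated with Gram matrices of deep features restricted to the patch region. The sketch below is a hedged approximation: random tensors stand in for the paper's feature extractor, and the masking and weighting are illustrative only.

```python
# Hedged sketch of a local texture loss: Gram-matrix statistics of
# features inside the user-placed patch region.
import torch
import torch.nn.functional as F

def gram(feat):
    b, c, h, w = feat.shape
    f = feat.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def local_texture_loss(feat_out, feat_tex, mask):
    # Restrict both feature maps to the patch region before comparing.
    m = F.interpolate(mask, size=feat_out.shape[-2:])
    return F.mse_loss(gram(feat_out * m), gram(feat_tex * m))

# Toy usage: random activations standing in for a deep feature network.
fo = torch.rand(1, 64, 32, 32)      # features of the generated image
ft = torch.rand(1, 64, 32, 32)      # features of the texture patch
mask = torch.zeros(1, 1, 128, 128)
mask[:, :, 40:80, 40:80] = 1.0      # where the user placed the patch
print(local_texture_loss(fo, ft, mask))
```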

    Kernel Mean Matching for Content Addressability of GANs

    We propose a novel procedure which adds "content-addressability" to any given unconditional implicit model, e.g., a generative adversarial network (GAN). The procedure allows users to control the generative process by specifying a set (of arbitrary size) of desired examples based on which similar samples are generated from the model. The proposed approach, based on kernel mean matching, is applicable to any generative model that transforms latent vectors into samples, and does not require retraining of the model. Experiments on various high-dimensional image generation problems (CelebA-HQ, LSUN bedroom, bridge, tower) show that our approach is able to generate images which are consistent with the input set, while retaining the image quality of the original model. To our knowledge, this is the first work that attempts to construct, at test time, a content-addressable generative model from a trained marginal model. Comment: Wittawat Jitkrittum and Patsorn Sangkloy contributed equally to this work.
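
    The key property is that only latent codes are optimized while the trained generator stays fixed. A minimal sketch under assumed simplifications (Gaussian-kernel MMD, a toy linear layer standing in for the generator):

```python
# Test-time kernel mean matching: optimize latent codes so generated
# samples match a user-provided example set under MMD; the "generator"
# is a frozen stand-in and is never retrained.
import torch

def mmd2(X, Y, sigma=1.0):
    k = lambda A, B: torch.exp(-torch.cdist(A, B) ** 2 / (2 * sigma ** 2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

G = torch.nn.Linear(32, 256)       # stand-in for G: latent -> sample
G.requires_grad_(False)            # the model itself stays frozen
target = torch.rand(8, 256)        # user's example set (arbitrary size)
z = torch.randn(16, 32, requires_grad=True)
opt = torch.optim.Adam([z], lr=0.05)

for step in range(300):
    opt.zero_grad()
    loss = mmd2(G(z), target)      # match kernel mean embeddings
    loss.backward()
    opt.step()
# G(z) now yields samples consistent with the user's example set.
```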

    Generating Images Instead of Retrieving Them : Relevance Feedback on Generative Adversarial Networks

    Finding images matching a user's intention has largely been based on matching a representation of the user's information needs with an existing collection of images: for example, using an example image or a written query to express the information need and retrieving images that share similarities with the query or example image. However, such an approach is limited to retrieving only images that already exist in the underlying collection. Here, we present a methodology for generating images matching the user intention instead of retrieving them. The methodology utilizes a relevance feedback loop between a user and generative adversarial neural networks (GANs). GANs can generate novel photorealistic images which are initially not present in the underlying collection, but are generated in response to user feedback. We report experiments (N=29) where participants generate images using four different domains and various search goals with textual and image targets. The results show that the generated images match the tasks and outperform images selected as baselines from a fixed image collection. Our results demonstrate that generating new information can be more useful for users than retrieving it from a collection of existing information. Peer reviewed.
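
    One simple way to picture such a feedback loop: sample latents around a search center, let the user mark which generated images are relevant, and re-center the search on those. The sketch below is a hypothetical illustration with a simulated user, not the study's actual interface or model.

```python
# Toy relevance-feedback loop over a GAN latent space: user feedback
# steers where the next batch of latents is sampled.
import numpy as np

rng = np.random.default_rng(0)
dim = 64
center, spread = np.zeros(dim), 1.0

def user_feedback(latents):
    # Placeholder for real user ratings of the decoded images; here we
    # pretend the user prefers latents with a large first coordinate.
    return latents[:, 0] > center[0]

for round_ in range(10):
    latents = center + spread * rng.normal(size=(16, dim))
    relevant = user_feedback(latents)
    if relevant.any():
        # Re-center on the latents marked relevant and narrow the search.
        center = latents[relevant].mean(0)
        spread *= 0.8
# `center` would now be decoded by the GAN into an image matching
# the user's intent, even if no such image exists in any collection.
```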

    Argoverse: 3D Tracking and Forecasting with Rich Maps

    We present Argoverse -- two datasets designed to support autonomous vehicle machine learning tasks such as 3D tracking and motion forecasting. Argoverse was collected by a fleet of autonomous vehicles in Pittsburgh and Miami. The Argoverse 3D Tracking dataset includes 360-degree images from 7 cameras with overlapping fields of view, 3D point clouds from long-range LiDAR, 6-DOF pose, and 3D track annotations. Notably, it is the only modern AV dataset that provides forward-facing stereo imagery. The Argoverse Motion Forecasting dataset includes more than 300,000 5-second tracked scenarios with a particular vehicle identified for trajectory forecasting. Argoverse is the first autonomous vehicle dataset to include "HD maps" with 290 km of mapped lanes with geometric and semantic metadata. All data is released under a Creative Commons license at www.argoverse.org. In our baseline experiments, we illustrate how detailed map information such as lane direction, driveable area, and ground height improves the accuracy of 3D object tracking and motion forecasting. Our tracking and forecasting experiments represent only an initial exploration of the use of rich maps in robotic perception. We hope that Argoverse will enable the research community to explore these problems in greater depth. Comment: CVPR 2019.
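
    As a toy illustration of why lane geometry helps forecasting, the sketch below blends a constant-velocity extrapolation with a mapped lane direction. It is not the Argoverse API or the paper's baselines; the blending weight and horizon are arbitrary.

```python
# Toy map-aware forecast: mix the agent's observed heading with the
# lane tangent from an HD map before extrapolating.
import numpy as np

def forecast(track, lane_dir, horizon=10, dt=0.1, w=0.5):
    """track: (T, 2) past xy positions; lane_dir: unit lane tangent."""
    v = (track[-1] - track[-2]) / dt            # last observed velocity
    speed = np.linalg.norm(v)
    v_hat = v / (speed + 1e-9)
    # Blend raw heading with the mapped lane direction.
    d = w * v_hat + (1 - w) * lane_dir
    d /= np.linalg.norm(d) + 1e-9
    steps = np.arange(1, horizon + 1)[:, None] * dt * speed
    return track[-1] + steps * d                # (horizon, 2) waypoints

past = np.array([[0.0, 0.0], [1.0, 0.1]])
lane = np.array([1.0, 0.0])
print(forecast(past, lane))
```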

    Controllable Content Based Image Synthesis and Image Retrieval

    In this thesis, we address the problem of returning target images that match user queries in image retrieval and image synthesis. We investigate line drawing sketches as the main query, and explore several additional signals from the users that can help clarify the type of images they are looking for. These additional queries may be expressed in one of two convenient forms: 1. visual content (sketch, scribble, texture patch); 2. language content. For image retrieval, we first look at the problem of sketch-based image retrieval. We construct cross-domain networks that embed a user query and a target image into a shared feature space. We collected the Sketchy Database, a large-scale dataset of matching sketch and image pairs that can be used as training data. The dataset has been made publicly available and has become one of the few standard benchmarks for sketch-based image retrieval. To incorporate both sketch and language content as queries, we propose a late-fusion dual-encoder approach, similar to CLIP, a recent successful work on vision and language representation learning. We also collected a dataset of 5,000 hand-drawn sketches, which can be combined with existing COCO caption annotations to evaluate the task of image retrieval with sketch and language. For image synthesis, we present a general framework that allows users to interactively control the generated images based on specifications of visual features (e.g., shape, color, texture).
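
    The late-fusion idea can be sketched as separate per-modality encoders whose normalized embeddings are combined before scoring a gallery by cosine similarity. The encoders below are toy stand-ins, not the thesis' trained networks, and the fusion rule (averaging) is one assumed choice.

```python
# Hedged sketch of late-fusion retrieval with sketch + language queries.
import torch
import torch.nn.functional as F

sketch_enc = torch.nn.Linear(784, 128)   # stand-in sketch encoder
text_enc = torch.nn.Linear(300, 128)     # stand-in caption encoder
image_enc = torch.nn.Linear(2048, 128)   # stand-in image encoder

sketch = torch.rand(1, 784)              # user's line drawing (features)
caption = torch.rand(1, 300)             # user's text query (features)
gallery = torch.rand(1000, 2048)         # candidate image features

# Late fusion: encode each modality separately, then combine the
# normalized embeddings into a single query vector.
q = F.normalize(sketch_enc(sketch), dim=-1) \
    + F.normalize(text_enc(caption), dim=-1)
q = F.normalize(q, dim=-1)
g = F.normalize(image_enc(gallery), dim=-1)
scores = g @ q.T                          # cosine similarity per image
print(scores.squeeze(1).topk(5).indices)  # indices of top-5 matches
```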

    PaintsTorch: a User-Guided Anime Line Art Colorization Tool with Double Generator Conditional Adversarial Network

    The lack of information provided by line arts makes user-guided colorization a challenging task for computer vision. Recent contributions from the deep learning community based on Generative Adversarial Networks (GANs) have shown incredible results compared to previous techniques. These methods employ user-input color hints as a way to condition the network. The current state of the art has shown the ability to generalize and to generate realistic and precise colorization by introducing a custom dataset and a new model with its training pipeline. Nevertheless, their approach relies on randomly sampled pixels as color hints for training. Thus, in this contribution, we introduce a stroke-simulation-based approach for hint generation, making the model more robust to messy inputs. We also propose a new, cleaner dataset, and explore the use of a double-generator GAN to improve visual fidelity.
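
    The stroke-simulation idea can be pictured as drawing short, thick polylines of colors sampled from the target image, instead of isolated random pixels. Below is a hypothetical Pillow sketch; the paper's actual stroke statistics and hint format may differ.

```python
# Toy stroke simulation for training hints: random short polylines
# colored by the ground-truth image.
import numpy as np
from PIL import Image, ImageDraw

def simulate_hints(color_img, n_strokes=8, seed=0):
    rng = np.random.default_rng(seed)
    w, h = color_img.size
    hints = Image.new("RGBA", (w, h), (0, 0, 0, 0))
    draw = ImageDraw.Draw(hints)
    for _ in range(n_strokes):
        x, y = int(rng.integers(0, w)), int(rng.integers(0, h))
        color = color_img.getpixel((x, y)) + (255,)
        pts = [(x, y)]
        # Random walk imitating a short, messy user stroke.
        for _ in range(int(rng.integers(2, 6))):
            x = int(np.clip(x + rng.integers(-15, 16), 0, w - 1))
            y = int(np.clip(y + rng.integers(-15, 16), 0, h - 1))
            pts.append((x, y))
        draw.line(pts, fill=color, width=int(rng.integers(2, 6)))
    return hints

target = Image.new("RGB", (128, 128), (200, 120, 80))
hints = simulate_hints(target)   # RGBA hint canvas fed to the generator
```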

    Computer Vision for Supporting Fashion Creative Processes

    Computer vision techniques are powerful tools to support and enhance creative workflows in the fashion industry. In many cases, designer sketches and drawings, made with pen or pencil on raw paper, are the starting point of a fashion workflow. Such hand-drawn sketches must then be imported into software to convert the prototype into a real-world product. This leads to a first important problem, namely the automatic vectorization of sketches. Moreover, the various outcomes of all creative processes consist of a large number of images, which depict a plethora of products, from clothing to footwear. Recognizing product characteristics and classifying them properly is crucial in order to avoid duplicates and to support marketing campaigns. Each feature may require a different method, ranging from segmentation and image retrieval to machine learning techniques such as deep learning. Some state-of-the-art techniques and a novel proposal for line extraction and thinning, applied to fashion sketches, are described. Newly developed methods are presented and their effectiveness in the recognition of features is discussed.
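
    A standard building block for this kind of line thinning is morphological skeletonization: binarize the drawing, then reduce each stroke to a one-pixel-wide centerline. The chapter's own extraction method goes beyond this, but a minimal scikit-image sketch conveys the step.

```python
# Minimal line-thinning sketch: skeletonize a binary pen stroke.
import numpy as np
from skimage.morphology import skeletonize

# Toy "pen stroke": a thick diagonal band in a binary image.
img = np.zeros((64, 64), dtype=bool)
for i in range(60):
    img[i, i:i + 5] = True

skeleton = skeletonize(img)   # one-pixel-wide centerlines, ready to
                              # be traced into vector paths
print(img.sum(), "->", skeleton.sum(), "foreground pixels")
```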