
    BUILDUP: interactive creation of urban scenes from large photo collections

    We propose a system for creating images of urban scenes composed of the large structures typical in such environments. Our system provides the user with a precomputed library of image-based 3D objects, such as roads, sidewalks, and buildings, obtained from a large collection of photographs. When the user picks the 3D location of a new object to insert, the system retrieves objects that have all the required properties (location, orientation, and lighting). The user interface then guides the user to add more objects, enabling non-experts to make a new composition in a fast and intuitive way. Unlike prior work, the entire image composition process is done in the 3D space of the scene, so inconsistent scale and perspective distortion do not arise, and occlusions are properly handled.
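The retrieval step described above could be sketched roughly as a filter over the object library by the three properties the abstract names. All class names, fields, and tolerances below are illustrative assumptions, not the system's actual implementation:

```python
from dataclasses import dataclass

@dataclass
class SceneObject:
    kind: str                        # e.g. "building", "road", "sidewalk"
    position: tuple                  # 3D location in scene coordinates
    orientation: float               # heading in degrees
    sun_azimuth: float               # lighting direction at capture time

def retrieve(library, query_pos, query_heading, query_sun,
             pos_tol=5.0, angle_tol=15.0):
    """Return library objects compatible with the requested insertion point.

    An object qualifies only if its location, orientation, and lighting all
    fall within tolerance of the query, mirroring the paper's requirement
    that retrieved objects match every required property at once.
    """
    def close(a, b, tol):
        return abs(a - b) <= tol

    def dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

    return [o for o in library
            if dist(o.position, query_pos) <= pos_tol
            and close(o.orientation, query_heading, angle_tol)
            and close(o.sun_azimuth, query_sun, angle_tol)]
```

In a real system the angular comparison would also need to wrap around 360 degrees and the library lookup would be indexed rather than a linear scan; this sketch only shows the conjunctive property test.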

    Internet visual media processing: A survey with graphics and vision applications

    In recent years, the computer graphics and computer vision communities have devoted significant attention to research based on Internet visual media resources. The huge numbers of images and videos continually being uploaded by millions of people have stimulated a variety of visual media creation and editing applications, while also posing serious challenges of retrieval, organization, and utilization. This article surveys recent research on processing large collections of images and video, including work on analysis, manipulation, and synthesis. It discusses the problems involved and suggests possible future directions in this emerging research area.

    Towards object-based image editing


    Accurate and discernible photocollages

    There currently exist several techniques for selecting and combining images from a digital image library into a single image so that the result meets certain prespecified visual criteria. Image mosaic methods, first explored by Connors and Trivedi [18], arrange library images according to some tiling arrangement, often a regular grid, so that the combination of images, when viewed as a whole, resembles some input target image. Other techniques, such as AutoCollage of Rother et al. [78], seek only to combine images in an interesting and visually pleasing manner, according to certain composition principles, without attempting to approximate any target image. Each of these techniques provides a myriad of creative options for artists who wish to combine several levels of meaning into a single image, or who wish to exploit the meaning and symbolism contained in each of a large set of images through an efficient and easy process. We first examine the most notable and successful of these methods, and summarize the advantages and limitations of each. We then formulate a set of goals for an image collage system that combines the advantages of these methods while addressing and mitigating the drawbacks. In particular, we propose a system for creating photocollages that approximate a target image as an aggregation of smaller images, chosen from a large library, so that interesting visual correspondences between images are exploited. In this way, we allow users to create collages in which multiple layers of meaning are encoded, with meaningful visual links between each layer. In service of this goal, we ensure that the images used are as large as possible and are combined in such a way that boundaries between images are not immediately apparent, as in AutoCollage. This has required us to apply a multiscale approach to searching and comparing images from a large database, which achieves both speed and accuracy.
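The core matching step of the grid-based mosaic methods mentioned above can be illustrated with a minimal sketch: for each cell of the target image, pick the library image whose mean color is closest. This is a toy single-scale version for illustration only; the thesis describes a multiscale search, and all function names here are hypothetical:

```python
def mean_color(img):
    """Average RGB of an image given as a list of rows of (r, g, b) tuples."""
    pixels = [p for row in img for p in row]
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))

def best_tile(target_cell, library):
    """Index of the library image whose mean color best matches the cell.

    Uses squared Euclidean distance in RGB; a real mosaic system would
    compare at multiple scales and penalize repeated tile use.
    """
    t = mean_color(target_cell)

    def dist(img):
        m = mean_color(img)
        return sum((a - b) ** 2 for a, b in zip(t, m))

    return min(range(len(library)), key=lambda i: dist(library[i]))
```

Repeating this choice over every grid cell yields the basic photomosaic; the accuracy/discernibility trade-off the abstract targets comes from how the matching metric and tile sizes are chosen.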
We also propose a new framework for color post-processing, as well as novel techniques for decomposing images according to object and texture information.

    Content based image synthesis

    A new method allowing for semantically guided image editing and synthesis is introduced. The editing process is made considerably easier and more powerful with our content-aware tool. We construct a database of image regions annotated with a carefully chosen vocabulary, and utilize recent advances in texture synthesis algorithms to generate new and unique image regions from this database of material. These new regions are then seamlessly composited into a user's existing photograph. The goal is to empower the end user with the ability to edit existing photographs and synthesize new ones on a high semantic level. Plausible results are generated using a small prototype database and showcase some of the editing possibilities that such a system affords.

    CONTROLLABLE CONTENT BASED IMAGE SYNTHESIS AND IMAGE RETRIEVAL

    In this thesis, we address the problem of returning target images that match user queries in image retrieval and image synthesis. We investigate line-drawing sketches as the main query, and explore several additional signals from the user that can help clarify the type of images they are looking for. These additional queries may be expressed in one of two convenient forms: 1. visual content (sketch, scribble, texture patch); 2. language content. For image retrieval, we first look at the problem of sketch-based image retrieval. We construct cross-domain networks that embed a user query and a target image into a shared feature space. We collected the Sketchy Database, a large-scale dataset of matching sketch and image pairs that can be used as training data. The dataset has been made publicly available and has become one of the few standard benchmarks for sketch-based image retrieval. To incorporate both sketch and language content as queries, we propose a late-fusion dual-encoder approach, similar to CLIP, a recent successful work on vision and language representation learning. We also collected a dataset of 5,000 hand-drawn sketches, which can be combined with existing COCO caption annotations to evaluate the task of image retrieval with sketch and language. For image synthesis, we present a general framework that allows users to interactively control the generated images based on specifications of visual features (e.g., shape, color, texture).
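The late-fusion idea described above, in which sketch and language are encoded separately and combined only afterwards, can be sketched at a high level as follows. The encoders themselves are omitted; the vectors, fusion weight, and function names are illustrative assumptions, not the thesis's actual architecture:

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot product equals cosine similarity."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def late_fuse(sketch_emb, text_emb, w=0.5):
    """Combine the two query embeddings AFTER each modality is encoded.

    This is the 'late fusion' step: a weighted sum of the sketch and text
    embeddings, renormalized, rather than a single joint encoder over both.
    """
    fused = [w * s + (1 - w) * t for s, t in zip(sketch_emb, text_emb)]
    return normalize(fused)

def rank_images(query_emb, image_embs):
    """Rank image indices by cosine similarity to the fused query."""
    def cos(a, b):
        return sum(x * y for x, y in zip(a, b))

    q = normalize(query_emb)
    return sorted(range(len(image_embs)),
                  key=lambda i: -cos(q, normalize(image_embs[i])))
```

With a sketch embedding pointing one way and a text embedding another, the fused query favors images that satisfy both cues, which is the behavior the dual-encoder retrieval setup is after.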