
    Stroke-based Neural Painting and Stylization with Dynamically Predicted Painting Region

    Stroke-based rendering aims to recreate an image with a set of strokes. Most existing methods render complex images using a uniform-block-dividing strategy, which leads to boundary inconsistency artifacts. To solve this problem, we propose Compositional Neural Painter, a novel stroke-based rendering framework that dynamically predicts the next painting region based on the current canvas, instead of dividing the image plane uniformly into painting regions. We start from an empty canvas and divide the painting process into several steps. At each step, a compositor network trained with a phasic RL strategy first predicts the next painting region, then a painter network trained with a WGAN discriminator predicts stroke parameters, and a stroke renderer paints the strokes onto the painting region of the current canvas. Moreover, we extend our method to stroke-based style transfer with a novel differentiable distance transform loss, which helps preserve the structure of the input image during stroke-based stylization. Extensive experiments show our model outperforms existing models in both stroke-based neural painting and stroke-based stylization. Code is available at https://github.com/sjtuplayer/Compositional_Neural_Painter. Comment: ACM MM 202
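
    A minimal sketch of the painting loop described above, assuming hypothetical compositor, painter, and render_strokes callables standing in for the paper's trained networks and stroke renderer (see the linked repository for the actual implementation):

        # Sketch of the region-then-strokes loop from the abstract; all callables
        # are stand-ins, not the authors' code.
        import numpy as np

        def paint(target, compositor, painter, render_strokes, steps=20):
            canvas = np.zeros_like(target)                    # start from an empty canvas
            for _ in range(steps):
                # 1. predict the next painting region from the current canvas
                x, y, w, h = compositor(canvas, target)
                region_canvas = canvas[y:y + h, x:x + w]
                region_target = target[y:y + h, x:x + w]
                # 2. predict stroke parameters for that region
                strokes = painter(region_canvas, region_target)
                # 3. render the strokes onto that region of the canvas
                canvas[y:y + h, x:x + w] = render_strokes(region_canvas, strokes)
            return canvas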

    Separable Image Warping with Spatial Lookup Tables

    Image warping refers to the 2-D resampling of a source image onto a target image. In the general case, this requires costly 2-D filtering operations. Simplifications are possible when the warp can be expressed as a cascade of orthogonal 1-D transformations. In these cases, separable transformations have been introduced to realize large performance gains. The central ideas in this area were formulated in the 2-pass algorithm by Catmull and Smith. Although that method applies to an important class of transformations, there are intrinsic problems which limit its usefulness. The goal of this work is to extend the 2-pass approach to handle arbitrary spatial mapping functions. We address the difficulties intrinsic to 2-pass scanline algorithms: bottlenecking, foldovers, and the lack of closed-form inverse solutions. These problems are shown to be resolved in a general, efficient, separable technique, with graceful degradation for transformations of increasing complexity.
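
    The two-pass idea can be illustrated with a small sketch, assuming the inverse warp is supplied as separate per-row and per-column mapping functions (hmap and vmap are illustrative names; the paper's spatial lookup tables generalise this):

        # Two-pass separable warp: a horizontal 1-D resampling pass over every row,
        # then a vertical 1-D pass over every column of the intermediate image.
        import numpy as np

        def warp_two_pass(src, hmap, vmap):
            h, w = src.shape
            xs, ys = np.arange(w), np.arange(h)

            tmp = np.empty(src.shape, dtype=float)
            for y in ys:                      # pass 1: resample each row along x
                tmp[y] = np.interp(hmap(xs, y), xs, src[y].astype(float))

            out = np.empty_like(tmp)
            for x in xs:                      # pass 2: resample each column along y
                out[:, x] = np.interp(vmap(ys, x), ys, tmp[:, x])
            return out

        # Example: a horizontal shear followed by a vertical scale, each purely 1-D.
        img = np.random.rand(64, 64)
        warped = warp_two_pass(img, lambda xs, y: xs - 0.3 * y, lambda ys, x: ys * 1.2)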

    AN EFFECTIVE CACHE FOR THE ANYWHERE PIXEL ROUTER

    Designing hardware to output pixels for light field displays or multi-projector systems is challenging owing to the memory bandwidth and speed requirements of the application. A new hardware technique that implements "anywhere pixel routing" was designed earlier at the University of Kentucky. This technique uses hardware to route pixels from input to output based upon a Lookup Table (LUT). The initial design suffered from high memory latency due to random accesses to the DDR SDRAM input buffer. This thesis presents a cache design that alleviates the memory latency issue by reducing the number of random SDRAM accesses. The cache is implemented in the block RAM of a field programmable gate array (FPGA). A number of simulations are conducted to find an efficient cache. It is found that the cache takes only a few kilobits, about 7% of the block RAM, and on average speeds up memory accesses by 20-30%.
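
    A toy software model of the idea, assuming a direct-mapped cache of input-image lines in front of the slow SDRAM buffer; the sizes and the replacement policy are illustrative, not the thesis's actual FPGA configuration:

        # LUT-routed output with a small line cache counting SDRAM row fetches.
        class LineCache:
            def __init__(self, num_lines=8):
                self.tags = [None] * num_lines    # which source row each slot holds
                self.hits = self.misses = 0

            def read(self, frame, src_y, src_x):
                slot = src_y % len(self.tags)     # direct-mapped on the source row index
                if self.tags[slot] == src_y:
                    self.hits += 1
                else:                             # miss: model a burst fetch of the whole row
                    self.misses += 1
                    self.tags[slot] = src_y
                return frame[src_y][src_x]

        def route(frame, lut, cache):
            # lut maps (out_y, out_x) -> (src_y, src_x); every output pixel is
            # routed from the input frame through the cache.
            return {out: cache.read(frame, *src) for out, src in lut.items()}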

    Toward General Purpose 3D User Interfaces: Extending Windowing Systems to Three Dimensions

    Recent growth in the commercial availability of consumer-grade 3D user interface devices like the Microsoft Kinect and the Oculus Rift, coupled with the broad availability of high-performance 3D graphics hardware, has put high-quality 3D user interfaces firmly within the reach of consumer markets for the first time ever. However, these devices require custom integration with every application that wishes to use them, seriously limiting application support, and there is no established mechanism for multiple applications to use the same 3D interface hardware simultaneously. This thesis proposes that these problems can be solved in the same way they were solved for 2D interfaces: by abstracting the input hardware behind input primitives provided by the windowing system and compositing the output of applications within the windowing system before displaying it. To demonstrate the feasibility of this approach, this thesis also presents a novel Wayland compositor which allows clients to create 3D interface contexts within a 3D interface space in the same way that traditional windowing systems allow applications to create 2D interface contexts (windows) within a 2D interface space (the desktop), as well as allowing unmodified 2D Wayland clients to window into the same 3D interface space and receive standard 2D input events. This implementation demonstrates the ability of consumer 3D interface hardware to support a 3D windowing system, the ability of this 3D windowing system to support applications with compelling 3D interfaces, the ability of this style of windowing system to be built on top of existing hardware-accelerated graphics and windowing infrastructure, and the ability of such a windowing system to support unmodified 2D interface applications windowing into the same 3D windowing space as the 3D interface applications. This means that application developers could create compelling 3D interfaces with no knowledge of the hardware that supports them, that new hardware could be introduced without needing to integrate it with individual applications, and that users could mix whatever 2D and 3D applications they wish in an immersive 3D interface space regardless of the details of the underlying hardware.
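
    One way to picture how an unmodified 2D client fits into such a space is to treat its window as a quad in 3D and translate pointer-ray hits back into the 2D surface coordinates the client expects; the sketch below shows only that geometric step, with illustrative names rather than the compositor's actual Wayland interfaces:

        # Map a 3D pointer ray onto a planar 2D window quad (origin corner plus two
        # edge vectors, assumed perpendicular) and return the (x, y) pixel event a
        # 2D client would receive.
        import numpy as np

        def window_hit(ray_origin, ray_dir, win_origin, win_u, win_v, width_px, height_px):
            normal = np.cross(win_u, win_v)
            denom = np.dot(normal, ray_dir)
            if abs(denom) < 1e-9:
                return None                       # ray is parallel to the window plane
            t = np.dot(normal, win_origin - ray_origin) / denom
            if t < 0:
                return None                       # window is behind the pointer
            local = ray_origin + t * ray_dir - win_origin
            u = np.dot(local, win_u) / np.dot(win_u, win_u)   # 0..1 across the window
            v = np.dot(local, win_v) / np.dot(win_v, win_v)
            if not (0.0 <= u <= 1.0 and 0.0 <= v <= 1.0):
                return None                       # the ray misses the window
            return u * width_px, v * height_px    # ordinary 2D pointer coordinates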

    Bi & tri dimensional scene description and composition in the MPEG-4 standard

    MPEG-4 is a new ISO/IEC standard being developed by MPEG (the Moving Picture Experts Group). The standard is to be released in November 1998, and version 1 will be an International Standard in January 1999. The MPEG-4 standard addresses the new demands that arise in a world in which more and more audio-visual material is exchanged in digital form. MPEG-4 addresses the coding of objects of various types: not only traditional video and audio frames, but also natural video and audio objects as well as textures, text, 2- and 3-dimensional graphic primitives, and synthetic music and sound effects. When using MPEG-4 to reconstruct an audio-visual scene at a terminal, it is hence no longer sufficient to encode the raw audio-visual data and transmit it, as MPEG-2 does in order to synchronize video and audio. In MPEG-4, all objects are multiplexed together at the encoder and transported to the terminal. Once de-multiplexed, these objects are composed at the terminal to construct and present to the end user a meaningful audio-visual scene. The placement of these elementary audio-visual objects in space and time is described in the scene description of a scene, while the action of putting these objects together in the same representation space is the composition of audio-visual objects. My research was concerned with the scene description and composition of the audio-visual objects that are defined in an audio-visual scene. Scene descriptions are coded independently from streams related to primitive audio-visual objects. The set of parameters belonging to the scene description is differentiated from the parameters that are used to improve the coding efficiency of an object. While the independent coding of different objects may achieve a higher compression rate, it also brings the ability to manipulate content at the terminal. This allows the modification of the scene description parameters without having to decode the primitive audio-visual objects themselves. This approach allows the development of a syntax that describes the spatio-temporal relationships of audio-visual scene objects. The behaviours of objects and their response to user inputs can thus also be represented in the scene description, allowing richer audio-visual content to be delivered as an MPEG-4 stream.
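
    The separation between the scene description and the primitive object streams can be pictured with a tiny data-structure sketch: placement parameters live in scene nodes and can be edited without touching the coded objects. The field names are illustrative, not the standard's BIFS syntax:

        # Illustrative scene-description sketch: objects reference separately coded
        # streams, while their spatio-temporal placement is held in scene nodes.
        from dataclasses import dataclass, field
        from typing import List, Tuple

        @dataclass
        class AVObject:
            stream_id: int        # elementary stream carrying the coded object
            kind: str             # e.g. "video", "audio", "text", "mesh"

        @dataclass
        class SceneNode:
            obj: AVObject
            position: Tuple[float, float] = (0.0, 0.0)   # spatial placement
            start_time: float = 0.0                      # temporal placement
            children: List["SceneNode"] = field(default_factory=list)

        def move(node: SceneNode, dx: float, dy: float) -> None:
            # A composition-level edit: only scene-description parameters change;
            # the primitive object stream is never re-decoded or re-encoded.
            x, y = node.position
            node.position = (x + dx, y + dy)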

    Design and analysis of a two-dimensional camera array

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2005. Includes bibliographical references (p. 153-158). I present the design and analysis of a two-dimensional camera array for virtual studio applications. It is possible to substitute conventional cameras and motion control devices with a real-time, light field camera array. I discuss a variety of camera architectures and describe a prototype system based on the "finite-viewpoints" design that allows multiple viewers to navigate virtual cameras in a dynamically changing light field captured in real time. The light field camera consists of 64 commodity video cameras connected to off-the-shelf computers. I employ a distributed rendering algorithm that overcomes the data bandwidth problems inherent in capturing light fields by selectively transmitting only those portions of the video streams that contribute to the desired virtual view. I also quantify the capabilities of a virtual camera rendered from a camera array in terms of the range of motion, range of rotation, and effective resolution. I compare these results to other configurations. From this analysis I provide a method for camera array designers to select and configure cameras to meet desired specifications. I demonstrate the system and the conclusions of the analysis with a number of examples that exploit dynamic light fields. By Jason Chieh-Sheng Yang, Ph.D.
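
    The bandwidth-saving idea, sending only the portions of each camera's stream that the virtual view needs, can be caricatured as a per-frame request plan; the rectangles and overlap test below are an illustrative simplification, not the thesis's finite-viewpoints renderer:

        # Caricature of bandwidth-aware distributed rendering: each camera node is
        # asked for only the crop of its image that overlaps the virtual view.
        # Rectangles are (x0, y0, x1, y1) in a shared reference plane.
        def overlap(a, b):
            x0, y0 = max(a[0], b[0]), max(a[1], b[1])
            x1, y1 = min(a[2], b[2]), min(a[3], b[3])
            return (x0, y0, x1, y1) if x0 < x1 and y0 < y1 else None

        def request_plan(virtual_view, camera_footprints):
            # camera_footprints maps camera_id -> rectangle covered by that camera.
            plan = {}
            for cam, rect in camera_footprints.items():
                region = overlap(virtual_view, rect)
                if region is not None:
                    plan[cam] = region            # only these pixels cross the network
            return plan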

    Implementation of a sort-last volume rendering using 3D textures

    This thesis extends the Aura graphics library (GPL-licensed, developed at the Vrije Universiteit Amsterdam) by adding the components needed to perform distributed volume rendering following the sort-last paradigm and using 3D textures. The program was tested on a cluster of 9 PCs.
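
    In a sort-last design each node renders its own brick of the volume and the partial images are then blended by depth; a minimal back-to-front composite with the over operator might look like the sketch below (illustrative only, not the Aura extension itself):

        # Minimal sort-last composite: each node contributes (depth, rgba) with
        # premultiplied alpha; partial images are blended back to front.
        import numpy as np

        def composite_sort_last(partials):
            partials = sorted(partials, key=lambda p: p[0], reverse=True)  # farthest first
            out = np.zeros_like(partials[0][1])
            for _, rgba in partials:
                alpha = rgba[..., 3:4]
                out = rgba + out * (1.0 - alpha)   # "over": nearer brick onto the result
            return out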

    Computer Generation of Integral Images using Interpolative Shading Techniques

    Research to produce artificial 3D images that duplicate human stereovision has been ongoing for hundreds of years. What has taken millions of years to evolve in humans is proving elusive even for present-day technological advancements. The difficulties are compounded when real-time generation is contemplated. The problem is one of depth. When perceiving the world around us it has been shown that the sense of depth is the result of many different factors. These can be described as monocular and binocular. Monocular depth cues include overlapping or occlusion, shading and shadows, texture, etc. Another monocular cue is accommodation (and binocular to some extent), where the focal length of the crystalline lens is adjusted to view an image. The important binocular cues are convergence and parallax. Convergence allows the observer to judge distance by the difference in angle between the viewing axes of the left and right eyes when both are focussing on a point. Parallax relates to the fact that each eye sees a slightly shifted view of the image. If a system can be produced that requires the observer to use all of these cues, as when viewing the real world, then the transition to and from viewing a 3D display will be seamless. However, for many 3D imaging techniques, towards which current work is primarily directed, this is not the case, and this raises a serious issue of viewer comfort. Researchers worldwide, in university and industry, are pursuing their approaches to the development of 3D systems, and physiological disturbances that can cause nausea in some observers will not be acceptable. The ideal 3D system would require, as a minimum, accurate depth reproduction, multiviewer capability, and all-round seamless viewing. The ability to view without stereoscopic or polarising glasses would be ideal, and lack of viewer fatigue is essential. Finally, whatever the use of the system, be it CAD, medical, scientific visualisation, remote inspection etc. on the one hand, or consumer markets such as 3D video games and 3DTV on the other, the system has to be relatively inexpensive. Integral photography is a ‘real camera’ system that attempts to comply with this ideal; it was invented in 1908 but due to technological limitations was not capable of being a useful autostereoscopic system. However, more recently, along with advances in technology, it is becoming a more attractive proposition for those interested in developing a suitable system for 3DTV. The fast computer generation of integral images is the subject of this thesis, the adjective ‘fast’ being used to distinguish it from the much slower technique of ray tracing integral images. These two techniques are the standard in monoscopic computer graphics, whereby ray tracing generates photo-realistic images and the fast forward geometric approach using interpolative shading techniques is the method used for real-time generation. Before this present work began it was not known whether it was possible to create volumetric integral images using a similar fast approach to that employed by standard computer graphics, but it soon became apparent that it would be successful and hence a valuable contribution in this area. Presented herein is a full description of the development of two derived methods for producing rendered integral image animations using interpolative shading.
The main body of the work is the development of code to put these methods into practice, along with many observations and discoveries that the author made during this task. The work was supported by the Defence Evaluation and Research Agency (DERA), a contract (LAIRD) under the European Link/EPSRC photonics initiative, and DTI/EPSRC sponsorship within the PROMETHEUS project.
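
    The interpolative shading the thesis builds on is the familiar scanline technique of interpolating intensities between span endpoints rather than tracing a ray per pixel; a single Gouraud-shaded span might look like the sketch below (illustrative of the shading style only, not the thesis's integral-imaging pipeline):

        # Gouraud-style interpolative shading of one scanline span: intensity is
        # linearly interpolated between the left and right span endpoints.
        def shade_span(x_left, i_left, x_right, i_right):
            span = max(x_right - x_left, 1)
            return [i_left + (i_right - i_left) * (x - x_left) / span
                    for x in range(x_left, x_right + 1)]

        # Example: a span from x=10 (intensity 0.2) to x=14 (intensity 1.0).
        print(shade_span(10, 0.2, 14, 1.0))   # approximately [0.2, 0.4, 0.6, 0.8, 1.0]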

    A CONTROL MECHANISM TO THE ANYWHERE PIXEL ROUTER

    Traditionally, large format displays have been achieved using software. A new technique using hardware-based anywhere pixel routing is explored in this thesis. Information stored in a Look Up Table (LUT) in the hardware can be used to tile two image streams to produce a seamless image display. This thesis develops a 1-input-image, 1-output-image system that implements arbitrary image warping on the image, based on a LUT stored in memory. The developed system control mechanism is first validated using simulation results. It is next validated via implementation on a Field Programmable Gate Array (FPGA) based hardware prototype and appropriate experimental testing. It was validated by changing the contents of the LUT and observing that the resulting changes to the pixel mapping were always correct.
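
    The 1-input, 1-output warp amounts to an inverse-mapping table consulted once per output pixel, so changing the warp means changing only the table contents, which is what the validation above exploits. A small software model with illustrative sizes:

        # Software model of the LUT warp: the datapath never changes, only the table.
        import numpy as np

        def warp_with_lut(src, lut):
            # lut has shape (H_out, W_out, 2): the (y, x) input pixel routed to each output pixel.
            return src[lut[..., 0], lut[..., 1]]

        h, w = 4, 6
        src = np.arange(h * w).reshape(h, w)
        yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
        identity = np.stack([yy, xx], axis=-1)          # pass-through table
        flipped = np.stack([yy, w - 1 - xx], axis=-1)   # same datapath, new table: horizontal flip
        assert (warp_with_lut(src, identity) == src).all()
        assert (warp_with_lut(src, flipped) == src[:, ::-1]).all()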