DiffSketcher: Text Guided Vector Sketch Synthesis through Latent Diffusion Models
We discover that, even though they are trained mainly on images, pretrained diffusion
models show impressive power in guiding sketch synthesis. In this paper, we
present DiffSketcher, an innovative algorithm that creates vectorized free-hand
sketches using natural language input. DiffSketcher is developed based on a
pre-trained text-to-image diffusion model. It performs the task by directly
optimizing a set of Bezier curves with an extended version of the score
distillation sampling (SDS) loss, which allows us to use a raster-level
diffusion model as a prior for optimizing a parametric vectorized sketch
generator. Furthermore, we explore attention maps embedded in the diffusion
model for effective stroke initialization to speed up the generation process.
The generated sketches demonstrate multiple levels of abstraction while
maintaining recognizability, underlying structure, and essential visual details
of the subject drawn. Our experiments show that DiffSketcher achieves greater
quality than prior work.

Comment: 14 pages, 8 figures. Update: improved experiment analysis, fixed typos, and fixed an image error.
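The core idea, stripped of the diffusion prior, is that a parametric stroke can be fitted by gradient descent on its control points. The toy sketch below is an illustration, not the paper's method: a fixed geometric target and finite-difference gradients stand in for the differentiable rasterizer and the SDS loss, and all function names are assumptions.

```python
import numpy as np

def bezier(ctrl, t):
    """Evaluate a cubic Bezier curve at parameters t. ctrl: (4, 2) control points."""
    t = np.asarray(t, float)[:, None]
    return ((1 - t) ** 3 * ctrl[0]
            + 3 * (1 - t) ** 2 * t * ctrl[1]
            + 3 * (1 - t) * t ** 2 * ctrl[2]
            + t ** 3 * ctrl[3])

def fit_stroke(target, steps=2000, lr=2.0, eps=1e-4):
    """Fit one cubic Bezier stroke to target curve samples by gradient descent.

    Finite differences stand in for the rasterizer + SDS gradient of the paper.
    """
    ts = np.linspace(0.0, 1.0, len(target))
    # Initialise: endpoints on the target, inner control points collapsed onto them.
    ctrl = np.array([target[0], target[0], target[-1], target[-1]], float)
    loss = lambda c: np.mean((bezier(c, ts) - target) ** 2)
    for _ in range(steps):
        grad = np.zeros_like(ctrl)
        base = loss(ctrl)
        for i in range(4):        # forward differences over the 8 parameters
            for j in range(2):
                bumped = ctrl.copy()
                bumped[i, j] += eps
                grad[i, j] = (loss(bumped) - base) / eps
        ctrl -= lr * grad
    return ctrl
```

In the paper this per-stroke optimisation is driven by the score of a pretrained text-to-image diffusion model rather than a geometric target, which is what lets a raster-level prior shape a vector output.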
Adaptive image vectorisation and brushing using mesh colours
We propose the use of curved triangles and mesh colours as a vector primitive for image vectorisation. We show that our representation has clear benefits for rendering performance, texture detail, and further editing of the resulting vector images. The proposed method focuses on efficiency, but it still produces results that compare favourably with those of previous work. We show results over a variety of input images, ranging from photos, drawings, and paintings to designs and cartoons. We implemented several editing workflows facilitated by our representation: interactive user-guided vectorisation and novel raster-style, feature-aware brushing capabilities.
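As a rough intuition for how colour varies inside such a primitive, the sketch below interpolates per-vertex colours barycentrically over a flat triangle; mesh colours generalise this by storing a whole grid of colour samples per (curved) triangle. The function names are illustrative, not from the paper.

```python
import numpy as np

def barycentric(p, a, b, c):
    """Barycentric coordinates of point p with respect to triangle (a, b, c)."""
    v0, v1, v2 = b - a, c - a, p - a
    d00, d01, d11 = v0 @ v0, v0 @ v1, v1 @ v1
    d20, d21 = v2 @ v0, v2 @ v1
    denom = d00 * d11 - d01 * d01
    v = (d11 * d20 - d01 * d21) / denom
    w = (d00 * d21 - d01 * d20) / denom
    return np.array([1.0 - v - w, v, w])

def sample_colour(p, verts, colours):
    """Interpolate per-vertex colours at p. Mesh colours refine this scheme by
    placing a regular lattice of colour samples inside each triangle."""
    a, b, c = (np.asarray(v, float) for v in verts)
    weights = barycentric(np.asarray(p, float), a, b, c)
    return weights @ np.asarray(colours, float)
```

At a vertex the result is exactly that vertex's colour, and at the centroid it is the average of the three, which is the behaviour a denser mesh-colour lattice refines.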
Stroke-based Neural Painting and Stylization with Dynamically Predicted Painting Region
Stroke-based rendering aims to recreate an image with a set of strokes. Most
existing methods render complex images using a uniform-block-dividing
strategy, which leads to boundary inconsistency artifacts. To solve the
problem, we propose Compositional Neural Painter, a novel stroke-based
rendering framework which dynamically predicts the next painting region based
on the current canvas, instead of dividing the image plane uniformly into
painting regions. We start from an empty canvas and divide the painting process
into several steps. At each step, a compositor network trained with a phasic RL
strategy first predicts the next painting region, then a painter network
trained with a WGAN discriminator predicts stroke parameters, and a stroke
renderer paints the strokes onto the painting region of the current canvas.
Moreover, we extend our method to stroke-based style transfer with a novel
differentiable distance transform loss, which helps preserve the structure of
the input image during stroke-based stylization. Extensive experiments show our
model outperforms the existing models in both stroke-based neural painting and
stroke-based stylization. Code is available at
https://github.com/sjtuplayer/Compositional_Neural_Painter

Comment: ACM MM 202
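To see what a distance-transform comparison measures, here is a brute-force, non-differentiable version: each edge map is turned into a field of distances to the nearest edge pixel, and the two fields are compared. The paper's loss is a differentiable relaxation suitable for training; this sketch, with assumed function names, only illustrates the quantity involved.

```python
import numpy as np

def distance_transform(mask):
    """Euclidean distance from every pixel to the nearest foreground pixel
    (brute force; fine for tiny grids, O(H*W*K) in the number of edge pixels)."""
    pts = np.argwhere(mask)
    h, w = mask.shape
    grid = np.argwhere(np.ones((h, w), bool))   # all pixel coordinates
    d = np.linalg.norm(grid[:, None, :] - pts[None, :, :], axis=-1).min(axis=1)
    return d.reshape(h, w)

def dt_loss(edges_a, edges_b):
    """Compare the global structure of two edge maps via their distance fields."""
    return np.abs(distance_transform(edges_a) - distance_transform(edges_b)).mean()
```

Because the distance field is global, a stroke that lands near, but not exactly on, an input edge still incurs a small penalty, which is what makes this family of losses useful for preserving structure during stylisation.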
The Meaning of Texture (La Significación de la Textura)
There is no image without imagination and no shape without form. This is how Georges Didi-Huberman opens his book When Images Touch the Real. Paraphrasing Didi-Huberman, and venturing into the tectonic field at the scale of detail, one might perhaps ask whether there is texture without intention and texture without meaning. The answer would be no, there is not. Images have meaning, and images figuratively represent something. Textures are, to a greater or lesser extent, full of intentions and endowed with meaning. As Marzal defines in his reflections on the visual communication of the image, "texture is a visual element that simultaneously has optical and tactile qualities. This last aspect is the most outstanding, since texture is a visual element that materially sensitises and characterises the surfaces of objects."

García Clariana, I. (2020). The Meaning of Texture. EN BLANCO. Revista de Arquitectura 12(28): 9-11. https://doi.org/10.4995/eb.2020.13509

References:
Barthes, Roland. La cámara lúcida. Barcelona: Paidós, 1989.
Benjamin, Walter. Sobre la fotografía. Valencia: Pre-textos, 2004.
Berger, John. Modos de ver. Barcelona: Gustavo Gili, 1974.
Didi-Huberman, Georges. Cuando las imágenes tocan lo real. Madrid: Área de Edición CBA, 2013.
Foster, Hal. El complejo arte-arquitectura. Madrid: Turner Publicaciones S.L., 2013.
Gubern, Román. La mirada opulenta. Exploración de la iconosfera contemporánea. Barcelona: Gustavo Gili, 1994.
Holl, Steven. Cuestiones de percepción, fenomenología de la arquitectura. Barcelona: Gustavo Gili, 2004.
Lynch, Kevin. The Image of the City. Harvard-MIT Joint Center for Urban Studies Series, 1960.
Marzal Felici, José Javier. Cómo se lee una fotografía. Interpretaciones de la mirada. Madrid: Ediciones Cátedra, 2008.
Zumthor, Peter. Atmósferas: entornos arquitectónicos, las cosas a mi alrededor. Barcelona: Gustavo Gili, 2006.
Arbitrary topology meshes in geometric design and vector graphics
Meshes are a powerful means of representing objects and shapes in both 2D and 3D, but many mesh-based techniques only work in certain regular settings, which restricts their usage. Meshes with arbitrary topology have many interesting applications in geometric design and (vector) graphics, and give designers more freedom in designing complex objects. In the first part of the thesis we look at how these meshes can be used in computer-aided design to represent objects that consist of multiple regular meshes constructed together. We then extend the B-spline surface technique from the regular setting to extraordinary regions in meshes, creating multisided B-spline patches. In addition, we show how to render multisided objects efficiently using the GPU and tessellation. In the second part of the thesis we look at how the gradient mesh vector graphics primitive can be combined with procedural noise functions to create expressive but sparsely defined vector graphics images. We also look at how the gradient mesh can be extended to arbitrary-topology variants; here we compare existing work with two new formulations of a polygonal gradient mesh. Finally, we show how to turn any image into a vector graphics image efficiently. This vectorisation process automatically extracts important image features and constructs a mesh around them. The automatic pipeline is very efficient and even facilitates interactive image vectorisation.
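A gradient mesh patch, at its simplest, spreads colours specified at sparse mesh handles across the patch interior. The sketch below shows plain bilinear interpolation of four corner colours over one patch in (u, v) space; actual gradient meshes also interpolate geometry and colour derivatives, so this is only a minimal illustration with assumed names.

```python
import numpy as np

def patch_colour(corners, u, v):
    """Bilinear colour interpolation over one gradient-mesh patch.

    corners: RGB colours at (u, v) = (0, 0), (1, 0), (0, 1), (1, 1).
    Real gradient meshes additionally carry tangent/derivative data per handle.
    """
    c00, c10, c01, c11 = (np.asarray(c, float) for c in corners)
    return ((1 - u) * (1 - v) * c00 + u * (1 - v) * c10
            + (1 - u) * v * c01 + u * v * c11)
```

Extending this primitive to arbitrary topology amounts to making the same kind of smooth interpolation work where patches no longer meet four at a corner.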
AutoGraff: towards a computational understanding of graffiti writing and related art forms.
The aim of this thesis is to develop a system that generates letters and pictures in a style immediately recognizable as graffiti art or calligraphy. The proposed system can be used similarly to, and in tight integration with, conventional computer-aided geometric design tools; it can generate synthetic graffiti content for urban environments in games and movies, and it can guide robotic or fabrication systems that materialise the system's output with physical drawing media. The thesis is divided into two main parts. The first part describes a set of stroke primitives: building blocks that can be combined to generate different designs resembling graffiti or calligraphy. These primitives mimic the process typically used to design graffiti letters and exploit well-known principles of motor control to model the way an artist moves when incrementally tracing stylised letter forms. The second part demonstrates how these stroke primitives can be automatically recovered from input geometry defined in vector form, such as the digitised traces of writing made by a user or the glyph outlines in a font. This procedure converts the input geometry into a seed that can be transformed into a variety of calligraphic and graffiti stylisations, depending on parametric variations of the strokes.
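One well-known motor-control principle that such stroke primitives can exploit is the minimum-jerk profile (Flash and Hogan), under which a smooth hand movement starts and stops with zero velocity and acceleration. The sketch below is a generic illustration of that profile, not code from the thesis.

```python
import numpy as np

def min_jerk(p0, p1, n=51):
    """Minimum-jerk point-to-point trajectory between p0 and p1.

    Position follows the classic blend 10s^3 - 15s^4 + 6s^5, whose first and
    second derivatives vanish at s = 0 and s = 1, so the simulated "pen"
    accelerates smoothly out of p0 and decelerates smoothly into p1.
    """
    s = np.linspace(0.0, 1.0, n)
    blend = 10 * s**3 - 15 * s**4 + 6 * s**5
    p0, p1 = np.asarray(p0, float), np.asarray(p1, float)
    return p0 + blend[:, None] * (p1 - p0)
```

Chaining such segments, with overlap and curvature constraints, is one simple way to imitate how an artist incrementally traces a stylised letter form.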
Pictures in Your Mind: Using Interactive Gesture-Controlled Reliefs to Explore Art
Tactile reliefs offer many benefits over the more classic raised line drawings or tactile diagrams, as depth, 3D shape, and surface textures are directly perceivable. Although often created for blind and visually impaired (BVI) people, a wider range of people may benefit from such multimodal material. However, some reliefs are still difficult to understand without proper guidance or accompanying verbal descriptions, hindering autonomous exploration.
In this work, we present a gesture-controlled interactive audio guide (IAG) based on recent low-cost depth cameras that can be operated directly with the hands on relief surfaces during tactile exploration. The interactively explorable, location-dependent verbal and captioned descriptions promise rapid tactile accessibility to 2.5D spatial information in a home or education setting, to online resources, or as a kiosk installation at public places.
We present a working prototype, discuss design decisions, and present the results of two evaluation studies: the first with 13 BVI test users, and a second follow-up study with 14 test users across a wide range of people with differences and difficulties associated with perception, memory, cognition, and communication. The participant-led research method of this latter study prompted significant new developments.
Toward a Perceptually-relevant Theory of Appearance
Two approaches are commonly employed in computer graphics to design and adjust the appearance of objects in a scene. A full 3D environment may be created through geometric, material, and lighting modeling, then rendered using a simulation of light transport; appearance is then controlled in ways similar to photography. A radically different approach consists in providing 2D digital drawing tools to an artist, who, with enough talent and time, will be able to create images of objects having the desired appearance; this is strongly similar to what traditional artists do, with the computer being merely a modern drawing tool.

In this document, I present research projects that have investigated a third approach, whereby pictorial elements of appearance are explicitly manipulated by an artist. On one hand, this alternative approach offers direct control over appearance, with novel applications in vector drawing, scientific illustration, special effects, and video games. On the other hand, it provides a modern method for putting our current knowledge of the perception of appearance to the test, as well as for suggesting new models of human vision along the way.
Software Takes Command
This book is available as open access through the Bloomsbury Open Access programme on www.bloomsburycollections.com. Software has replaced a diverse array of physical, mechanical, and electronic technologies used before the 21st century to create, store, distribute, and interact with cultural artifacts. It has become our interface to the world, to others, to our memory, and to our imagination: a universal language through which the world speaks, and a universal engine on which the world runs. What electricity and the combustion engine were to the early 20th century, software is to the early 21st century. Offering the first theoretical and historical account of software for media authoring and its effects on the practice and the very concept of 'media,' the author of The Language of New Media (2001) develops his own theory for this rapidly growing, always-changing field. What were the thinking and motivations of the people who, in the 1960s and 1970s, created the concepts and practical techniques that underlie contemporary media software such as Photoshop, Illustrator, Maya, Final Cut, and After Effects? How do their interfaces and tools shape the visual aesthetics of contemporary media and design? What happens to the idea of a 'medium' after previously media-specific tools have been simulated and extended in software? Is it still meaningful to talk about different mediums at all? Lev Manovich answers these questions and supports his theoretical arguments with detailed analysis of key media applications such as Photoshop and After Effects, popular web services such as Google Earth, and projects in motion graphics, interactive environments, graphic design, and architecture. Software Takes Command is a must for all practicing designers, media artists, and scholars concerned with contemporary media.