17 research outputs found

    Understanding and controlling contrast oscillations in stochastic texture algorithms using Spectrum of Variance

    We identify and analyze a major issue pertaining to all power-spectrum-based texture synthesis algorithms, from Fourier synthesis to procedural noise algorithms such as Perlin or Gabor noise: the oscillation of contrast (see Figures 1, 2, 3 and 7). One of our key contributions is to introduce a simple yet powerful signal descriptor, the Spectrum of Variance (not to be confused with the PSD), which, to our surprise, has never been leveraged before. In this new framework, several issues become easy to understand, measure and control, with new handles, as we illustrate. We finally show that fixing the oscillation of contrast opens many doors to more controllable authoring of stochastic textures. We explore some of the newly reachable possibilities, such as constrained noise content and bridges towards very different families of looks, including cellular patterns, point-like distributions and reaction-diffusion.
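
    The abstract does not give the exact definition of the Spectrum of Variance; the following minimal sketch assumes it to be the radially averaged power spectrum of the texture's local-variance field, which suffices to reveal contrast oscillation. All names and parameters are illustrative, not taken from the paper:

    import numpy as np
    from scipy.ndimage import uniform_filter

    def local_variance(img, win=9):
        # Per-pixel variance over a win x win neighbourhood.
        mean = uniform_filter(img, win)
        mean_sq = uniform_filter(img * img, win)
        return mean_sq - mean * mean

    def spectrum_of_variance(img, win=9):
        # Radially averaged power spectrum of the local-variance field;
        # a sharp peak reveals a periodic oscillation of contrast.
        var = local_variance(img, win)
        var = var - var.mean()                        # drop the DC component
        psd = np.abs(np.fft.fftshift(np.fft.fft2(var))) ** 2
        h, w = psd.shape
        yy, xx = np.indices(psd.shape)
        r = np.hypot(yy - h / 2, xx - w / 2).astype(int)
        counts = np.bincount(r.ravel())
        return np.bincount(r.ravel(), psd.ravel()) / np.maximum(counts, 1)

    # Band-limited Fourier noise: a typical case of oscillating contrast.
    rng = np.random.default_rng(0)
    f = np.fft.fftfreq(256)
    rho = np.hypot(*np.meshgrid(f, f))
    spec = np.exp(-((rho - 0.1) / 0.02) ** 2) * np.exp(2j * np.pi * rng.random((256, 256)))
    noise = np.real(np.fft.ifft2(spec))
    print(spectrum_of_variance(noise)[:20])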

    A Natural Image Pointillism with Controlled Ellipse Dots

    This paper presents an image-based artistic rendering algorithm for an automatic Pointillism style. First, ellipse dot locations are randomly generated based on a source image; then dot orientations are precalculated with the help of a direction map, and a saliency map of the source image determines the long and short radii of each ellipse dot. Finally, rendering proceeds layer by layer, from large dots to small dots, so as to preserve the detailed parts of the image. Although only an elliptical dot shape is adopted, the final Pointillism style performs well because of the variable characteristics of the dots.
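
    A minimal sketch of the pipeline described above, not the authors' implementation: the direction map is assumed to be the local gradient orientation, and the saliency map is approximated by gradient magnitude, with salient regions receiving thinner dots. Function and parameter names are illustrative:

    import numpy as np
    from PIL import Image, ImageDraw
    from scipy.ndimage import gaussian_filter, sobel

    def render_pointillism(src_path, out_path, layers=(14, 8, 4),
                           dots_per_layer=4000, seed=1):
        img = np.asarray(Image.open(src_path).convert("RGB"), dtype=float)
        gray = gaussian_filter(img.mean(axis=2), 2)
        gx, gy = sobel(gray, axis=1), sobel(gray, axis=0)
        orient = np.degrees(np.arctan2(gy, gx)) + 90.0   # strokes follow edges
        saliency = np.hypot(gx, gy)
        saliency /= saliency.max() + 1e-8                # gradient magnitude as saliency
        h, w = gray.shape
        canvas = Image.new("RGB", (w, h), "white")
        rng = np.random.default_rng(seed)
        for long_r in layers:                            # large dots first, small last
            for _ in range(dots_per_layer):
                x, y = int(rng.integers(0, w)), int(rng.integers(0, h))
                # Salient (detailed) regions get shorter short radii.
                short_r = max(1, int(long_r * (1.0 - 0.7 * saliency[y, x])))
                color = tuple(int(c) for c in img[y, x])
                dot = Image.new("RGBA", (2 * long_r, 2 * long_r), (0, 0, 0, 0))
                ImageDraw.Draw(dot).ellipse(
                    [0, long_r - short_r, 2 * long_r, long_r + short_r],
                    fill=color + (255,))
                dot = dot.rotate(orient[y, x])           # orient along the direction map
                canvas.paste(dot, (x - long_r, y - long_r), dot)
        canvas.save(out_path)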

    Emerging images

    Figure 1: This image, when stared at for a while, can reveal four instances of a familiar figure. Two of the figures are easier to detect than the others. Locally there is little meaningful information; we perceive the figures only when observing them as a whole. Emergence refers to the unique human ability to aggregate information from seemingly meaningless pieces and to perceive a whole that is meaningful. This special skill of humans can constitute an effective scheme to tell humans and machines apart. This paper presents a synthesis technique to generate images of 3D objects that are detectable by humans but difficult for an automatic algorithm to recognize. The technique allows generating an infinite number of images with emerging figures. Our algorithm is designed so that, locally, the synthesized images divulge little useful information or cues to assist any segmentation or recognition procedure. Therefore, as we demonstrate, computer vision algorithms are incapable of effectively processing such images. However, when a human observer is presented with an emergence image synthesized from an object she is familiar with, the figure emerges when observed as a whole. We can control the difficulty of perceiving the emergence effect through a limited set of parameters. A procedure that synthesizes emergence images can be an effective tool for exploring and understanding the factors affecting computer vision techniques.
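
    The abstract does not spell out the synthesis algorithm; the following toy sketch only illustrates the emergence principle itself (figure-following splats whose local appearance matches the background clutter), not the authors' 3D rendering pipeline. The object_mask input and all parameters are hypothetical:

    import numpy as np

    def emergence_image(object_mask, patch=6, keep=0.5, seed=0):
        # object_mask: hypothetical binary (h, w) array, 1 inside the figure.
        rng = np.random.default_rng(seed)
        h, w = object_mask.shape
        out = np.zeros((h, w), dtype=np.uint8)
        for y in range(0, h - patch, patch):
            for x in range(0, w - patch, patch):
                inside = object_mask[y:y + patch, x:x + patch].mean() > 0.5
                # Each splat is random, so any single window is near-meaningless;
                # background clutter keeps local statistics similar on both sides.
                p = keep if inside else 0.6 * keep
                if rng.random() < p:
                    blob = rng.random((patch, patch)) < rng.uniform(0.3, 0.7)
                    out[y:y + patch, x:x + patch] = 255 * blob
        return out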

    Integrated multi-scale architecture of the cortex with application to computer vision

    Doctoral thesis, Electronic and Computer Engineering, Faculdade de Ciência e Tecnologia, Universidade do Algarve, 2007. The main goal of this thesis is to understand the functioning of the visual cortex through the development of computational models. In the input layer V1 of the visual cortex there are simple, complex and end-stopped cells. These provide a multi-scale representation of objects and scenes in terms of lines, edges and keypoints. In this thesis we combine recent progress concerning the development of computational models of these and other cells with processes in higher cortical areas V2, V4, etc. Three pertinent challenges are discussed: (i) object recognition embedded in a cortical architecture; (ii) brightness perception; and (iii) painterly rendering based on human vision. Specific aspects are Focus-of-Attention by means of keypoint-based saliency maps, the dynamic routing of features from V1 through higher cortical areas in order to obtain translation, rotation and size invariance, and the construction of normalized object templates with canonical views in visual memory. Our simulations show that the multi-scale representations can be integrated into a cortical architecture in order to model subsequent processing steps: from segregation, via different categorization levels, to final object recognition. As in real cortical processing, the system starts with coarse-scale information, refines categorization by using medium-scale information, and employs all scales in recognition. We also show that a 2D brightness model can be based on the multi-scale symbolic representation of lines and edges, with an additional low-pass channel and nonlinear amplitude transfer functions, such that object recognition and brightness perception are combined processes based on the same information. The brightness model can predict many different effects, such as Mach bands, grating induction, the Craik-O'Brien-Cornsweet illusion and brightness induction, i.e. the opposite effects of assimilation (White effect) and simultaneous brightness contrast. Finally, a novel application is introduced: painterly rendering has previously been linked to computer vision, but we propose to link it to human vision, because perception and painting are two processes which are strongly interwoven.
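
    As a minimal sketch of the V1 front end described above: simple cells are modelled here with Gabor filters and complex cells with the energy of the quadrature pair, while keypoint saliency is approximated by summing complex-cell energy over orientations and scales, a simplification of the thesis's end-stopped-cell model:

    import numpy as np
    from scipy.signal import fftconvolve
    from skimage.filters import gabor_kernel

    def complex_cell_energy(img, frequency, theta):
        # Quadrature-pair energy: phase-invariant, like a complex cell.
        k = gabor_kernel(frequency, theta=theta)
        real = fftconvolve(img, np.real(k), mode="same")
        imag = fftconvolve(img, np.imag(k), mode="same")
        return real ** 2 + imag ** 2

    def saliency_map(img, frequencies=(0.05, 0.1, 0.2), n_orient=8):
        # Sum energy over orientations and scales as a keypoint-saliency proxy.
        sal = np.zeros_like(img, dtype=float)
        for f in frequencies:                 # coarse-to-fine scales
            for i in range(n_orient):
                sal += complex_cell_energy(img, f, theta=np.pi * i / n_orient)
        return sal / sal.max()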

    Hierarchical Image Descriptions for Classification and Painting

    The overall argument this thesis makes is that topological object structures captured within hierarchical image descriptions are invariant to depictive styles and offer a level of abstraction found in many modern abstract artworks. To show how object structures can be extracted from images, two hierarchical image descriptions are proposed. The first of these is inspired by perceptual organisation, whereas the second is based on agglomerative clustering of image primitives. This thesis discusses the benefits and drawbacks of each image description and shows empirically why the second is more suitable for capturing object structures. The value of graph theory is demonstrated in extracting object structures, especially from the second type of image description. User interaction during the structure-extraction process is also made possible via an image hierarchy editor. Two applications of object structures are studied in depth. On the computer vision side, the problem of object classification is investigated. In particular, this thesis shows that it is possible to classify objects regardless of their depictive styles. This classification problem is approached using a graph-theoretic paradigm: by encoding object structures as feature vectors of fixed length, object classification can be treated as a clustering problem in structural feature space, and the actual clustering can be done using conventional machine learning techniques. The benefits of object structures in computer graphics are demonstrated from a Non-Photorealistic Rendering (NPR) point of view. In particular, it is shown that topological object structures deliver the degree of abstraction that often appears in well-known abstract artworks. Moreover, the value of shape simplification is demonstrated in the process of making abstract art. By integrating object structures and simple geometric shapes, it is shown that artworks in the manner of child-like paintings and of artists such as Wassily Kandinsky, Joan Miró and Henri Matisse can be synthesised, and by doing so the current gamut of NPR styles is extended. The whole process of making abstract art is built into a single piece of software with an intuitive GUI.
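
    A minimal sketch of the second description type, assuming SLIC superpixels as the image primitives and Ward agglomeration over their mean colours; the fixed-length structural feature read off the hierarchy (cluster-size entropy at several cuts) is illustrative, not the thesis's graph encoding:

    import numpy as np
    from skimage.segmentation import slic
    from scipy.cluster.hierarchy import linkage, fcluster

    def hierarchy_features(img, n_segments=200, levels=(2, 4, 8, 16)):
        # img: RGB float array in [0, 1]. Primitives are SLIC superpixels.
        labels = slic(img, n_segments=n_segments, compactness=10.0)
        means = np.array([img[labels == l].mean(axis=0) for l in np.unique(labels)])
        tree = linkage(means, method="ward")          # agglomerative hierarchy
        feats = []
        for k in levels:
            c = fcluster(tree, t=k, criterion="maxclust")
            p = np.bincount(c)[1:] / len(c)
            feats.append(-(p[p > 0] * np.log(p[p > 0])).sum())
        return np.array(feats)                        # feed to a standard classifier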

    Text–to–Video: Image Semantics and NLP

    When aiming at automatically translating an arbitrary text into a visual story, the main challenge consists in finding a semantically close visual representation whereby the displayed meaning remains the same as in the given text. Moreover, the appearance of an image itself largely influences how its meaningful information is conveyed to an observer. This thesis demonstrates that investigating both image semantics and the semantic relatedness between visual and textual sources enables us to tackle the challenging semantic gap and to find a semantically close translation from natural language to a corresponding visual representation. In recent years, social networking has attracted great interest, leading to an enormous and still increasing amount of data available online. Photo-sharing sites like Flickr allow users to associate textual information with their uploaded imagery. This thesis exploits this huge knowledge source of user-generated data, which provides initial links between images, words and other meaningful data. In order to approach visual semantics, this work presents various methods to analyze the visual structure as well as the appearance of images in terms of meaningful similarities, aesthetic appeal, and emotional effect on an observer. In detail, our GPU-based approach efficiently finds visual similarities between images in large datasets across visual domains and identifies various meanings of ambiguous words by exploring similarity in online search results. Further, we investigate the highly subjective aesthetic appeal of images and make use of deep learning to directly learn aesthetic rankings from a broad diversity of user reactions in online social behavior. To gain even deeper insights into the influence of visual appearance on an observer, we explore how simple image processing can actually change emotional perception, and we derive a simple but effective image filter. To identify meaningful connections between written text and visual representations, we employ methods from Natural Language Processing (NLP). Extensive textual processing allows us to create semantically relevant illustrations for simple text elements as well as complete storylines. More precisely, we present an approach that resolves dependencies in textual descriptions to arrange 3D models correctly. Further, we develop a method that finds semantically relevant illustrations for texts of different types based on a novel hierarchical querying algorithm. Finally, we present an optimization-based framework that is capable of generating picture stories in different styles that are not only semantically relevant but also visually coherent.
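
    As a minimal sketch of the dependency-resolution step for arranging 3D models, here using spaCy as the parser (an assumption; the abstract does not name an NLP toolkit), extracting (figure, relation, ground) triples that could drive placement constraints:

    import spacy

    nlp = spacy.load("en_core_web_sm")        # small English model

    def spatial_triples(text):
        # Extract triples such as ("lamp", "on", "table") from prepositions.
        triples = []
        for tok in nlp(text):
            if tok.dep_ == "prep":                        # e.g. "on", "under"
                head = tok.head
                if head.pos_ == "VERB":                   # "the lamp stands on ..."
                    subjects = [c for c in head.children if c.dep_ == "nsubj"]
                    head = subjects[0] if subjects else head
                for obj in tok.children:
                    if obj.dep_ == "pobj":
                        triples.append((head.lemma_, tok.text, obj.lemma_))
        return triples

    print(spatial_triples("The lamp stands on the table."))   # [('lamp', 'on', 'table')]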

    Artistic Content Representation and Modelling based on Visual Style Features

    This thesis aims to understand visual style in the context of computer science, using traditionally intangible artistic properties to enhance existing content-manipulation algorithms and to develop new content-creation methods. The developed algorithms can be used to apply extracted properties to other drawings automatically; transfer a selected style; categorise images based upon perceived style; build 3D models using style features from concept artwork; and perform other style-based actions that change our perception of an object without changing our ability to recognise it. The research in this thesis aims to provide the style-manipulation abilities that are missing from modern digital art creation pipelines.
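
    The thesis develops its own visual-style features; as one concrete, widely known way to encode style numerically for categorisation or transfer, a Gram-matrix descriptor over CNN feature maps (an assumption, not the thesis's method) could look like this:

    import torch
    from torchvision import models, transforms

    vgg = models.vgg16(weights="IMAGENET1K_V1").features.eval()
    prep = transforms.Compose([
        transforms.Resize(256), transforms.CenterCrop(224), transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
    ])

    @torch.no_grad()
    def style_vector(pil_img, layers=(3, 8, 15, 22)):     # relu1_2 .. relu4_3
        x = prep(pil_img).unsqueeze(0)
        feats = []
        for i, layer in enumerate(vgg):
            x = layer(x)
            if i in layers:
                b, c, h, w = x.shape
                f = x.reshape(c, h * w)
                feats.append((f @ f.T).flatten() / (c * h * w))  # channel correlations
            if i == max(layers):
                break
        return torch.cat(feats)              # compare styles with cosine distance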