117 research outputs found

    Modelling Visual Objects Regardless of Depictive Style

    How sketches work: a cognitive theory for improved system design

    Evidence is presented that in the early stages of design or composition the mental processes used by artists for visual invention require a different type of support from those used for visualising a nearly complete object. Most research into machine visualisation has as its goal the production of realistic images which simulate the light pattern presented to the retina by real objects. In contrast sketch attributes preserve the results of cognitive processing which can be used interactively to amplify visual thought. The traditional attributes of sketches include many types of indeterminacy which may reflect the artist's need to be "vague". Drawing on contemporary theories of visual cognition and neuroscience this study discusses in detail the evidence for the following functions which are better served by rough sketches than by the very realistic imagery favoured in machine visualising systems. 1. Sketches are intermediate representational types which facilitate the mental translation between descriptive and depictive modes of representing visual thought. 2. Sketch attributes exploit automatic processes of perceptual retrieval and object recognition to improve the availability of tacit knowledge for visual invention. 3. Sketches are percept-image hybrids. The incomplete physical attributes of sketches elicit and stabilise a stream of super-imposed mental images which amplify inventive thought. 4. By segregating and isolating meaningful components of visual experience, sketches may assist the user to attend selectively to a limited part of a visual task, freeing otherwise over-loaded cognitive resources for visual thought. 5. Sequences of sketches and sketching acts support the short term episodic memory for cognitive actions. This assists creativity, providing voluntary control over highly practised mental processes which can otherwise become stereotyped. An attempt is made to unite the five hypothetical functions. 
Drawing on the Baddeley and Hitch model of working memory, it is speculated that the five functions may be related to a limited capacity monitoring mechanism which makes tacit visual knowledge explicitly available for conscious control and manipulation. It is suggested that the resources available to the human brain for imagining nonexistent objects are a cultural adaptation of visual mechanisms which evolved in early hominids for responding to confusing or incomplete stimuli from immediately present objects and events. Sketches are cultural inventions which artificially mimic aspects of such stimuli in order to capture these shared resources for the different purpose of imagining objects which do not yet exist. Finally, the implications of the theory for the design of improved machine systems are discussed. The untidy attributes of traditional sketches are revealed to include cultural inventions which serve subtle cognitive functions. However, traditional media have many shortcomings which it should be possible to correct with new technology. Existing machine systems for sketching tend to imitate nonselectively the media-bound properties of sketches without regard to the functions they serve. This may prove to be a mistake. It is concluded that new system designs are needed in which meaningfully structured data and specialised imagery amplify, without interference or replacement, the impressive but limited creative resources of the visual brain.

    Learning graphs to model visual objects across different depictive styles

    Visual object classification and detection are major problems in contemporary computer vision. State-of-the-art algorithms allow thousands of visual objects to be learned and recognized under a wide range of variations, including lighting changes, occlusion, point of view and different object instances. Only a small fraction of the literature addresses the problem of variation in depictive styles (photographs, drawings, paintings etc.). This is a challenging gap, but the ability to process images of all depictive styles and not just photographs has potential value across many applications. In this paper we model visual classes using a graph with multiple labels on each node; weights on arcs and nodes indicate relative importance (salience) to the object description. Visual class models can be learned from examples from a database that contains photographs, drawings, paintings etc. Experiments show that our representation is able to improve upon Deformable Part Models for detection and Bag of Words models for classification.

    Hierarchical Image Descriptions for Classification and Painting

    The overall argument this thesis makes is that topological object structures captured within hierarchical image descriptions are invariant to depictive styles and offer a level of abstraction found in many modern abstract artworks. To show how object structures can be extracted from images, two hierarchical image descriptions are proposed. The first of these is inspired by perceptual organisation, whereas the second is based on agglomerative clustering of image primitives. This thesis argues the benefits and drawbacks of each image description and empirically shows why the second is more suitable for capturing object structures. The value of graph theory is demonstrated in extracting object structures, especially from the second type of image description. User interaction during the structure extraction process is also made possible via an image hierarchy editor. Two applications of object structures are studied in depth. On the computer vision side, the problem of object classification is investigated. In particular, this thesis shows that it is possible to classify objects regardless of their depictive styles. This classification problem is approached using a graph theoretic paradigm: by encoding object structures as feature vectors of fixed lengths, object classification can be treated as a clustering problem in structural feature space, and the actual clustering can be done using conventional machine learning techniques. The benefits of object structures in computer graphics are demonstrated from a Non-Photorealistic Rendering (NPR) point of view. In particular, it is shown that topological object structures deliver an appropriate degree of abstraction that often appears in well-known abstract artworks. Moreover, the value of shape simplification is demonstrated in the process of making abstract art. 
By integrating object structures and simple geometric shapes, it is shown that artworks resembling child-like paintings and works by artists such as Wassily Kandinsky, Joan Miro and Henri Matisse can be synthesised, and by doing so the current gamut of NPR styles is extended. The whole process of making abstract art is built into a single piece of software with an intuitive GUI. EThOS - Electronic Theses Online Service, GB, United Kingdo

    Cross-depiction problem: Recognition and Synthesis of Photographs and Artwork

    Cross-depiction is the recognition, and synthesis, of objects whether they are photographed, painted, drawn, etc. It is a significant yet under-researched problem. Emulating the remarkable human ability to recognise and depict objects in an astonishingly wide variety of depictive forms is likely to advance both the foundations and the applications of computer vision. In this paper we motivate the cross-depiction problem, explain why it is difficult, and discuss some current approaches. Our main conclusions are (i) appearance-based recognition systems tend to be over-fitted to one depiction, (ii) models that explicitly encode spatial relations between parts are more robust, and (iii) recognition and non-photorealistic synthesis are related tasks. Peter Hall, Hongping Cai, Qi Wu and Tadeo Corrad

    Context-aware gestural interaction in the smart environments of the ubiquitous computing era

    A thesis submitted to the University of Bedfordshire in partial fulfilment of the requirements for the degree of Doctor of Philosophy. Technology is becoming pervasive and the current interfaces are not adequate for interaction with the smart environments of the ubiquitous computing era. Recently, researchers have started to address this issue by introducing the concept of the natural user interface, which is mainly based on gestural interactions. Many issues are still open in this emerging domain and, in particular, there is a lack of common guidelines for the coherent implementation of gestural interfaces. This research investigates gestural interactions between humans and smart environments. It proposes a novel framework for the high-level organization of context information. The framework is conceived to support a novel approach that uses functional gestures to reduce gesture ambiguity and the number of gestures in taxonomies, and to improve usability. In order to validate this framework, a proof-of-concept has been developed. A prototype has been developed by implementing a novel method for the view-invariant recognition of deictic and dynamic gestures. Tests have been conducted to assess the gesture recognition accuracy and the usability of the interfaces developed following the proposed framework. The results show that the method provides optimal gesture recognition from very different view-points, whilst the usability tests have yielded high scores. Further investigation of the context information has been performed, tackling the problem of user status. User status is understood here as human activity, and a technique based on an innovative application of electromyography is proposed. The tests show that the proposed technique has achieved good activity recognition accuracy. The context is also treated as system status. In ubiquitous computing, the system can adopt different paradigms: wearable, environmental and pervasive. 
A novel paradigm, called the synergistic paradigm, is presented, combining the advantages of the wearable and environmental paradigms. Moreover, it augments the interaction possibilities of the user and ensures better gesture recognition accuracy than the other paradigms.