
    Visual saliency guided textured model simplification

    Mesh geometry can be used to model both object shape and details. If texture maps are involved, it is common to let the mesh geometry mainly model object shape and let the texture maps model most of the object details, optimising the data size and complexity of an object. To support efficient object rendering and transmission, model simplification can be applied to reduce the modelling data. However, existing methods do not adequately consider how object features are jointly represented by mesh geometry and texture maps, and therefore have difficulty identifying and preserving important features in simplified objects. To address this, we propose a visual saliency detection method for simplifying textured 3D models. We produce good simplification results by jointly processing the mesh geometry and texture map to produce a unified saliency map that identifies visually important object features. Results show that our method offers better object rendering quality than existing methods.
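
    As a rough illustration of the idea of a unified saliency map, the sketch below blends a geometric cue with a texture cue into a single per-vertex score. The specific cues, the linear blend, and the `alpha` weight are assumptions made for illustration, not the method proposed in the work.

```python
import numpy as np

def unified_vertex_saliency(mean_curvature, texture_gradient, alpha=0.5):
    """Blend a geometric cue and a texture cue into one per-vertex saliency map.

    mean_curvature   : (N,) per-vertex absolute mean curvature (geometric detail cue)
    texture_gradient : (N,) per-vertex texture-gradient magnitude sampled from the
                       texture map via the UV coordinates (texture detail cue)
    alpha            : blending weight between the two cues (hypothetical parameter)
    """
    def normalise(x):
        x = np.asarray(x, dtype=float)
        rng = x.max() - x.min()
        return (x - x.min()) / rng if rng > 0 else np.zeros_like(x)

    g = normalise(mean_curvature)
    t = normalise(texture_gradient)
    return alpha * g + (1.0 - alpha) * t  # high values = features to preserve

# During simplification (e.g., edge collapse), the collapse cost of an edge could be
# weighted by the saliency of its endpoints so salient regions are simplified last.
```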

    AutoGraff: towards a computational understanding of graffiti writing and related art forms.

    The aim of this thesis is to develop a system that generates letters and pictures with a style that is immediately recognisable as graffiti art or calligraphy. The proposed system can be used similarly to, and in tight integration with, conventional computer-aided geometric design tools; it can generate synthetic graffiti content for urban environments in games and films, and can guide robotic or fabrication systems that materialise its output with physical drawing media. The thesis is divided into two main parts. The first part describes a set of stroke primitives: building blocks that can be combined to generate different designs that resemble graffiti or calligraphy. These primitives mimic the process typically used to design graffiti letters and exploit well-known principles of motor control to model the way in which an artist moves when incrementally tracing stylised letter forms. The second part demonstrates how these stroke primitives can be automatically recovered from input geometry defined in vector form, such as the digitised traces of writing made by a user or the glyph outlines in a font. This procedure converts the input geometry into a seed that can be transformed into a variety of calligraphic and graffiti stylisations, which depend on parametric variations of the strokes.
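
    As one example of a well-known motor-control principle that such stroke primitives might build on, the sketch below samples a minimum-jerk trajectory between two points. The function name and the use of a straight minimum-jerk segment are illustrative assumptions, not the thesis' actual primitive definition.

```python
import numpy as np

def minimum_jerk_stroke(p0, p1, n=100):
    """Sample a straight minimum-jerk trajectory from p0 to p1.

    The minimum-jerk profile is a classic motor-control model of smooth human
    movement; a stroke primitive in the spirit of the thesis might chain several
    such segments (possibly curved) to trace a letter form.
    """
    p0, p1 = np.asarray(p0, float), np.asarray(p1, float)
    t = np.linspace(0.0, 1.0, n)
    s = 10 * t**3 - 15 * t**4 + 6 * t**5   # minimum-jerk time scaling
    return p0[None, :] + s[:, None] * (p1 - p0)[None, :]

# Example: points = minimum_jerk_stroke([0, 0], [1, 2])
```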

    Part decomposition of 3D surfaces

    This dissertation describes a general algorithm that automatically decomposes real-world scenes and objects into visual parts. The input to the algorithm is a 3D triangle mesh that approximates the surfaces of a scene or object. This geometric mesh completely specifies the shape of interest. The output of the algorithm is a set of boundary contours that dissect the mesh into parts that agree with human perception. In this algorithm, shape alone defines the location of a boundary contour for a part. The algorithm leverages a human vision theory known as the minima rule, which states that human visual perception tends to decompose shapes into parts along lines of negative curvature minima. Specifically, the minima rule governs the location of part boundaries, and as a result the algorithm is known as the Minima Rule Algorithm. Previous computer vision methods have attempted to implement this rule but have used pseudo measures of surface curvature; thus, these prior methods are not true implementations of the rule. The Minima Rule Algorithm is a three-step process that consists of curvature estimation, mesh segmentation, and quality evaluation. These steps have led to three novel algorithms known as Normal Vector Voting, Fast Marching Watersheds, and Part Saliency Metric, respectively. For each algorithm, this dissertation presents both the supporting theory and experimental results. The results demonstrate the effectiveness of the algorithm using both synthetic and real data and include comparisons with previous methods from the research literature. Finally, the dissertation concludes with a summary of the contributions to the state of the art.
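
    A minimal sketch of the minima rule's core test, assuming a per-vertex minimum principal curvature and one-ring connectivity are already available (the curvature estimation itself, e.g. Normal Vector Voting, is not shown, and the thresholding is a simplification):

```python
import numpy as np

def minima_rule_candidates(kappa_min, neighbors, tau=0.0):
    """Flag vertices that are candidate part-boundary points under the minima rule.

    kappa_min : (N,) minimum principal curvature per vertex
    neighbors : list of N lists with the one-ring neighbour indices of each vertex
    tau       : negativity threshold (hypothetical parameter)

    A vertex is a candidate if its curvature is negative (concave) and is a local
    minimum with respect to its one-ring neighbourhood.
    """
    kappa_min = np.asarray(kappa_min, float)
    flags = np.zeros(len(kappa_min), dtype=bool)
    for v, nbrs in enumerate(neighbors):
        if kappa_min[v] < tau and all(kappa_min[v] <= kappa_min[u] for u in nbrs):
            flags[v] = True
    return flags
```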

    The image torque operator for mid-level vision: theory and experiment

    A problem central to visual scene understanding and computer vision is to extract semantically meaningful parts of images. A visual scene consists of objects, and the objects and parts of objects are delineated from their surroundings by closed contours. In this thesis a new bottom-up visual operator, called the Torque operator, which captures the concept of closed contours, is introduced. Its computation is inspired by the mechanical definition of torque, or moment of force, and is applied to image edges. It takes edges as input and computes, over regions of different sizes, a measure of how well the edges are aligned to form a closed, convex contour. The torque operator is by definition scale independent and can be seen as an operator of mid-level vision that captures the organizational concept of 'closure' and a grouping mechanism for edges. In this thesis, fundamental properties of the torque measure are studied, and experiments are performed to demonstrate and verify that it can be made a useful tool for a variety of applications, including visual attention, segmentation, and boundary edge detection.
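
    The following sketch shows one plausible way to compute a torque-style closure response for a single square patch, given edge pixels and their tangent directions; the exact normalisation and sign conventions of the thesis' operator may differ.

```python
import numpy as np

def torque_response(edge_points, tangents, center, patch_radius):
    """Torque-style closure measure for one square patch (illustrative sketch only).

    edge_points  : (M, 2) coordinates of edge pixels
    tangents     : (M, 2) unit tangent direction of the edge at each pixel
    center       : (2,) patch centre
    patch_radius : half the side length of the square patch

    Each edge pixel inside the patch contributes the 2D cross product of its
    displacement from the centre with its tangent (a moment-of-force analogue).
    Edges forming a closed contour around the centre contribute with a consistent
    sign, so the area-normalised sum is large in magnitude near closed regions.
    """
    edge_points = np.asarray(edge_points, float)
    tangents = np.asarray(tangents, float)
    center = np.asarray(center, float)

    r = edge_points - center
    inside = np.max(np.abs(r), axis=1) <= patch_radius   # pixels within the patch
    cross = r[inside, 0] * tangents[inside, 1] - r[inside, 1] * tangents[inside, 0]
    area = (2.0 * patch_radius) ** 2
    return cross.sum() / (2.0 * area)
```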

    Apparent ridges for line drawing

    Thesis (S.M.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2007. Includes bibliographical references (p. 69-72). Non-photorealistic line drawing depicts 3D shapes through the rendering of feature lines. A number of characterizations of relevant lines have been proposed, but none of these definitions alone seems to capture all visually-relevant lines. We introduce a new definition of feature lines based on two perceptual observations. First, human perception is sensitive to the variation of shading, and since shape perception is little affected by changes in lighting and reflectance, we should focus on normal variation. Second, view-dependent lines convey the shape of smooth surfaces better than view-independent lines. From this we define view-dependent curvature as the variation of the surface normal with respect to a viewing screen plane, and apparent ridges as the loci of maxima of the view-dependent curvature. We derive the equation for apparent ridges and present a new algorithm to render line drawings of 3D meshes. We show that our apparent ridges encompass or enhance aspects of several other feature lines. By Tilke Judd, S.M.
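
    As a rough proxy for the view-dependent curvature described above (the thesis defines it via the Jacobian of the surface normal with respect to the viewing screen plane), the sketch below measures how fast screen-projected normals change per unit screen-space distance between neighbouring vertices; apparent ridges would then be traced along local maxima of such a quantity. Both the simplification and the parameter choices are assumptions of this sketch.

```python
import numpy as np

def view_dependent_normal_variation(positions, normals, neighbors, view_dir):
    """Rough per-vertex proxy for view-dependent curvature (illustrative only).

    positions : (N, 3) vertex positions
    normals   : (N, 3) unit vertex normals
    neighbors : list of N lists of one-ring neighbour indices
    view_dir  : (3,) unit viewing direction
    """
    view_dir = np.asarray(view_dir, float)
    # Build an orthonormal screen basis (u, v) perpendicular to the view direction.
    up = np.array([0.0, 0.0, 1.0])
    if abs(view_dir @ up) > 0.9:
        up = np.array([1.0, 0.0, 0.0])
    u = np.cross(view_dir, up)
    u /= np.linalg.norm(u)
    v = np.cross(view_dir, u)

    def to_screen(x):
        return np.stack([x @ u, x @ v], axis=-1)   # project onto the screen plane

    sp, sn = to_screen(np.asarray(positions, float)), to_screen(np.asarray(normals, float))
    q = np.zeros(len(sp))
    for i, nbrs in enumerate(neighbors):
        rates = [np.linalg.norm(sn[j] - sn[i]) / (np.linalg.norm(sp[j] - sp[i]) + 1e-9)
                 for j in nbrs]
        q[i] = max(rates) if rates else 0.0
    return q
```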

    Trademark image retrieval by local features

    The challenge of abstract trademark image retrieval as a test of machine vision algorithms has attracted considerable research interest in the past decade. Current operational trademark retrieval systems involve manual annotation of the images (the current ‘gold standard’). Accordingly, current systems require a substantial amount of time and labour to access, and are therefore expensive to operate. This thesis focuses on the development of algorithms that mimic aspects of human visual perception in order to retrieve similar abstract trademark images automatically. A significant category of trademark images is highly stylised, comprising a collection of distinctive graphical elements that often include geometric shapes. Therefore, in order to compare the similarity of such images, the principal aim of this research has been to develop a method for solving the partial matching and shape perception problem. There are few useful techniques for partial shape matching in the context of trademark retrieval, because existing techniques tend not to support multi-component retrieval. When this work was initiated, most trademark image retrieval systems represented images by means of global features, which are not suited to solving the partial matching problem. Instead, the author has investigated the use of local image features as a means of finding similarities between trademark images that only partially match in terms of their subcomponents. During the course of this work, it was established that the Harris and Chabat detectors could potentially perform sufficiently well to serve as the basis for local feature extraction in trademark image retrieval. Early findings in this investigation indicated that the well-established SIFT (Scale Invariant Feature Transform) local features, based on the Harris detector, could potentially serve as an adequate underlying local representation for matching trademark images. Few researchers have used mechanisms based on human perception for trademark image retrieval, implying that the shape representations utilised in the past to solve this problem do not necessarily reflect the shapes contained in these images as characterised by human perception. In response, a practical approach to trademark image retrieval by perceptual grouping has been developed, based on defining meta-features that are calculated from the spatial configurations of SIFT local image features. This new technique measures certain visual properties of the appearance of images containing multiple graphical elements and supports perceptual grouping by exploiting the non-accidental properties of their configuration. Our validation experiments indicated that we were indeed able to capture and quantify the differences in the global arrangement of sub-components evident when comparing stylised images in terms of their visual appearance properties. Such visual appearance properties, measured using 17 of the proposed meta-features, include relative sub-component proximity, similarity, rotation and symmetry. Similar work on meta-features, based on the above Gestalt proximity, similarity, and simplicity groupings of local features, had not been reported in the computer vision literature at the time of undertaking this work. We adopted relevance feedback to allow the visual appearance properties of relevant and non-relevant images returned in response to a query to be determined by example.
    Since limited training data is available when constructing a relevance classifier by means of user-supplied relevance feedback, the intrinsically non-parametric machine learning algorithm ID3 (Iterative Dichotomiser 3) was selected to construct decision trees by means of dynamic rule induction. We believe the above approach to capturing high-level visual concepts, encoded by means of meta-features specified by example through relevance feedback and decision tree classification to support flexible trademark image retrieval, to be wholly novel. The retrieval performance of the above system was compared with two other state-of-the-art trademark image retrieval systems: Artisan, developed by Eakins (Eakins et al., 1998), and a system developed by Jiang (Jiang et al., 2006). Using relevance feedback, our system achieves higher average normalised precision than either of the systems developed by Eakins or Jiang. However, while our trademark image query and database set is based on an image dataset used by Eakins, we employed different numbers of images. It was not possible to access the same query set and image database used in the evaluation of Jiang's trademark image retrieval system. Despite these differences in evaluation methodology, our approach would appear to have the potential to improve retrieval effectiveness.
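
    To illustrate the flavour of a spatial-configuration meta-feature over local features, the sketch below computes one hypothetical relative-proximity measure between two groups of keypoint positions; the thesis' actual 17 meta-features and their definitions are not reproduced here.

```python
import numpy as np

def proximity_meta_feature(keypoints_a, keypoints_b):
    """One hypothetical 'relative proximity' meta-feature over local-feature positions.

    keypoints_a, keypoints_b : (Na, 2) and (Nb, 2) arrays of keypoint coordinates for
    two groups of local features (e.g., SIFT keypoints of two sub-components).
    Returns the centroid distance normalised by the groups' average spread.
    """
    a, b = np.asarray(keypoints_a, float), np.asarray(keypoints_b, float)
    ca, cb = a.mean(axis=0), b.mean(axis=0)
    spread = 0.5 * (np.linalg.norm(a - ca, axis=1).mean() +
                    np.linalg.norm(b - cb, axis=1).mean())
    return np.linalg.norm(ca - cb) / (spread + 1e-9)

# A vector of such meta-features per image pair could then feed an ID3-style decision
# tree trained from relevance-feedback examples.
```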

    Metric representations for shape analysis and synthesis

    2D and 3D geometric shapes are ubiquitous in computer graphics, computer animation, and computer-aided design and manufacturing. Two of the fundamental research challenges that underlie these applications are the analysis and synthesis of shapes, with the former aiming to extract semantically meaningful knowledge of shapes and the latter focusing on generating plausible-looking shapes based on user inputs. Traditionally, shape analysis and synthesis are based on representations such as meshes, parameterisations, and Laplacians, which lead to mostly hand-crafted computation rules that are either suboptimal or treat related tasks separately. In this work, we propose to represent a 2D/3D shape as a square symmetric matrix that correlates every pair of geometric points on the shape, which allows us to formulate shape analysis and synthesis problems as principled optimisation problems that can be globally optimised. To demonstrate the usefulness of our new metric representation for shape analysis, we first address 3D mesh saliency detection by representing a shape as a pairwise feature distance matrix, whose principal eigenvector is experimentally shown to outperform the traditional saliency detection rules for capturing ground-truth saliency annotations. Following this work, we then unify saliency detection and nonrigid shape matching via a jointly learned metric representation, which is shown to improve the accuracy of both tasks on existing saliency detection and shape matching benchmarks. To also demonstrate the usefulness of our metric representation for shape synthesis, we address 2D facial shape beautification in images by representing a facial shape as the orthogonal projection matrix onto 2D facial landmarks, which is shown to improve the attractiveness of both frontal-neutral and non-frontal-non-neutral faces in user studies. Finally, we show that adversarially learning the distributions of human shapes and poses in a hidden space produces higher-quality human samples than in the geometry space. Together, these results show that our metric representation benefits both the analysis and synthesis of shapes, with the potential to unify more diverse tasks such as part segmentation and labelling in future work.
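
    A minimal sketch of the first analysis step as described above: build a pairwise feature-distance matrix and read its principal eigenvector as a per-point saliency score. The descriptor choice and the final normalisation are assumptions of this sketch.

```python
import numpy as np

def saliency_from_distance_matrix(features):
    """Per-point saliency from the principal eigenvector of a pairwise distance matrix.

    features : (N, D) per-point (e.g., per-vertex) descriptor vectors.
    The shape is represented by the symmetric matrix of pairwise feature distances,
    and the eigenvector of its largest eigenvalue is read off as a saliency score.
    """
    X = np.asarray(features, float)
    # Pairwise Euclidean feature distances: an N x N symmetric, non-negative matrix.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    eigvals, eigvecs = np.linalg.eigh(D)      # eigenvalues in ascending order
    v = np.abs(eigvecs[:, -1])                # principal (Perron) eigenvector
    return (v - v.min()) / (v.max() - v.min() + 1e-12)
```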

    Biolocomotion Detection in Videos

    Animals locomote for various reasons: to search for food, to find suitable habitat, to pursue prey, to escape from predators, or to seek a mate. The grand scale of biodiversity gives rise to a great diversity of locomotory designs and modes. In this dissertation, the locomotion of biological species in general is referred to as biolocomotion. The goal of this dissertation is to develop a computational approach to detect biolocomotion in any unprocessed video. The ways biological entities locomote through an environment are extremely diverse: various creatures make use of legs, wings, fins, and other means to move through the world. Significantly, the motion exhibited by the body parts used to navigate through an environment can be modelled as a combination of an overall positional advance with an overlaid asymmetric oscillatory pattern, a distinctive signature that tends to be absent in non-biological objects in locomotion. In this dissertation, this key trait of positional advance with asymmetric oscillation, along with differences between an object's common motion (extrinsic motion) and the localized motion of its parts (intrinsic motion), is exploited to detect biolocomotion. In particular, a computational algorithm is developed to measure the presence of these traits in tracked objects to determine whether they correspond to a biological entity in locomotion. An alternative algorithm, based on generic handcrafted features combined with learning and assembled out of components from allied areas of investigation, is also presented as a basis of comparison to the main proposed algorithm. A novel biolocomotion dataset encompassing a wide range of moving biological and non-biological objects in natural settings is provided. Additionally, biolocomotion annotations are provided for an existing camouflaged-animals dataset. Quantitative results indicate that the proposed algorithm considerably outperforms the alternative approach, supporting the hypothesis that biolocomotion can be detected reliably based on its distinct signature of positional advance with asymmetric oscillation and extrinsic/intrinsic motion dissimilarity.
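
    As an illustrative sketch of the positional-advance-plus-oscillation signature (not the dissertation's actual detector), the code below removes a linear fit of a tracked point's trajectory as the extrinsic advance and measures the strength and asymmetry of the remaining intrinsic oscillation. The linear fit and the direction-based asymmetry measure are simplifying assumptions.

```python
import numpy as np

def oscillation_asymmetry(track):
    """Strength and asymmetry of the intrinsic oscillation of one tracked point.

    track : (T, 2) image positions of a tracked point over T frames.
    The extrinsic motion is approximated by a linear fit of position against time;
    the residual is treated as the intrinsic oscillatory component. Asymmetry is
    crudely measured by comparing mean residual speed when the residual moves in
    the +x versus -x direction.
    """
    track = np.asarray(track, float)
    t = np.arange(len(track))
    # Linear fit per coordinate approximates the overall positional advance.
    advance = np.stack(
        [np.polyval(np.polyfit(t, track[:, d], 1), t) for d in range(track.shape[1])],
        axis=1)
    residual = track - advance                       # intrinsic oscillatory part
    vel = np.diff(residual, axis=0)                  # residual velocity per frame
    fwd = vel[vel[:, 0] > 0]
    bwd = vel[vel[:, 0] <= 0]
    f = np.linalg.norm(fwd, axis=1).mean() if len(fwd) else 0.0
    b = np.linalg.norm(bwd, axis=1).mean() if len(bwd) else 0.0
    oscillation = np.linalg.norm(residual, axis=1).std()
    asymmetry = abs(f - b) / (f + b + 1e-9)
    return oscillation, asymmetry
```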