Information theory tools for viewpoint selection, mesh saliency and geometry simplification
In this chapter we review the use of an information channel as a unified framework for viewpoint selection, mesh saliency and geometry simplification. Taking the viewpoint distribution as input and the object's mesh polygons as output, the channel is given by the projected areas of the polygons over the different viewpoints. From this channel, viewpoint entropy and viewpoint mutual information can be defined in a natural way. Reversing this channel, polygonal mutual information is obtained, which is interpreted as an ambient occlusion-like quantity, and from the variation of this polygonal mutual information mesh saliency is defined. Viewpoint entropy, viewpoint Kullback-Leibler distance, and viewpoint mutual information are then applied to mesh simplification, and shown to compare well with a classical geometric simplification method.
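The channel construction described above can be sketched in a few lines. This is an assumed reconstruction from the abstract, not the authors' code: given a matrix of projected polygon areas per viewpoint, normalizing each row gives the conditional distribution p(z|v), from which viewpoint entropy and per-viewpoint mutual information follow directly.

```python
import numpy as np

def viewpoint_measures(areas, viewpoint_probs=None):
    """Viewpoint entropy and per-viewpoint mutual information from a
    viewpoint -> polygon information channel.

    areas: (V, P) matrix of projected polygon areas, one row per viewpoint.
    Hypothetical helper sketching the channel described in the abstract.
    """
    areas = np.asarray(areas, dtype=float)
    p_z_given_v = areas / areas.sum(axis=1, keepdims=True)   # channel p(z|v)
    V = areas.shape[0]
    p_v = np.full(V, 1.0 / V) if viewpoint_probs is None else np.asarray(viewpoint_probs)
    p_z = p_v @ p_z_given_v                                  # marginal p(z)

    with np.errstate(divide="ignore", invalid="ignore"):
        # Viewpoint entropy H(Z|v): higher = the view projects polygons more evenly.
        H_v = -np.nansum(p_z_given_v * np.log2(p_z_given_v), axis=1)
        # Per-viewpoint mutual information I(v; Z): lower = more representative view.
        I_v = np.nansum(p_z_given_v * np.log2(p_z_given_v / p_z), axis=1)
    return H_v, I_v
```

For example, a viewpoint that sees two polygons with equal projected area has viewpoint entropy of exactly one bit, while a viewpoint seeing a single polygon has entropy zero.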
Generating detailed saliency maps using model-agnostic methods
The emerging field of Explainable Artificial Intelligence focuses on
researching methods of explaining the decision making processes of complex
machine learning models. In the field of explainability for Computer Vision,
explanations are provided as saliency maps, which visualize the importance of
individual pixels of the input w.r.t. the model's prediction. In this work we
focus on a perturbation-based, model-agnostic explainability method called
RISE, elaborate on observed shortcomings of its grid-based approach and propose
two modifications: replacement of square occlusions with convex polygonal
occlusions based on cells of a Voronoi mesh and addition of an informativeness
guarantee to the occlusion mask generator. These modifications, collectively
called VRISE (Voronoi-RISE), are meant to, respectively, improve the accuracy
of maps generated using large occlusions and accelerate convergence of saliency
maps in cases where sampling density is either very low or very high. We
perform a quantitative comparison of accuracy of saliency maps produced by
VRISE and RISE on the validation split of ILSVRC2012, using a saliency-guided
content insertion/deletion metric and a localization metric based on bounding
boxes. Additionally, we explore the space of configurable occlusion pattern
parameters to better understand their influence on saliency maps produced by
RISE and VRISE. We also describe and demonstrate two effects observed over the
course of experimentation, arising from the random sampling approach of RISE:
"feature slicing" and "saliency misattribution". Our results show that convex
polygonal occlusions yield more accurate maps for coarse occlusion meshes and
multi-object images, but improvement is not guaranteed in other cases. The
informativeness guarantee is shown to increase the convergence rate without
incurring a significant computational overhead.
Comment: 85 pages, 70 figures, Master's thesis, defended on 2021-12-23 (Gdansk University of Technology).
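The two VRISE modifications above can be illustrated with a minimal sketch. This is an assumed simplification, not the thesis code: pixels are grouped into Voronoi cells of randomly sampled sites, whole cells are toggled on or off to form an occlusion mask, and a rejection loop stands in for the informativeness guarantee by discarding masks that reveal everything or nothing.

```python
import numpy as np

def voronoi_masks(h, w, n_sites=64, p_on=0.5, n_masks=8, rng=None):
    """Sketch of VRISE-style convex polygonal occlusion masks.

    Each pixel is assigned to its nearest random site (a Voronoi cell),
    and each mask keeps a random subset of cells. The while-loop is a
    simple informativeness guarantee: fully-on or fully-off masks carry
    no information about feature importance and are resampled.
    """
    rng = np.random.default_rng(rng)
    sites = rng.uniform(0.0, 1.0, size=(n_sites, 2)) * [h, w]
    yy, xx = np.mgrid[0:h, 0:w]
    pix = np.stack([yy.ravel(), xx.ravel()], axis=1)           # (h*w, 2)
    d2 = ((pix[:, None, :] - sites[None, :, :]) ** 2).sum(-1)  # squared distances
    labels = d2.argmin(axis=1).reshape(h, w)                   # Voronoi cell per pixel

    masks = []
    for _ in range(n_masks):
        while True:
            keep = rng.random(n_sites) < p_on                  # cell on/off pattern
            mask = keep[labels].astype(float)
            if 0.0 < mask.mean() < 1.0:                        # informative masks only
                break
        masks.append(mask)
    return np.stack(masks)
```

In RISE proper, each mask would then multiply the input image, and the saliency map is the average of masks weighted by the model's score on the masked input; the cell shape is the only part changed here.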
Intelligent visual media processing: when graphics meets vision
The computer graphics and computer vision communities have been working closely together in recent
years, and a variety of algorithms and applications have been developed to analyze and manipulate the visual media
around us. There are three major driving forces behind this phenomenon: i) the availability of big data from the
Internet has created a demand for dealing with the ever increasing, vast amount of resources; ii) powerful processing
tools, such as deep neural networks, provide effective ways for learning how to deal with heterogeneous visual data;
iii) new data capture devices, such as the Kinect, bridge between algorithms for 2D image understanding and
3D model analysis. These driving forces have emerged only recently, and we believe that the computer graphics
and computer vision communities are still in the beginning of their honeymoon phase. In this work we survey
recent research on how computer vision techniques benefit computer graphics techniques and vice versa, and cover
research on analysis, manipulation, synthesis, and interaction. We also discuss existing problems and suggest
possible further research directions.
Investigating human-perceptual properties of "shapes" using 3D shapes and 2D fonts
Shapes are generally used to convey meaning. They are used in video games, films and other multimedia, in diverse ways. 3D shapes may be destined for virtual scenes or represent objects to be constructed in the real world. Fonts add character to an otherwise plain block of text, allowing the writer to make important points more visually prominent or distinct from other text. They can indicate the structure of a document at a glance. Rather than studying shapes through traditional geometric shape descriptors, we provide alternative methods to describe and analyse shapes through the lens of human perception. This is done via the concepts of Schelling Points and Image Specificity. Schelling Points are choices people make when they aim to match what they expect others to choose, but cannot communicate with others to determine an answer. We study whole-mesh selections in this setting, where Schelling Meshes are the most frequently selected shapes. The key idea behind Image Specificity is that different images evoke different descriptions, but 'specific' images yield more consistent descriptions than others. We apply Specificity to 2D fonts. We show that each concept can be learned, and we predict them for fonts and 3D shapes, respectively, using a depth image-based convolutional neural network. Results are shown for a range of fonts and 3D shapes, and we demonstrate that font Specificity and the Schelling Meshes concept are useful for visualisation, clustering, and search applications. Overall, we find that each concept captures similarities between shapes of its respective type, even when there are discontinuities between the shape geometries themselves. The 'context' of these similarities is some kind of abstract or subjective meaning which is consistent among different people.
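The core of the Specificity idea, that 'specific' stimuli evoke more consistent descriptions, can be made concrete with a toy sketch. This is an assumed simplification, not the authors' method (which learns the measure with a neural network): here consistency is approximated as the mean pairwise Jaccard overlap between the word sets of the collected free-text descriptions.

```python
def specificity(descriptions):
    """Toy Specificity score: mean pairwise Jaccard similarity of the
    free-text descriptions a shape or font evokes. A hypothetical
    stand-in for the learned similarity used in the actual work."""
    sets = [set(d.lower().split()) for d in descriptions]
    sims = [len(a & b) / len(a | b)                  # Jaccard overlap per pair
            for i, a in enumerate(sets) for b in sets[i + 1:]]
    return sum(sims) / len(sims)                     # average over all pairs
```

Identical descriptions give a score of 1.0 (maximally specific), while descriptions with no words in common score 0.0.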