
    A new mesh visual quality metric using saliency weighting-based pooling strategy

    © 2018 Elsevier Inc. Several metrics have been proposed to assess the visual quality of 3D triangular meshes during the last decade. In this paper, we propose a mesh visual quality metric that integrates mesh saliency into mesh visual quality assessment. We use the Tensor-based Perceptual Distance Measure metric to estimate the local distortions of the mesh, and pool the local distortions into a quality score using a saliency weighting-based pooling strategy. Three well-known mesh saliency detection methods are used to demonstrate the superiority and effectiveness of our metric. Experimental results show that our metric with any of the three saliency maps performs better than state-of-the-art metrics on the LIRIS/EPFL general-purpose database. We also generate a synthetic saliency map by assembling salient regions from the individual saliency maps. Experimental results reveal that the synthetic saliency map achieves better performance than the individual saliency maps, and that the performance gain is closely correlated with the similarity between the individual saliency maps.
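
    A minimal sketch of the saliency weighting-based pooling step described above, with hypothetical names and values; the Tensor-based Perceptual Distance Measure computation itself is assumed to be given and is not reproduced:

```python
import numpy as np

def saliency_weighted_pooling(local_distortions, saliency, eps=1e-12):
    """Pool per-vertex local distortions into one quality score, weighting each
    vertex by its saliency. Only the pooling step is illustrated here."""
    d = np.asarray(local_distortions, dtype=float)
    s = np.asarray(saliency, dtype=float)
    w = s / (s.sum() + eps)          # normalise saliency values into weights
    return float(np.sum(w * d))

# Toy example: the highly salient, highly distorted vertex dominates the score.
d = np.array([0.1, 0.8, 0.3])        # per-vertex local distortions
s = np.array([0.2, 0.9, 0.1])        # per-vertex saliency
print(saliency_weighted_pooling(d, s))
```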

    Visual saliency guided textured model simplification

    Mesh geometry can be used to model both object shape and details. When texture maps are involved, it is common to let the mesh geometry model mainly the object shape and let the texture maps model most of the object details, optimising the data size and complexity of the object. To support efficient object rendering and transmission, model simplification can be applied to reduce the modelling data. However, existing methods do not adequately consider how object features are jointly represented by the mesh geometry and texture maps, and therefore have problems identifying and preserving important features in simplified objects. To address this, we propose a visual saliency detection method for simplifying textured 3D models. We produce good simplification results by jointly processing the mesh geometry and texture map to produce a unified saliency map that identifies visually important object features. Results show that our method offers better object rendering quality than existing methods.
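
    A rough sketch of how a unified saliency map might be formed from separate geometric and texture saliency values and used to bias simplification; the function names, blending weight, and cost scaling are assumptions, not the paper's formulation:

```python
import numpy as np

def unified_saliency(geom_saliency, tex_saliency, alpha=0.5):
    """Blend per-vertex geometric saliency with texture saliency sampled at
    each vertex's UV coordinate; 'alpha' is a hypothetical blending weight."""
    g = np.asarray(geom_saliency, dtype=float)
    t = np.asarray(tex_saliency, dtype=float)
    g = (g - g.min()) / (np.ptp(g) + 1e-12)   # normalise both maps to [0, 1]
    t = (t - t.min()) / (np.ptp(t) + 1e-12)
    return alpha * g + (1.0 - alpha) * t

def saliency_weighted_collapse_cost(quadric_error, saliency):
    """Scale an edge-collapse cost by the unified saliency so that salient
    features are simplified later and thus preserved longer."""
    return quadric_error * (1.0 + saliency)
```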

    Research on perceptual quality metrics of 3D meshes

    University of Technology Sydney, Faculty of Engineering and Information Technology. In mesh processing operations, a 3D mesh is often subject to geometric distortions that may degrade its visual quality. As the ultimate receptor of 3D meshes is typically the human eye, it is of great significance to develop perceptual metrics for mesh quality assessment. While most existing research in mesh quality assessment has focused on evaluating local distortions on the mesh, in this thesis we mainly study the perceptual importance of local regions to the overall mesh quality. We propose five perceptual mesh quality metrics and experimentally validate their superiority over state-of-the-art metrics. Firstly, we investigate the relationship between statistical descriptors of local distortions and the mesh quality, and propose the Tensor-based Perceptual Distance Measure-Spatial Pooling metric by exploiting Support Vector Regression. The experimental results demonstrate that severely distorted regions have a great impact on the mesh quality. Secondly, we propose the percentile weighting method to emphasise the impact of severely distorted regions: a local distortion-based visual importance model assigns a large weight to the small portion of vertices that suffer severe distortions. We propose the Tensor-based Perceptual Distance Measure-Percentile Weighting metric by applying percentile weighting at the spatial pooling stage. Thirdly, we investigate the incorporation of mesh saliency into mesh quality assessment and build a visual importance model that emphasises the impact of salient regions. We propose the Tensor-based Perceptual Distance Measure-Visual Saliency metric by taking the saliency value as the weighting coefficient of the local distortion, and a saliency map synthesis method that assembles salient regions from different individual saliency maps. The experimental results reveal that the performance gain of the synthetic saliency map over individual saliency maps depends on the similarity between the individual saliency maps. Lastly, we propose a saliency weighting strategy that emphasises the impact of both severely distorted regions and salient regions. We propose the Tensor-based Perceptual Distance Measure-Saliency Weighting metric and the Tensor-based Perceptual Distance Measure-Percentile Weighting-Saliency Weighting metric by applying the Minkowski method and the percentile weighting method, respectively, to the saliency-aware distortions. The experimental results reveal that emphasising the impact of salient regions improves the performance of the metric, but an overemphasis may cause performance degradation.
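
    A minimal sketch of the percentile weighting idea applied at the spatial pooling stage; the percentile, the weight value, and the function name are illustrative assumptions rather than the thesis settings:

```python
import numpy as np

def percentile_weighted_pooling(local_distortions, percentile=90.0, w_top=4.0):
    """Percentile weighting sketch: vertices whose distortion exceeds the given
    percentile receive a larger weight, emphasising severely distorted regions."""
    d = np.asarray(local_distortions, dtype=float)
    threshold = np.percentile(d, percentile)
    w = np.where(d >= threshold, w_top, 1.0)   # large weight for the worst vertices
    return float(np.sum(w * d) / np.sum(w))
```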

    Semantic Communications with Variable-Length Coding for Extended Reality

    Wireless extended reality (XR) has attracted wide attention as a promising technology to improve users' mobility and quality of experience. However, the ultra-high data rate requirement of wireless XR has hindered its development for many years. To overcome this challenge, we develop a semantic communication framework in which semantically unimportant information is highly compressed or discarded by the semantic coders, significantly improving transmission efficiency. Moreover, considering that some source content may carry less semantic information or be more tolerant to channel noise, we propose a universal variable-length semantic-channel coding method. In particular, we first use a rate allocation network to estimate the best code length for the semantic information and then adjust the coding process accordingly. By adopting proxy functions, the whole framework is trained in an end-to-end manner. Numerical results show that our semantic system significantly outperforms traditional transmission methods and that the proposed variable-length coding scheme is superior to fixed-length coding methods.
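
    A rough PyTorch sketch of the rate-allocation idea: a small network predicts how many latent symbols to keep and masks the rest. Layer sizes, `max_len`, and the masking rule are assumptions, and the proxy functions used for end-to-end training are omitted:

```python
import torch
import torch.nn as nn

class RateAllocator(nn.Module):
    """Sketch of a rate-allocation head: maps pooled semantic features to a
    code length and masks the latent so only the first L symbols are kept."""

    def __init__(self, feat_dim=64, max_len=16):
        super().__init__()
        self.max_len = max_len
        self.head = nn.Sequential(
            nn.Linear(feat_dim, 32), nn.ReLU(),
            nn.Linear(32, 1), nn.Sigmoid())        # fraction of max_len to keep

    def forward(self, features, latent):
        # Predicted code length per sample, in [0, max_len].
        length = self.head(features) * self.max_len              # (B, 1)
        idx = torch.arange(self.max_len, device=latent.device)   # (max_len,)
        mask = (idx.unsqueeze(0) < length).float()                # (B, max_len)
        return latent * mask, length

# Usage sketch: keep only the allocated prefix of a 16-symbol latent code.
allocator = RateAllocator(feat_dim=64, max_len=16)
coded, lengths = allocator(torch.randn(2, 64), torch.randn(2, 16))
```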

    Aggressive saliency-aware point cloud compression

    The increasing demand for accurate representations of 3D scenes, combined with immersive technologies, has made point clouds extremely popular. However, high-quality point clouds require a large amount of data, so compression methods are imperative. In this paper, we present a novel, geometry-based, end-to-end compression scheme that combines information on the geometrical features of the point cloud and the user's position, achieving remarkable results for aggressive compression schemes demanding very small bit rates. After separating visible and non-visible points, four saliency maps are calculated, utilising the point cloud's geometry and distance from the user, the visibility information, and the user's focus point. A combination of these maps yields a final saliency map indicating the overall significance of each point, so that different regions are quantised with a different number of bits during the encoding process. The decoder reconstructs the point cloud using delta coordinates and solving a sparse linear system. Evaluation studies and comparisons with the geometry-based point cloud compression (G-PCC) algorithm of the Moving Picture Experts Group (MPEG), carried out for a variety of point clouds, demonstrate that the proposed method achieves significantly better results at small bit rates.
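
    A sketch of the decoding step described above, reconstructing positions from delta (Laplacian) coordinates by solving a sparse least-squares system; the function signature and anchor-point handling are assumptions, not the paper's exact decoder:

```python
import numpy as np
from scipy.sparse import lil_matrix, vstack
from scipy.sparse.linalg import lsqr

def reconstruct_from_deltas(laplacian, deltas, anchor_ids, anchor_xyz, w=1.0):
    """Recover point positions from delta coordinates by solving a sparse
    least-squares system; a few anchor points fix the global position."""
    n = laplacian.shape[0]
    A_anchor = lil_matrix((len(anchor_ids), n))
    for row, idx in enumerate(anchor_ids):
        A_anchor[row, idx] = w
    A = vstack([laplacian, A_anchor]).tocsr()
    xyz = np.zeros((n, 3))
    for c in range(3):                        # solve per coordinate axis
        b = np.concatenate([deltas[:, c], w * anchor_xyz[:, c]])
        xyz[:, c] = lsqr(A, b)[0]
    return xyz
```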

    End-to-end deep multi-score model for No-reference stereoscopic image quality assessment

    Deep learning-based quality metrics have recently brought significant improvement to Image Quality Assessment (IQA). In stereoscopic vision, information is distributed to the left and right eyes with a slight disparity. However, due to asymmetric distortion, the objective quality ratings for the left and right images may differ, necessitating the learning of a separate quality indicator for each view. Unlike existing stereoscopic IQA measures, which focus mainly on estimating a global human score, we suggest incorporating left, right, and stereoscopic objective scores to extract the corresponding properties of each view and thereby estimate stereoscopic image quality without reference. To this end, we use a deep multi-score Convolutional Neural Network (CNN). Our model is trained to perform four tasks: first, predict the quality of the left view; second, predict the quality of the right view; third and fourth, predict the quality of the stereo view and the global quality, respectively, with the global score serving as the final quality. Experiments are conducted on the Waterloo IVC 3D Phase 1 and Phase 2 databases. The results show the superiority of our method compared with the state-of-the-art. The implementation code can be found at: https://github.com/o-messai/multi-score-SIQ
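
    A minimal PyTorch sketch of a four-head multi-score CNN of the kind described above; the backbone and layer sizes are placeholders, and the released implementation at the linked repository will differ:

```python
import torch
import torch.nn as nn

class MultiScoreSIQA(nn.Module):
    """Sketch of a multi-score CNN for no-reference stereoscopic IQA: a shared
    feature extractor feeding four regression heads (left, right, stereo, global)."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(6, 32, 3, padding=1), nn.ReLU(),   # left+right stacked as 6 channels
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.heads = nn.ModuleDict(
            {name: nn.Linear(64, 1) for name in ("left", "right", "stereo", "global")})

    def forward(self, left, right):
        x = self.features(torch.cat([left, right], dim=1))
        return {name: head(x) for name, head in self.heads.items()}

# Usage sketch: the 'global' head provides the final quality estimate.
scores = MultiScoreSIQA()(torch.randn(1, 3, 64, 64), torch.randn(1, 3, 64, 64))
```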

    Investigating human-perceptual properties of "shapes" using 3D shapes and 2D fonts

    Shapes are generally used to convey meaning. They are used in video games, films, and other multimedia in diverse ways. 3D shapes may be destined for virtual scenes or represent objects to be constructed in the real world. Fonts add character to an otherwise plain block of text, allowing the writer to make important points more visually prominent or distinct from other text; they can also indicate the structure of a document at a glance. Rather than studying shapes through traditional geometric shape descriptors, we provide alternative methods to describe and analyse shapes through the lens of human perception. This is done via the concepts of Schelling Points and Image Specificity. Schelling Points are the choices people make when they aim to match what they expect others to choose but cannot communicate with them to determine an answer. We study whole-mesh selections in this setting, where Schelling Meshes are the most frequently selected shapes. The key idea behind Image Specificity is that different images evoke different descriptions, but 'specific' images yield more consistent descriptions than others. We apply Specificity to 2D fonts. We show that each concept can be learned, and we predict them for fonts and 3D shapes, respectively, using a depth image-based convolutional neural network. Results are shown for a range of fonts and 3D shapes, and we demonstrate that font Specificity and the Schelling Meshes concept are useful for visualisation, clustering, and search applications. Overall, we find that each concept captures similarities between shapes of its respective type, even when there are discontinuities between the shape geometries themselves; the 'context' of these similarities is some kind of abstract or subjective meaning that is consistent among different people.
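
    A toy sketch of the Specificity idea: an item is 'specific' if independent descriptions of it agree. The word-overlap similarity below is an assumption made for illustration; the work itself may use a different sentence similarity:

```python
from itertools import combinations

def specificity(descriptions):
    """Toy Specificity estimate for one font or shape: the average pairwise
    similarity of its free-text descriptions (simple Jaccard word overlap)."""
    def jaccard(a, b):
        sa, sb = set(a.lower().split()), set(b.lower().split())
        return len(sa & sb) / max(len(sa | sb), 1)
    pairs = list(combinations(descriptions, 2))
    return sum(jaccard(a, b) for a, b in pairs) / max(len(pairs), 1)

# A 'specific' font evokes consistent descriptions, so it scores higher.
print(specificity(["thin elegant serif", "an elegant thin serif", "thin serif"]))
print(specificity(["blocky", "futuristic and round", "handwritten style"]))
```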