185 research outputs found

    A domain adaptive deep learning solution for scanpath prediction of paintings

    Understanding and preserving cultural heritage is an important issue for society, as it represents a fundamental aspect of its identity. Paintings are a significant part of cultural heritage and are continuously studied. However, the way viewers perceive paintings is closely tied to the behaviour of the Human Vision System (HVS). This paper focuses on the eye-movement analysis of viewers during the visual experience of a number of paintings. In more detail, we introduce a new approach to predicting human visual attention, which impacts several human cognitive functions, including the fundamental understanding of a scene, and then extend it to painting images. The proposed architecture ingests images and returns scanpaths, sequences of points with a high likelihood of catching viewers' attention. We use a Fully Convolutional Neural Network (FCNN), in which we exploit differentiable channel-wise selection and Soft-Argmax modules. We also incorporate learnable Gaussian distributions into the network bottleneck to simulate the bias of the visual attention process in natural scene images. Furthermore, to reduce the effect of shifts between domains (i.e., natural images and paintings), we encourage the model to learn general features from other domains in an unsupervised manner using a gradient reversal classifier. Our model outperforms existing state-of-the-art ones in terms of accuracy and efficiency. Comment: Accepted at CBMI 2022, Graz, Austria
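
    The abstract above mentions Soft-Argmax modules without giving their formulation. As a rough illustration only, the sketch below shows one common form of soft-argmax: the softmax-weighted expected coordinate of a 2D score map, which stays differentiable where a hard argmax does not. The function name `soft_argmax_2d`, the sharpness parameter `beta`, and the toy heatmap are assumptions made for this example and are not taken from the paper.

```python
import math


def soft_argmax_2d(heatmap, beta=1.0):
    """Return the softmax-weighted expected (row, col) of a 2D score map.

    `beta` is an assumed sharpness parameter: larger values push the
    result towards the location of the hard argmax.
    """
    # Subtract the maximum score before exponentiating, for numerical stability.
    peak = max(v for row in heatmap for v in row)
    weights = [[math.exp(beta * (v - peak)) for v in row] for row in heatmap]
    total = sum(sum(row) for row in weights)
    # Expected row and column indices under the normalized weights.
    exp_row = sum(r * w for r, row in enumerate(weights) for w in row) / total
    exp_col = sum(c * w for row in weights for c, w in enumerate(row)) / total
    return exp_row, exp_col


# Toy score map with a single strong peak at row 1, column 2.
heatmap = [
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 9.0],
    [0.0, 0.0, 0.0],
]
row, col = soft_argmax_2d(heatmap, beta=10.0)
print(row, col)  # close to the peak location (1, 2)
```

    In a scanpath model, one such coordinate pair per selected channel could serve as one fixation point, but how the paper wires this up is not stated in the abstract.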

    End-to-end deep multi-score model for No-reference stereoscopic image quality assessment

    Deep learning-based quality metrics have recently brought significant improvements in Image Quality Assessment (IQA). In stereoscopic vision, information is evenly distributed, with slight disparity, to the left and right eyes. However, due to asymmetric distortion, the objective quality ratings for the left and right images can differ, which makes it necessary to learn separate quality indicators for each view. Unlike existing stereoscopic IQA measures, which focus mainly on estimating a global human score, we suggest incorporating left, right, and stereoscopic objective scores to extract the corresponding properties of each view, and thereby estimate stereoscopic image quality without reference. To this end, we use a deep multi-score Convolutional Neural Network (CNN). Our model is trained to perform four tasks: first, predict the quality of the left view; second, predict the quality of the right view; third and fourth, predict the quality of the stereo view and the global quality, respectively, with the global score serving as the final quality estimate. Experiments are conducted on the Waterloo IVC 3D Phase 1 and Phase 2 databases. The results show that our method outperforms state-of-the-art methods. The implementation code can be found at: https://github.com/o-messai/multi-score-SIQ

    Perceptual Quality Evaluation of 3D Triangle Mesh: A Technical Review

    © 2018 IEEE. During mesh processing operations (e.g. simplification, compression, and watermarking), a 3D triangle mesh is subject to various visible distortions on the mesh surface, which creates a need to estimate visual quality. The necessity of perceptual quality evaluation is already established, as in most cases human beings are the end users of 3D meshes. Metrics that measure such distortions by combining geometric measures with properties of the human visual system (HVS) are called perceptual quality metrics. In this paper, we conduct an extensive study of 3D mesh quality evaluation, focusing mostly on recently proposed perceptual metrics. We limit our study to greyscale static mesh evaluation and attempt to identify the most workable method for real-time evaluation through a quantitative comparison. This paper also discusses in detail how to evaluate the performance of an objective metric with existing subjective databases. We also investigate the use of the psychometric function to remove the non-linearity between subjective and objective values. Finally, we compare some selected quality metrics and show that curvature-tensor-based quality metrics give consistent results in terms of correlation.
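
    The abstract above refers to using a psychometric function to remove the non-linearity between objective and subjective values before measuring correlation. The sketch below illustrates the general idea with a logistic curve, one common choice; the review may use a different function. The scores and the parameters `slope` and `midpoint` are invented for illustration, and in practice the parameters would be fitted to a subjective database rather than fixed by hand.

```python
import math


def logistic(x, slope, midpoint):
    """Logistic curve used here as an assumed psychometric function."""
    return 1.0 / (1.0 + math.exp(-slope * (x - midpoint)))


def pearson(xs, ys):
    """Pearson linear correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical objective metric values and normalized subjective scores.
objective = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
subjective = [0.01, 0.05, 0.18, 0.50, 0.82, 0.95, 0.99]

# Map objective values through the curve (parameters assumed, not fitted).
mapped = [logistic(x, slope=1.5, midpoint=3.0) for x in objective]

r_raw = pearson(objective, subjective)
r_mapped = pearson(mapped, subjective)
print(r_raw, r_mapped)  # the mapped correlation is the higher of the two
```

    Because Pearson correlation only measures linear agreement, a metric that ranks meshes correctly but saturates at the extremes is penalized unless such a mapping is applied first.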

    Machine Learning for Multimedia Communications

    Machine learning is revolutionizing the way multimedia information is processed and transmitted to users. After intensive and powerful training, impressive efficiency and accuracy improvements have been made all along the transmission pipeline. For example, the high model capacity of learning-based architectures enables us to accurately model image and video behavior, such that tremendous compression gains can be achieved. Similarly, error concealment, streaming strategy, and even user perception modeling have widely benefited from recent learning-oriented developments. However, learning-based algorithms often imply drastic changes to the way data are represented or consumed, meaning that the overall pipeline can be affected even though only a subpart of it is optimized. In this paper, we review the recent major advances that have been proposed all across the transmission chain, and we discuss their potential impact and the research challenges that they raise.

    Supervised Deep Learning for Content-Aware Image Retargeting with Fourier Convolutions

    Image retargeting aims to alter the size of an image while paying attention to its contents. One of the main obstacles to training deep learning models for image retargeting is the need for a vast labeled dataset, and such datasets are unavailable for image retargeting tasks. We therefore present a new supervised approach for training deep learning models. We use the original images as ground truth and create inputs for the model by resizing and cropping the original images. A second challenge is generating different image sizes at inference time, since regular convolutional neural networks cannot generate images whose size differs from that of the input image. To address this issue, we introduce a new method for supervised learning. In our approach, a mask is generated to show the desired size and location of the object. The mask and the input image are then fed to the network. A comparison between existing image retargeting methods and our proposed method demonstrates the model's ability to produce high-quality retargeted images. Afterward, we compute the image quality assessment score for each output image based on different techniques and illustrate the effectiveness of our approach. Comment: 18 pages, 5 figures

    An Image-based model for 3D shape quality measure

    In light of increased research on 3D shapes and the increased processing capability of GPUs, there has been a significant increase in available 3D applications. In many applications, assessment of the perceptual quality of 3D shapes is required. Due to the nature of 3D representation, this quality assessment may take various forms. While it is straightforward to measure geometric distortions directly on the 3D shape geometry, such measures are often inconsistent with human perception of quality. In most cases, human viewers perceive 3D shapes through their 2D renderings. It is therefore plausible to measure shape quality using these 2D renderings. In this paper, we present an image-based quality metric for evaluating 3D shape quality given the original and distorted shapes. To provide good coverage of the 3D geometry from different views, we render each shape from 12 equally spaced views, along with a variety of rendering styles to capture different aspects of visual characteristics. Image-based metrics such as SSIM (Structural Similarity Index Measure) are then used to measure the quality of 3D shapes. Our experiments show that by selecting a suitable combination of rendering styles and building a neural-network-based model, we achieve significantly better prediction of subjective perceptual quality than existing methods.
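
    To make the view-based idea in the abstract above more concrete, the sketch below scores a distorted shape by comparing per-view renderings with SSIM and averaging the results. Several simplifications are assumptions of this example, not the paper's method: the 12 views are placed at equal azimuth steps on a circle, SSIM is computed over a single global window rather than local windows, renderings are stand-in flat lists of grey levels, and a plain mean replaces the neural-network-based combination the abstract describes.

```python
def view_azimuths(num_views=12):
    """Equally spaced camera azimuths in degrees (circular layout assumed)."""
    return [i * 360.0 / num_views for i in range(num_views)]


def global_ssim(x, y, dynamic_range=255.0):
    """Single-window SSIM over two equal-length lists of pixel intensities."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Standard stabilizing constants from the SSIM definition.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def mean_view_ssim(reference_views, distorted_views):
    """Average per-view SSIM; a stand-in for a learned combination."""
    scores = [global_ssim(r, d) for r, d in zip(reference_views, distorted_views)]
    return sum(scores) / len(scores)


# Stand-in renderings: one tiny 4-pixel "image" per view, invented for the demo.
angles = view_azimuths()
reference = [[10.0, 50.0, 90.0, 130.0] for _ in angles]
distorted = [[20.0, 40.0, 100.0, 120.0] for _ in angles]

print(angles[:3])
print(mean_view_ssim(reference, reference))  # identical views score about 1
print(mean_view_ssim(reference, distorted))  # distorted views score below 1
```

    A real pipeline would replace the stand-in lists with actual renderings of the original and distorted shapes from each camera position and rendering style.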