27,739 research outputs found

    How is Gaze Influenced by Image Transformations? Dataset and Model

    Full text link
    Data size is the bottleneck for developing deep saliency models, because collecting eye-movement data is very time consuming and expensive. Most of current studies on human attention and saliency modeling have used high quality stereotype stimuli. In real world, however, captured images undergo various types of transformations. Can we use these transformations to augment existing saliency datasets? Here, we first create a novel saliency dataset including fixations of 10 observers over 1900 images degraded by 19 types of transformations. Second, by analyzing eye movements, we find that observers look at different locations over transformed versus original images. Third, we utilize the new data over transformed images, called data augmentation transformation (DAT), to train deep saliency models. We find that label preserving DATs with negligible impact on human gaze boost saliency prediction, whereas some other DATs that severely impact human gaze degrade the performance. These label preserving valid augmentation transformations provide a solution to enlarge existing saliency datasets. Finally, we introduce a novel saliency model based on generative adversarial network (dubbed GazeGAN). A modified UNet is proposed as the generator of the GazeGAN, which combines classic skip connections with a novel center-surround connection (CSC), in order to leverage multi level features. We also propose a histogram loss based on Alternative Chi Square Distance (ACS HistLoss) to refine the saliency map in terms of luminance distribution. Extensive experiments and comparisons over 3 datasets indicate that GazeGAN achieves the best performance in terms of popular saliency evaluation metrics, and is more robust to various perturbations. Our code and data are available at: https://github.com/CZHQuality/Sal-CFS-GAN

    Masking Strategies for Image Manifolds

    Full text link
    We consider the problem of selecting an optimal mask for an image manifold, i.e., choosing a subset of the pixels of the image that preserves the manifold's geometric structure present in the original data. Such masking implements a form of compressive sensing through emerging imaging sensor platforms for which the power expense grows with the number of pixels acquired. Our goal is for the manifold learned from masked images to resemble its full image counterpart as closely as possible. More precisely, we show that one can indeed accurately learn an image manifold without having to consider a large majority of the image pixels. In doing so, we consider two masking methods that preserve the local and global geometric structure of the manifold, respectively. In each case, the process of finding the optimal masking pattern can be cast as a binary integer program, which is computationally expensive but can be approximated by a fast greedy algorithm. Numerical experiments show that the relevant manifold structure is preserved through the data-dependent masking process, even for modest mask sizes

    Full Reference Objective Quality Assessment for Reconstructed Background Images

    Full text link
    With an increased interest in applications that require a clean background image, such as video surveillance, object tracking, street view imaging and location-based services on web-based maps, multiple algorithms have been developed to reconstruct a background image from cluttered scenes. Traditionally, statistical measures and existing image quality techniques have been applied for evaluating the quality of the reconstructed background images. Though these quality assessment methods have been widely used in the past, their performance in evaluating the perceived quality of the reconstructed background image has not been verified. In this work, we discuss the shortcomings in existing metrics and propose a full reference Reconstructed Background image Quality Index (RBQI) that combines color and structural information at multiple scales using a probability summation model to predict the perceived quality in the reconstructed background image given a reference image. To compare the performance of the proposed quality index with existing image quality assessment measures, we construct two different datasets consisting of reconstructed background images and corresponding subjective scores. The quality assessment measures are evaluated by correlating their objective scores with human subjective ratings. The correlation results show that the proposed RBQI outperforms all the existing approaches. Additionally, the constructed datasets and the corresponding subjective scores provide a benchmark to evaluate the performance of future metrics that are developed to evaluate the perceived quality of reconstructed background images.Comment: Associated source code: https://github.com/ashrotre/RBQI, Associated Database: https://drive.google.com/drive/folders/1bg8YRPIBcxpKIF9BIPisULPBPcA5x-Bk?usp=sharing (Email for permissions at: ashrotreasuedu
    • …
    corecore