293 research outputs found
A Reverse Hierarchy Model for Predicting Eye Fixations
A body of psychological and physiological evidence suggests that early
visual attention works in a coarse-to-fine way, which lays a basis for the
reverse hierarchy theory (RHT). This theory states that attention propagates
from the top level of the visual hierarchy that processes gist and abstract
information of input, to the bottom level that processes local details.
Inspired by the theory, we develop a computational model for saliency detection
in images. First, the original image is downsampled to different scales to
constitute a pyramid. Then, saliency on each layer is obtained by image
super-resolution reconstruction from the layer above, which is defined as
unpredictability from this coarse-to-fine reconstruction. Finally, saliency on
each layer of the pyramid is fused into stochastic fixations through a
probabilistic model, where attention initiates from the top layer and
propagates downward through the pyramid. Extensive experiments on two standard
eye-tracking datasets show that the proposed method can achieve competitive
results with state-of-the-art models.
Comment: CVPR 2014, 27th IEEE Conference on Computer Vision and Pattern
Recognition (CVPR).
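The pyramid and the "unpredictability" idea described in this abstract can be illustrated with a minimal sketch. It rests on assumptions that are not from the paper: 2x2 block averaging stands in for the downsampling, nearest-neighbour upsampling stands in for super-resolution reconstruction, and the absolute reconstruction error stands in for unpredictability. The function names `downsample`, `upsample`, and `pyramid_saliency` are hypothetical, and the probabilistic fusion into fixations is not covered.

```python
import numpy as np

def downsample(img):
    """Halve resolution by averaging non-overlapping 2x2 blocks."""
    h, w = img.shape
    img = img[: h - h % 2, : w - w % 2]  # crop to even dimensions
    return img.reshape(img.shape[0] // 2, 2, img.shape[1] // 2, 2).mean(axis=(1, 3))

def upsample(img, shape):
    """Nearest-neighbour upsampling to `shape` (assumed stand-in for super-resolution)."""
    rows = (np.arange(shape[0]) * img.shape[0]) // shape[0]
    cols = (np.arange(shape[1]) * img.shape[1]) // shape[1]
    return img[np.ix_(rows, cols)]

def pyramid_saliency(img, levels=3):
    """Return one saliency map per pyramid layer, finest layer first."""
    pyramid = [np.asarray(img, dtype=float)]
    for _ in range(levels):
        pyramid.append(downsample(pyramid[-1]))
    maps = []
    for fine, coarse in zip(pyramid[:-1], pyramid[1:]):
        predicted = upsample(coarse, fine.shape)  # coarse-to-fine reconstruction
        maps.append(np.abs(fine - predicted))     # what the coarse layer failed to predict
    return maps
```

In this sketch a uniform image yields zero saliency on every layer, while an isolated bright pixel produces the largest response at its own location on the finest layer.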
SASA: Saliency-Aware Self-Adaptive Snapshot Compressive Imaging
The ability of snapshot compressive imaging (SCI) systems to efficiently
capture high-dimensional (HD) data depends on the advent of novel optical
designs to sample the HD data as two-dimensional (2D) compressed measurements.
Nonetheless, the traditional SCI scheme is fundamentally limited, due to the
complete disregard for high-level information in the sampling process. To
tackle this issue, in this paper, we pave the first mile toward the advanced
design of adaptive coding masks for SCI. Specifically, we propose an efficient
and effective algorithm to generate coding masks with the assistance of
saliency detection, in a low-cost and low-power fashion. Experiments
demonstrate the effectiveness and efficiency of our approach. Code is available
at: https://github.com/IndigoPurple/SASA
Comment: 5 pages, 4 figures
Multi-Channel Deep Networks for Block-Based Image Compressive Sensing
Incorporating deep neural networks in image compressive sensing (CS) has
received intensive attention recently. As deep network approaches learn the inverse
mapping directly from the CS measurements, a number of models have to be
trained, each of which corresponds to a sampling rate. This may potentially
degrade the performance of image CS, especially when multiple sampling rates
are assigned to different blocks within an image. In this paper, we develop a
multi-channel deep network for block-based image CS with performance
significantly exceeding the current state-of-the-art methods. The significant
performance improvement of the model is attributed to block-based sampling-rate
allocation and model-level removal of blocking artifacts. Specifically,
the image blocks with a variety of sampling rates can be reconstructed in a
single model by exploiting inter-block correlation. At the same time, the
initially reconstructed blocks are reassembled into a full image to remove
blocking artifacts within the network by unrolling a hand-designed block-based
CS algorithm. Experimental results demonstrate that the proposed method
outperforms the state-of-the-art CS methods by a large margin in terms of
objective metrics, PSNR, SSIM, and subjective visual quality.
Comment: 12 pages, 8 figures
MB-RACS: Measurement-Bounds-based Rate-Adaptive Image Compressed Sensing Network
Conventional compressed sensing (CS) algorithms typically apply a uniform
sampling rate to different image blocks. A more strategic approach could be to
allocate the number of measurements adaptively, based on each image block's
complexity. In this paper, we propose a Measurement-Bounds-based Rate-Adaptive
Image Compressed Sensing Network (MB-RACS) framework, which aims to adaptively
determine the sampling rate for each image block in accordance with traditional
measurement bounds theory. Moreover, since in real-world scenarios statistical
information about the original image cannot be directly obtained, we suggest a
multi-stage rate-adaptive sampling strategy. This strategy sequentially adjusts
the sampling ratio allocation based on the information gathered from previous
samplings. We formulate the multi-stage rate-adaptive sampling as a convex
optimization problem and address it using a combination of Newton's method and
binary search techniques. Additionally, we enhance our decoding process by
incorporating skip connections between successive iterations to facilitate a
richer transmission of feature information across iterations. Our experiments
demonstrate that the proposed MB-RACS method surpasses current leading methods,
with experimental evidence also underscoring the effectiveness of each module
within our proposed framework.
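To make the idea of block-wise rate allocation concrete, here is a minimal sketch under stated assumptions. It is not the MB-RACS formulation: instead of measurement bounds and Newton's method, it uses a simple heuristic in which each block's rate is proportional to a complexity score, clipped to a permitted range, with the proportionality constant found by binary search so that the average rate meets a budget. The function `allocate_rates` and its parameters `r_min`, `r_max`, and `iters` are hypothetical.

```python
import numpy as np

def allocate_rates(complexity, target_rate, r_min=0.01, r_max=1.0, iters=80):
    """Assign each block a sampling rate proportional to its complexity.

    The scale factor is found by bisection so that the mean of the clipped
    rates matches `target_rate` (heuristic, not the paper's method).
    """
    if not r_min <= target_rate <= r_max:
        raise ValueError("target_rate must lie in [r_min, r_max]")
    # Small offset keeps every score positive so the upper bracket is finite.
    c = np.asarray(complexity, dtype=float) + 1e-8
    lo, hi = 0.0, r_max / c.min()  # at `hi` every block is clipped to r_max
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        if np.clip(lam * c, r_min, r_max).mean() < target_rate:
            lo = lam  # budget not used up yet: scale rates upward
        else:
            hi = lam  # budget met or exceeded: scale rates downward
    return np.clip(hi * c, r_min, r_max)
```

A complexity score could be, for example, the standard deviation of each block's pixels. The bisection works because the mean clipped rate never decreases as the scale factor grows.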
Adaptive Sensing and Processing for Some Computer Vision Problems
This dissertation is concerned with adaptive sensing and processing in computer vision, specifically through the application of computer vision techniques to non-standard sensors.
In the first part, we adapt techniques designed to solve the classical computer vision problem of gradient-based surface reconstruction to the problem of phase unwrapping that presents itself in applications such as interferometric synthetic aperture radar. Specifically, we propose a new formulation of and solution to the classical two-dimensional phase unwrapping problem. As is usually done, we use the wrapped principal phase gradient field as a measurement of the absolute phase gradient field. Since this model rarely holds in practice, we explicitly enforce integrability of the gradient measurements through a sparse error-correction model. Using a novel energy-minimization functional, we formulate the phase unwrapping task as a generalized lasso problem. We then jointly estimate the absolute phase and the sparse measurement errors using the alternating direction method of multipliers (ADMM) algorithm. Using an interferometric synthetic aperture radar noise model, we evaluate our technique for several synthetic surfaces and compare the results to recently-proposed phase unwrapping techniques. Our method applies new ideas from convex optimization and sparse regularization to this well-studied problem.
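The measurement model mentioned above, in which the wrapped gradient of the wrapped phase is taken as the absolute phase gradient, can be illustrated in one dimension. The sketch below is a simplified assumption-laden example, not the dissertation's two-dimensional generalized lasso or ADMM solver: it integrates wrapped differences directly, and the function names `wrap` and `unwrap_1d` are hypothetical.

```python
import numpy as np

def wrap(phase):
    """Map phase values to the principal interval [-pi, pi)."""
    return (phase + np.pi) % (2 * np.pi) - np.pi

def unwrap_1d(wrapped):
    """Integrate the wrapped gradient of a wrapped 1-D phase signal."""
    wrapped = np.asarray(wrapped, dtype=float)
    grad = wrap(np.diff(wrapped))  # wrapped principal phase gradient
    return np.concatenate(([wrapped[0]], wrapped[0] + np.cumsum(grad)))

# A smooth ramp whose steps are below pi is recovered exactly.
true_phase = np.linspace(0.0, 12.0, 200)
recovered = unwrap_1d(wrap(true_phase))

# A jump larger than pi violates the model and is not recovered.
steep = np.array([0.0, 4.0])
failed = unwrap_1d(wrap(steep))
```

The second case shows why the model "rarely holds in practice": wherever the true gradient exceeds pi in magnitude, the wrapped gradient is off by a multiple of 2*pi, which is the kind of sparse measurement error the abstract proposes to estimate jointly with the phase.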
In the second part, we consider the problem of controlling and processing measurements from a non-traditional, compressive sensing (CS) camera in real time. We focus on how to control the number of measurements it acquires such that this number remains proportional to the amount of foreground information currently present in the scene under observation. To this end, we provide two novel adaptive-rate CS strategies for sparse, time-varying signals using side information. The first method utilizes extra cross-validation measurements, and the second exploits extra low-resolution measurements. Unlike the majority of current CS techniques, we do not assume that we know an upper bound on the number of significant coefficients pertaining to the images that comprise the video sequence. Instead, we use the side information to predict this quantity for each upcoming image. Our techniques specify a fixed number of spatially-multiplexed CS measurements to acquire, and they adjust this quantity from image to image. Our strategies are developed in the specific context of background subtraction for surveillance video, and we experimentally validate the proposed methods on real video sequences.
Finally, we consider a problem motivated by the application of active pan-tilt-zoom (PTZ) camera control in response to visual saliency. We extend the classical notion of this concept to multi-image data collected using a stationary PTZ camera by requiring consistency: the property that each saliency map in the set of those that are generated should assign the same saliency value to distinct regions of the environment that appear in more than one image. We show that processing each image independently will often fail to provide a consistent measure of saliency, and that using an image mosaic to quantify saliency suffers from several drawbacks. We then propose ray saliency: a mosaic-free method for calculating a consistent measure of bottom-up saliency. Experimental results demonstrating the effectiveness of the proposed approach are presented.