Screen Content Image Segmentation Using Sparse-Smooth Decomposition
Sparse decomposition has been used extensively in applications including
signal compression, denoising and document analysis. In this
paper, sparse decomposition is used for image segmentation. The proposed
algorithm separates the background and foreground using a sparse-smooth
decomposition technique such that the smooth and sparse components correspond
to the background and foreground respectively. This algorithm is tested on
several test images from HEVC test sequences and is shown to have superior
performance over other methods, such as the hierarchical k-means clustering in
DjVu. This segmentation algorithm can also be used for text extraction, video
compression and medical image segmentation.
Comment: Asilomar Conference on Signals, Systems and Computers, IEEE, 2015 (to appear)
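The sparse-smooth idea above can be illustrated on a 1-D signal. This is a minimal sketch under stated assumptions, not the paper's algorithm: the smooth background is modelled as a straight line (a simplification chosen here to keep the example short), the foreground is recovered by soft-thresholding the residual, and the two steps alternate. The names `sparse_smooth_decompose`, `fit_line`, `soft_threshold` and the parameter `lam` are hypothetical.

```python
def soft_threshold(v, lam):
    """Shrink v toward zero by lam (proximal operator of the l1 norm)."""
    if v > lam:
        return v - lam
    if v < -lam:
        return v + lam
    return 0.0


def fit_line(y):
    """Least-squares straight line through y; stands in for the smooth model."""
    n = len(y)
    t_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    slope = sxy / sxx
    return [y_mean + slope * (t - t_mean) for t in range(n)]


def sparse_smooth_decompose(x, lam, iters=50):
    """Alternate between fitting the smooth part and shrinking the residual."""
    sparse = [0.0] * len(x)
    smooth = list(x)
    for _ in range(iters):
        # Smooth step: fit the background to the signal minus the foreground.
        smooth = fit_line([xi - si for xi, si in zip(x, sparse)])
        # Sparse step: keep only residuals larger than lam in magnitude.
        sparse = [soft_threshold(xi - bi, lam) for xi, bi in zip(x, smooth)]
    return smooth, sparse


# Usage: a linear ramp (background) with one large spike (foreground).
signal = [10 + 0.5 * t for t in range(20)]
signal[7] += 50.0
background, foreground = sparse_smooth_decompose(signal, lam=5.0)
foreground_idx = [i for i, s in enumerate(foreground) if s != 0.0]
print(foreground_idx)
```

In this toy case the nonzero entries of the sparse component mark the foreground samples; for images, a richer smooth model would replace the straight line.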
Strengthening the Effectiveness of Pedestrian Detection with Spatially Pooled Features
We propose a simple yet effective approach to the problem of pedestrian
detection which outperforms the current state-of-the-art. Our new features are
built on the basis of low-level visual features and spatial pooling.
Incorporating spatial pooling improves the translational invariance and thus
the robustness of the detection process. We then directly optimise the partial
area under the ROC curve (pAUC) measure, which concentrates detection
performance in the range of most practical importance. The combination of these
factors leads to a pedestrian detector which outperforms all competitors on all
of the standard benchmark datasets. We advance state-of-the-art results by
lowering the average miss rate on the INRIA, ETH, TUD-Brussels and
Caltech-USA benchmarks.
Comment: 16 pages. Appearing in Proc. European Conf. Computer Vision (ECCV)
201
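For readers unfamiliar with the partial AUC mentioned above, the following sketch shows how the measure can be evaluated from detector scores. It only computes the pAUC over false-positive rates in [0, max_fpr]; it does not reproduce the paper's optimisation of that measure. The function names `roc_points` and `partial_auc` and the parameter `max_fpr` are hypothetical.

```python
def roc_points(scores, labels):
    """ROC curve as (FPR, TPR) points; labels are 1 (positive) or 0 (negative)."""
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative sample")
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        s = pairs[i][0]
        # Samples with tied scores cross the threshold together.
        while i < len(pairs) and pairs[i][0] == s:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def partial_auc(scores, labels, max_fpr):
    """Trapezoidal area under the ROC curve for FPR in [0, max_fpr]."""
    pts = roc_points(scores, labels)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Clip the last segment at max_fpr by linear interpolation.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Usage: six detections, three of them true pedestrians.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 1, 0, 1, 0, 0]
print(partial_auc(scores, labels, max_fpr=0.5))
```

Restricting the area to a low false-positive range is what focuses the measure on the operating region that matters in practice.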
Image Deblurring and Super-resolution by Adaptive Sparse Domain Selection and Adaptive Regularization
As a powerful statistical image modeling technique, sparse representation has
been successfully used in various image restoration applications. The success
of sparse representation is due to the development of l1-norm optimization
techniques, and the fact that natural images are intrinsically sparse in some
domain. The image restoration quality largely depends on whether the employed
sparse domain can represent the underlying image well. Considering that the
contents can vary significantly across different images or different patches in
a single image, we propose to learn various sets of bases from a pre-collected
dataset of example image patches, and then for a given patch to be processed,
one set of bases is adaptively selected to characterize the local sparse
domain. We further introduce two adaptive regularization terms into the sparse
representation framework. First, a set of autoregressive (AR) models is
learned from the dataset of example image patches. The AR models that best fit
a given patch are adaptively selected to regularize the local image structures.
Second, the image non-local self-similarity is introduced as another
regularization term. In addition, the sparsity regularization parameter is
adaptively estimated for better image restoration performance. Extensive
experiments on image deblurring and super-resolution validate that by using
adaptive sparse domain selection and adaptive regularization, the proposed
method achieves much better results than many state-of-the-art algorithms in
terms of both PSNR and visual perception.
Comment: 35 pages. This paper is under review in IEEE TI
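The adaptive selection step described above can be sketched as follows. This is an illustration under assumptions that the abstract does not state: each set of bases is taken to be orthonormal and to carry a centroid, the set is chosen by nearest centroid, and sparsity is imposed by soft-thresholding the coefficients. The AR and non-local regularization terms are omitted. All names (`select_bases`, `sparse_code`, `threshold`) are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def select_bases(patch, dictionaries):
    """Return the index of the dictionary whose centroid is closest to patch."""
    def sq_dist(centroid):
        return sum((p - c) ** 2 for p, c in zip(patch, centroid))
    return min(range(len(dictionaries)),
               key=lambda k: sq_dist(dictionaries[k][0]))


def sparse_code(patch, basis, threshold):
    """Project onto an orthonormal basis, shrink coefficients, reconstruct."""
    coeffs = []
    for b in basis:
        c = dot(b, patch)
        # Soft-thresholding: small coefficients are set exactly to zero.
        coeffs.append(math.copysign(max(abs(c) - threshold, 0.0), c))
    recon = [sum(c * b[i] for c, b in zip(coeffs, basis))
             for i in range(len(patch))]
    return coeffs, recon


# Usage: two toy dictionaries for 2-pixel "patches", as (centroid, basis).
r = math.sqrt(0.5)
dictionaries = [
    ([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]),   # axis-aligned basis
    ([1.0, 1.0], [[r, r], [r, -r]]),          # basis rotated by 45 degrees
]
patch = [2.0, 2.1]
k = select_bases(patch, dictionaries)
coeffs, recon = sparse_code(patch, dictionaries[k][1], threshold=0.5)
print(k, coeffs, recon)
```

The point of the selection is that a patch is coded in the basis where it is most compressible, so more coefficients vanish after shrinkage.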
Low-Cost Compressive Sensing for Color Video and Depth
A simple and inexpensive (low-power and low-bandwidth) modification is made
to a conventional off-the-shelf color video camera, from which we recover
multiple color frames for each of the original measured frames, and each of
the recovered frames can be focused at a different depth. The recovery of
multiple frames for each measured frame is made possible via high-speed coding,
manifested via translation of a single coded aperture; the inexpensive
translation is constituted by mounting the binary code on a piezoelectric
device. To simultaneously recover depth information, a liquid lens is
modulated at high speed, via a variable voltage. Consequently, during the
aforementioned coding process, the liquid lens allows the camera to sweep the
focus through multiple depths. In addition to designing and implementing the
camera, fast recovery is achieved by an anytime algorithm exploiting the
group-sparsity of wavelet/DCT coefficients.
Comment: 8 pages, CVPR 201
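The coding process described above, in which one translated binary code modulates several high-speed frames that the sensor then integrates into a single measurement, can be sketched as a forward model. This is a simplified 1-D illustration under assumptions not taken from the abstract: the code shifts by one pixel per sub-frame, the shift is circular, and colour, depth and noise are ignored. The names `shift_code` and `coded_measurement` are hypothetical.

```python
def shift_code(code, t):
    """Translate the binary code by t pixels (circular shift, an assumption)."""
    n = len(code)
    return [code[(i - t) % n] for i in range(n)]


def coded_measurement(frames, code):
    """Sum the sub-frames after masking each with a shifted copy of the code."""
    n = len(code)
    measurement = [0.0] * n
    for t, frame in enumerate(frames):
        mask = shift_code(code, t)
        for i in range(n):
            # A 1 in the mask passes the pixel, a 0 blocks it.
            measurement[i] += mask[i] * frame[i]
    return measurement


# Usage: two high-speed sub-frames compressed into one measured frame.
code = [1, 0, 1, 1]
frames = [[1.0, 2.0, 3.0, 4.0],
          [5.0, 6.0, 7.0, 8.0]]
print(coded_measurement(frames, code))
```

Recovery then amounts to inverting this many-to-one mapping, which is where a sparsity prior on the frames becomes necessary.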