Object Discovery via Cohesion Measurement
Color and intensity are two important components in an image. Usually, groups
of image pixels that are similar in color or intensity form an informative
representation of an object. They are therefore particularly suitable for
computer vision tasks, such as saliency detection and object proposal
generation. However, image pixels that share a similar real-world color may
appear quite different, since colors are often distorted by intensity. In this
paper, we reinvestigate the affinity matrices originally used in image
segmentation methods based on spectral clustering. A new affinity matrix, which
is robust to color distortions, is formulated for object discovery. Moreover, a
Cohesion Measurement (CM) for object regions is also derived based on the
formulated affinity matrix. Based on the new Cohesion Measurement, a novel
object discovery method is proposed to discover objects latent in an image by
utilizing the eigenvectors of the affinity matrix. Then we apply the proposed
method to both saliency detection and object proposal generation. Experimental
results on several evaluation benchmarks demonstrate that the proposed CM-based
method achieves promising performance on both tasks.
Comment: 14 pages, 14 figures
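As a rough illustration of the spectral machinery this abstract refers to, the sketch below builds a plain Gaussian affinity matrix over pixel colors and splits the pixels by the sign of the second eigenvector of the normalized affinity. It is not the paper's color-robust affinity or its Cohesion Measurement; the function names, the `sigma` value, and the toy pixels are all assumptions made for the example.

```python
import math


def gaussian_affinity(features, sigma=0.3):
    """W[i][j] = exp(-||f_i - f_j||^2 / (2 * sigma^2)). Generic, not the paper's matrix."""
    n = len(features)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
            w[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    return w


def second_eigenvector(w, iters=200):
    """Second eigenvector of D^-1/2 W D^-1/2 via power iteration with deflation."""
    n = len(w)
    deg = [sum(row) for row in w]
    norm = [[w[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    # The leading eigenvector of the normalized affinity is sqrt(deg), eigenvalue 1.
    scale = math.sqrt(sum(deg))
    top = [math.sqrt(d) / scale for d in deg]
    v = [float(i + 1) for i in range(n)]  # deterministic start vector
    for _ in range(iters):
        # Remove the component along the leading eigenvector, then multiply.
        proj = sum(a * b for a, b in zip(v, top))
        v = [a - proj * b for a, b in zip(v, top)]
        v = [sum(norm[i][j] * v[j] for j in range(n)) for i in range(n)]
        length = math.sqrt(sum(a * a for a in v))
        v = [a / length for a in v]
    return v


# Toy "image": three reddish pixels followed by three bluish pixels (RGB in [0, 1]).
pixels = [
    (0.90, 0.10, 0.10), (0.80, 0.15, 0.10), (0.85, 0.10, 0.15),
    (0.10, 0.10, 0.90), (0.15, 0.10, 0.80), (0.10, 0.20, 0.85),
]
vector = second_eigenvector(gaussian_affinity(pixels))
labels = [x > 0 for x in vector]  # sign of the eigenvector entry picks the group
print(labels)
```

Which group receives `True` depends on the arbitrary sign of the eigenvector, so only the grouping itself is meaningful.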
Automatic Segmentation of Broadcast News Audio using Self Similarity Matrix
Generally, audio news broadcast on radio is composed of music, commercials,
news from correspondents and recorded statements in addition to the actual news
read by the newsreader. When news transcripts are available, automatically
segmenting the broadcast so that the audio can be time-aligned with the text
transcription is essential for building frugal speech corpora. We address the
problem of identifying the segments of the broadcast that correspond to news
read by the newsreader, so that they can be mapped to the text transcripts.
Existing techniques produce sub-optimal solutions when used to extract
newsreader-read segments. In this paper, we propose a new technique
which is able to identify the acoustic change points reliably using an acoustic
Self Similarity Matrix (SSM). We describe the two-pass technique in detail and
verify its performance on real audio news broadcasts from All India Radio in
different languages.
Comment: 4 pages, 5 images
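To make the idea of finding acoustic change points from a self-similarity matrix concrete, here is a minimal sketch. It uses cosine similarity between per-frame feature vectors and a checkerboard-style novelty score along the diagonal. This is a generic technique and not the two-pass method the abstract describes; the toy feature vectors and the names `self_similarity` and `novelty` are assumptions for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def self_similarity(frames):
    """SSM[i][j] is the similarity between frame i and frame j."""
    return [[cosine(a, b) for b in frames] for a in frames]


def novelty(ssm, half=2):
    """Checkerboard score at t: within-side similarity minus cross-side similarity."""
    n = len(ssm)
    scores = [0.0] * n
    for t in range(half, n - half + 1):
        total = 0.0
        for i in range(-half, half):
            for j in range(-half, half):
                same_side = (i < 0) == (j < 0)  # both before t, or both from t on
                value = ssm[t + i][t + j]
                total += value if same_side else -value
        scores[t] = total
    return scores


# Hypothetical per-frame features: four "music" frames, then four "speech" frames.
music = [[1.0, 0.0, 0.1], [0.9, 0.1, 0.1], [1.0, 0.1, 0.0], [0.95, 0.05, 0.1]]
speech = [[0.0, 1.0, 0.1], [0.1, 0.9, 0.1], [0.1, 1.0, 0.0], [0.05, 0.95, 0.1]]
frames = music + speech

scores = novelty(self_similarity(frames))
change_point = max(range(len(scores)), key=lambda t: scores[t])
print(change_point)  # index of the first frame after the strongest change
```

On real audio the frames would be spectral features such as MFCCs, and peaks in the novelty curve would be thresholded rather than taking a single maximum.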
Weakly Supervised Localization using Deep Feature Maps
Object localization is an important computer vision problem with a variety of
applications. The lack of large-scale object-level annotations and the relative
abundance of image-level labels make a compelling case for weak supervision in
the object localization task. Deep Convolutional Neural Networks are a class of
state-of-the-art methods for the related problem of object recognition. In this
paper, we describe a novel object localization algorithm which uses
classification networks trained on only image labels. This weakly supervised
method leverages local spatial and semantic patterns captured in the
convolutional layers of classification networks. We propose an efficient
beam-search-based approach to detect and localize multiple objects in images. The
proposed method significantly outperforms the state of the art on standard
object localization datasets, with an 8-point increase in mAP scores.
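The abstract does not say how its beam search is set up, so the following is only a generic sketch of beam search used for localization. It shrinks a bounding box over a 2D activation map one row or column at a time and keeps the best few candidates at each step. The scoring rule (sum of activations minus an area penalty), the `penalty` value, and all names are assumptions and should not be read as the paper's algorithm.

```python
def box_score(heat, box, penalty=0.3):
    """Sum of activations inside the box minus a per-cell area penalty."""
    top, left, bottom, right = box  # half-open: rows top..bottom-1, cols left..right-1
    total = sum(heat[r][c] for r in range(top, bottom) for c in range(left, right))
    return total - penalty * (bottom - top) * (right - left)


def neighbors(box):
    """Boxes obtained by removing one boundary row or column."""
    top, left, bottom, right = box
    if bottom - top > 1:
        yield (top + 1, left, bottom, right)
        yield (top, left, bottom - 1, right)
    if right - left > 1:
        yield (top, left + 1, bottom, right)
        yield (top, left, bottom, right - 1)


def beam_search_box(heat, beam_width=3, penalty=0.3):
    """Keep the `beam_width` best boxes per step; return the best box seen overall."""
    start = (0, 0, len(heat), len(heat[0]))
    beam = [start]
    best, best_score = start, box_score(heat, start, penalty)
    while True:
        candidates = {nb for box in beam for nb in neighbors(box)}
        if not candidates:  # every box in the beam is a single cell
            return best
        ranked = sorted(candidates,
                        key=lambda b: box_score(heat, b, penalty),
                        reverse=True)
        beam = ranked[:beam_width]
        top_score = box_score(heat, beam[0], penalty)
        if top_score > best_score:
            best, best_score = beam[0], top_score


# Toy activation map with a 2x2 hot region at rows 1-2, columns 2-3.
heat = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
best_box = beam_search_box(heat)
print(best_box)  # (top, left, bottom, right), half-open
```

Localizing several objects would need an extra step, for example suppressing the activations inside a found box and searching again; that is left out here.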
SBNet: Sparse Blocks Network for Fast Inference
Conventional deep convolutional neural networks (CNNs) apply convolution
operators uniformly in space across all feature maps for hundreds of layers,
which incurs a high computational cost for real-time applications. For many
problems such as object detection and semantic segmentation, we are able to
obtain a low-cost computation mask, either from a priori problem knowledge, or
from a low-resolution segmentation network. We show that such computation masks
can be used to reduce computation in the high-resolution main network. Variants
of sparse activation CNNs have previously been explored on small-scale tasks
and showed no degradation in terms of object classification accuracy, but often
measured gains in terms of theoretical FLOPs without realizing a practical
speed-up when compared to highly optimized dense convolution implementations.
In this work, we leverage the sparsity structure of computation masks and
propose a novel tiling-based sparse convolution algorithm. We verified the
effectiveness of our sparse CNN on LiDAR-based 3D object detection, and we
report significant wall-clock speed-ups compared to dense convolution without
noticeable loss of accuracy.
Comment: 10 pages, CVPR 201
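The idea of using a computation mask to skip work can be sketched on a single-channel 2D input. The code below divides the mask into fixed-size tiles, keeps only tiles with at least one active cell, and evaluates a 3x3 convolution only inside those tiles. It is a simplified illustration, not the algorithm proposed in the paper: the tile size, the function names, and the choice to leave inactive outputs at zero are assumptions, and any batching of tiles for an accelerator is omitted.

```python
def conv3x3_at(image, kernel, r, c):
    """3x3 correlation at (r, c) with zero padding outside the image."""
    h, w = len(image), len(image[0])
    acc = 0.0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if 0 <= rr < h and 0 <= cc < w:
                acc += kernel[dr + 1][dc + 1] * image[rr][cc]
    return acc


def active_tiles(mask, tile):
    """Origins (row, col) of tiles containing at least one nonzero mask cell."""
    h, w = len(mask), len(mask[0])
    tiles = []
    for r0 in range(0, h, tile):
        for c0 in range(0, w, tile):
            rows = range(r0, min(r0 + tile, h))
            cols = range(c0, min(c0 + tile, w))
            if any(mask[r][c] for r in rows for c in cols):
                tiles.append((r0, c0))
    return tiles


def sparse_conv(image, mask, kernel, tile=2):
    """Convolve only inside active tiles; outputs elsewhere stay 0.0."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    tiles = active_tiles(mask, tile)
    for r0, c0 in tiles:
        for r in range(r0, min(r0 + tile, h)):
            for c in range(c0, min(c0 + tile, w)):
                out[r][c] = conv3x3_at(image, kernel, r, c)
    return out, len(tiles)


image = [[1.0] * 4 for _ in range(4)]
kernel = [[1.0] * 3 for _ in range(3)]  # box filter: sums the 3x3 neighborhood
mask = [
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
out, n_active = sparse_conv(image, mask, kernel)
print(n_active, "of 4 tiles computed")
```

Here only one of the four 2x2 tiles is evaluated, so three quarters of the output positions are skipped. Whether such savings turn into wall-clock speed-ups depends on the implementation, which is the point the abstract makes.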