Compressive Mining: Fast and Optimal Data Mining in the Compressed Domain
Real-world data typically contain repeated and periodic patterns. This
suggests that they can be effectively represented and compressed using only a
few coefficients of an appropriate basis (e.g., Fourier, Wavelets, etc.).
However, distance estimation when the data are represented using different sets
of coefficients is still a largely unexplored area. This work studies the
optimization problems related to obtaining the tightest lower/upper
bound on Euclidean distances when each data object is potentially compressed
using a different set of orthonormal coefficients. Our technique leads to
tighter distance estimates, which translates into more accurate search,
learning and mining operations directly in the compressed domain.
We formulate the problem of estimating lower/upper distance bounds as an
optimization problem. We establish the properties of optimal solutions, and
leverage the theoretical analysis to develop a fast algorithm to obtain an
exact solution to the problem. The suggested solution provides the
tightest estimation of the L2-norm or the correlation. We show that typical
data-analysis operations, such as k-NN search or k-Means clustering, can
operate more accurately using the proposed compression and distance
reconstruction technique. We compare it with many other prevalent compression
and reconstruction techniques, including random projections and PCA-based
techniques. We highlight a surprising result, namely that when the data are
highly sparse in some basis, our technique may even outperform PCA-based
compression.
The contributions of this work are generic as our methodology is applicable
to any sequential or high-dimensional data as well as to any orthogonal data
transformation used for the underlying data compression scheme.
Comment: 25 pages, 20 figures, accepted in VLD
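As a rough illustration of what a lower/upper distance bound in the compressed domain looks like, the sketch below covers only the simplest case, which is an assumption made here and not the setting of the paper: both objects keep the same coefficient positions of an orthonormal DFT, plus the total energy of the discarded coefficients. The bounds then follow from the triangle inequality. It is not the optimal algorithm described in the abstract, which handles a different coefficient set per object. The function names `dft_ortho`, `compress` and `distance_bounds` are hypothetical.

```python
import cmath
import math

def dft_ortho(x):
    """Orthonormal DFT (scaled by 1/sqrt(n)), so Euclidean distances are preserved."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)) / math.sqrt(n)
        for k in range(n)
    ]

def compress(x, keep):
    """Keep the coefficients at positions `keep` and the energy of all the others."""
    X = dft_ortho(x)
    keep_sorted = sorted(set(keep))
    keep_set = set(keep_sorted)
    kept = [X[k] for k in keep_sorted]
    residual_energy = sum(abs(X[k]) ** 2 for k in range(len(X)) if k not in keep_set)
    return kept, residual_energy

def distance_bounds(cx, cq):
    """Lower/upper bounds on ||x - q||, assuming both used the SAME `keep` positions."""
    (Xk, ex), (Qk, eq) = cx, cq
    # Exact contribution of the coefficients both objects stored.
    known = sum(abs(a - b) ** 2 for a, b in zip(Xk, Qk))
    # The discarded parts have norms sqrt(ex) and sqrt(eq); the triangle
    # inequality bounds the norm of their difference from both sides.
    lower = math.sqrt(known + (math.sqrt(ex) - math.sqrt(eq)) ** 2)
    upper = math.sqrt(known + (math.sqrt(ex) + math.sqrt(eq)) ** 2)
    return lower, upper

x = [math.sin(0.3 * t) + 0.1 * t for t in range(32)]
q = [math.cos(0.2 * t) for t in range(32)]
keep = range(6)  # hypothetical choice: the six lowest-frequency positions
lo, hi = distance_bounds(compress(x, keep), compress(q, keep))
true_dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, q)))
print(lo, true_dist, hi)
```

The true distance always lies between the two printed bounds; keeping more coefficients narrows the gap, and keeping all of them makes both bounds equal to the true distance.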
Pruned Bit-Reversal Permutations: Mathematical Characterization, Fast Algorithms and Architectures
A mathematical characterization of serially-pruned permutations (SPPs)
employed in variable-length permuters and their associated fast pruning
algorithms and architectures are proposed. Permuters are used in many signal
processing systems for shuffling data and in communication systems as an
adjunct to coding for error correction. Typically only a small set of discrete
permuter lengths are supported. Serial pruning is a simple technique to alter
the length of a permutation to support a wider range of lengths, but results in
a serial processing bottleneck. In this paper, parallelizing SPPs is formulated
in terms of recursively computing sums involving integer floor and related
functions using integer operations, in a fashion analogous to evaluating
Dedekind sums. A mathematical treatment for bit-reversal permutations (BRPs) is
presented, and closed-form expressions for BRP statistics are derived. It is
shown that BRP sequences have weak correlation properties. A new statistic
called permutation inliers that characterizes the pruning gap of pruned
interleavers is proposed. Using this statistic, a recursive algorithm that
computes the minimum inliers count of a pruned BR interleaver (PBRI) in
logarithmic time complexity is presented. This algorithm enables parallelizing
a serial PBRI algorithm by any desired parallelism factor by computing the
pruning gap in lookahead rather than a serial fashion, resulting in significant
reduction in interleaving latency and memory overhead. Extensions to 2-D block
and stream interleavers, as well as applications to pruned fast Fourier
transforms and LTE turbo interleavers, are also presented. Moreover,
hardware-efficient architectures for the proposed algorithms are developed.
Simulation results demonstrate 3 to 4 orders of magnitude improvement in
interleaving time compared to existing approaches.
Comment: 31 page
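The serial pruning baseline that the abstract sets out to parallelize can be sketched as follows. This is a minimal illustration of the textbook construction, not the paper's lookahead algorithm: walk through all indices of the next power-of-two length, bit-reverse each one, and discard any result that falls outside the desired length. The function names are hypothetical.

```python
def bit_reverse(i, n_bits):
    """Reverse the lowest n_bits bits of the integer i."""
    r = 0
    for _ in range(n_bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def pruned_bit_reversal(length):
    """Serially pruned bit-reversal permutation of range(length)."""
    # Smallest bit width whose power of two covers `length` (at least 1 bit).
    n_bits = max(1, (length - 1).bit_length())
    out = []
    for i in range(1 << n_bits):
        j = bit_reverse(i, n_bits)
        if j < length:  # addresses >= length are pruned
            out.append(j)
    return out

print(pruned_bit_reversal(5))  # [0, 4, 2, 1, 3]
```

In this serial form, finding the m-th output requires testing every earlier index to count how many were pruned, which is the bottleneck the abstract refers to.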
Object Detection in 20 Years: A Survey
Object detection, as one of the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of the cold weapon era. This paper extensively reviews 400+
papers of object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed-up techniques, and the recent state-of-the-art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc., and makes an in-depth
analysis of their challenges as well as technical improvements in recent years.
Comment: This work has been submitted to the IEEE TPAMI for possible
publicatio