Packing and Padding: Coupled Multi-index for Accurate Image Retrieval
In Bag-of-Words (BoW) based image retrieval, the SIFT visual word has a low
discriminative power, so false positive matches occur prevalently. Apart from
the information loss during quantization, another cause is that the SIFT
feature only describes the local gradient distribution. To address this
problem, this paper proposes a coupled Multi-Index (c-MI) framework to perform
feature fusion at indexing level. Basically, complementary features are coupled
into a multi-dimensional inverted index. Each dimension of c-MI corresponds to
one kind of feature, and the retrieval process votes for images similar in both
SIFT and other feature spaces. Specifically, we exploit the fusion of local
color feature into c-MI. While the precision of visual match is greatly
enhanced, we adopt Multiple Assignment to improve recall. The joint cooperation
of SIFT and color features significantly reduces the impact of false positive
matches.
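
As a rough illustration of the indexing idea described above, the following minimal sketch builds a two-dimensional inverted index whose entries are keyed by a (SIFT word, color word) pair, so a database feature only votes when it matches the query in both spaces. The class name `CoupledMultiIndex`, the unit vote per match, and the choice to apply Multiple Assignment to the query's color word only are assumptions made for this example; they are not taken from the paper.

```python
from collections import defaultdict


class CoupledMultiIndex:
    """Toy 2-D inverted index keyed by (sift_word, color_word).

    Hypothetical sketch: the real c-MI weighting scheme is not given in the
    abstract, so each match simply contributes one vote.
    """

    def __init__(self):
        # (sift_word, color_word) -> list of image ids containing that pair
        self.entries = defaultdict(list)

    def add(self, image_id, sift_word, color_word):
        """Index one local feature of a database image."""
        self.entries[(sift_word, color_word)].append(image_id)

    def query(self, features):
        """Vote for images that match in both feature spaces.

        features: iterable of (sift_word, color_words), where color_words
        lists the candidate color words of the query feature (Multiple
        Assignment, assumed here to be applied on the query side only).
        """
        votes = defaultdict(int)
        for sift_word, color_words in features:
            for color_word in color_words:
                # .get avoids creating empty entries for unseen pairs
                for image_id in self.entries.get((sift_word, color_word), ()):
                    votes[image_id] += 1
        # Highest vote count first; ties broken by image id
        return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))


index = CoupledMultiIndex()
index.add("a", sift_word=1, color_word=10)
index.add("b", sift_word=1, color_word=20)
index.add("c", sift_word=2, color_word=10)

strict = index.query([(1, [10])])      # one color word: higher precision
relaxed = index.query([(1, [10, 20])])  # two color words: higher recall
```

With a single color word, only image "a" is returned, since "b" shares the SIFT word but not the color word. Assigning the query feature to two color words also recovers "b", which is the precision and recall trade-off the abstract attributes to Multiple Assignment.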
Extensive experiments on several benchmark datasets demonstrate that c-MI
improves the retrieval accuracy significantly, while consuming only half of the
query time compared to the baseline. Importantly, we show that c-MI
complements many prior techniques well. Assembling these methods, we have
obtained an mAP of 85.8% and N-S score of 3.85 on Holidays and Ukbench
datasets, respectively, which compare favorably with the state of the art.
Comment: 8 pages, 7 figures, 6 tables. Accepted to CVPR 201
Learning midlevel image features for natural scene and texture classification
This paper deals with the coding of natural scenes in order to extract semantic information. We present a new scheme to project natural scenes onto a basis in which each dimension encodes statistically independent information. Basis extraction is performed by independent component analysis (ICA) applied to image patches culled from natural scenes. The study of the resulting coding units (coding filters) extracted from well-chosen categories of images shows that they adapt and respond selectively to discriminant features in natural scenes. Given this basis, we define global and local image signatures relying on the maximal activity of filters on the input image. Locally, the construction of the signature takes into account the spatial distribution of the maximal responses within the image. We propose a criterion to reduce the size of the representation space for faster computation. The proposed approach is tested in the context of texture classification (111 classes), as well as natural scene classification (11 categories, 2037 images). Under a common protocol, other commonly used descriptors reach at most 47.7% accuracy on average, while our method reaches up to 63.8%. We show that this advantage does not depend on the size of the signature and demonstrate the efficiency of the proposed criterion to select ICA filters and reduce the dimensio
VITALAS at TRECVID-2008
In this paper, we present our experiments on the High-Level Feature Extraction task of TRECVID 2008. This is our first year participating in TRECVID, and our system adopts several popular approaches previously proposed by other groups. We propose two advanced low-level features: a new Gabor texture descriptor and the Compact-SIFT codeword histogram. Our system uses the well-known LIBSVM library to train the SVM classifiers that serve as basic classifiers. In the fusion step, several methods were employed, such as voting, SVM-based fusion, HCRF, and Bootstrap Average AdaBoost (BAAB)
FLASH: Randomized Algorithms Accelerated over CPU-GPU for Ultra-High Dimensional Similarity Search
We present FLASH (Fast LSH Algorithm for
Similarity search accelerated with HPC), a similarity search
system for ultra-high dimensional datasets on a single machine, that does not
require similarity computations and is tailored for high-performance computing
platforms. By leveraging an LSH-style randomized indexing procedure and
combining it with several principled techniques, such as reservoir sampling,
recent advances in one-pass minwise hashing, and count-based estimations, we
reduce the computational and parallelization costs of similarity search, while
retaining sound theoretical guarantees.
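
The combination of techniques named above can be sketched in a few lines. The example below is a simplified illustration, not the FLASH implementation: it substitutes classic minwise hashing (one hash pass per hash function) for the one-pass variant, keeps a fixed-size reservoir sample in each hash bucket, and ranks candidates by how many tables they collide in, so no similarity is ever computed. All names and parameters (`ReservoirLSH`, `num_tables`, `hashes_per_table`, `reservoir_size`) are assumptions made for this sketch.

```python
import random
from collections import Counter

PRIME = (1 << 61) - 1  # Mersenne prime used as the modulus for universal hashing


def make_hashes(num_hashes, seed=0):
    """Draw (a, b) pairs for hash functions h(x) = (a * x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(PRIME)) for _ in range(num_hashes)]


def minhash_signature(items, hashes):
    """Classic minwise hashing of a non-empty set of non-negative integers."""
    return tuple(min((a * x + b) % PRIME for x in items) for a, b in hashes)


class ReservoirLSH:
    """Hypothetical LSH index with fixed-size, reservoir-sampled buckets."""

    def __init__(self, num_tables=8, hashes_per_table=2, reservoir_size=4, seed=0):
        self.k = hashes_per_table
        self.hashes = make_hashes(num_tables * hashes_per_table, seed)
        self.tables = [dict() for _ in range(num_tables)]
        self.reservoir_size = reservoir_size
        self.rng = random.Random(seed + 1)

    def _keys(self, items):
        # Split one long signature into one k-tuple key per table
        sig = minhash_signature(items, self.hashes)
        return [sig[i * self.k:(i + 1) * self.k] for i in range(len(self.tables))]

    def insert(self, item_id, items):
        for table, key in zip(self.tables, self._keys(items)):
            bucket = table.setdefault(key, {"seen": 0, "ids": []})
            bucket["seen"] += 1
            if len(bucket["ids"]) < self.reservoir_size:
                bucket["ids"].append(item_id)
            else:
                # Reservoir sampling: keep each arrival with prob. size / seen
                j = self.rng.randrange(bucket["seen"])
                if j < self.reservoir_size:
                    bucket["ids"][j] = item_id

    def query(self, items, top_k=3):
        """Rank candidates by collision count across tables."""
        counts = Counter()
        for table, key in zip(self.tables, self._keys(items)):
            bucket = table.get(key)
            if bucket:
                counts.update(bucket["ids"])
        return [item_id for item_id, _ in counts.most_common(top_k)]


lsh = ReservoirLSH()
lsh.insert("doc1", {1, 2, 3, 4})
lsh.insert("doc2", {100, 200, 300})
nearest = lsh.query({1, 2, 3, 4})
```

Capping each bucket at `reservoir_size` entries bounds memory per table regardless of how skewed the data is, and counting collisions stands in for an explicit similarity computation. Both choices mirror the ideas listed in the abstract, but the constants here are arbitrary.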
We evaluate FLASH on several real, high-dimensional datasets from different
domains, including text, malicious URL, click-through prediction, social
networks, etc. Our experiments shed new light on the difficulties associated
with datasets having several million dimensions. Current state-of-the-art
implementations either fail on the presented scale or are orders of magnitude
slower than FLASH. FLASH is capable of computing an approximate k-NN graph,
from scratch, over the full webspam dataset (1.3 billion nonzeros) in less than
10 seconds. Computing a full k-NN graph in less than 10 seconds on the webspam
dataset, using brute-force (), will require at least 20 teraflops. We
provide CPU and GPU implementations of FLASH for replicability of our results
- …