194 research outputs found

    Packing and Padding: Coupled Multi-index for Accurate Image Retrieval

    In Bag-of-Words (BoW) based image retrieval, the SIFT visual word has low discriminative power, so false positive matches are prevalent. Apart from the information loss during quantization, another cause is that the SIFT feature describes only the local gradient distribution. To address this problem, this paper proposes a coupled Multi-Index (c-MI) framework that performs feature fusion at the indexing level. Complementary features are coupled into a multi-dimensional inverted index: each dimension of c-MI corresponds to one kind of feature, and the retrieval process votes for images that are similar in both the SIFT and the other feature spaces. Specifically, we fuse a local color feature into c-MI. While this greatly enhances the precision of visual matching, we adopt Multiple Assignment to improve recall. The joint cooperation of SIFT and color features significantly reduces the impact of false positive matches. Extensive experiments on several benchmark datasets demonstrate that c-MI improves retrieval accuracy significantly while consuming only half the query time of the baseline. Importantly, we show that c-MI complements many prior techniques well. Assembling these methods, we obtain an mAP of 85.8% on Holidays and an N-S score of 3.85 on Ukbench, which compare favorably with the state of the art.
    Comment: 8 pages, 7 figures, 6 tables. Accepted to CVPR 201
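The coupled multi-index the abstract describes can be pictured as an inverted index keyed by pairs of quantized codes, so a query votes only for database images that agree in both feature spaces. Below is a minimal sketch of that voting idea, assuming features arrive pre-quantized into (SIFT word, color word) pairs; the class name and toy data are illustrative, not the authors' implementation.

```python
from collections import defaultdict

class CoupledMultiIndex:
    """Toy 2-D inverted index: entries are keyed by (sift_word, color_word)."""

    def __init__(self):
        # (sift_word, color_word) -> list of image ids containing that pair
        self.index = defaultdict(list)

    def add(self, image_id, features):
        for sift_word, color_word in features:
            self.index[(sift_word, color_word)].append(image_id)

    def query(self, features):
        # an image scores only when BOTH codes match, which is what
        # suppresses SIFT-only false positives
        votes = defaultdict(int)
        for sift_word, color_word in features:
            for image_id in self.index[(sift_word, color_word)]:
                votes[image_id] += 1
        return sorted(votes, key=votes.get, reverse=True)

# toy example: image 1 matches the query in both spaces; image 2 shares
# only the SIFT word (different color word), so it receives no vote
idx = CoupledMultiIndex()
idx.add(1, [(10, 3), (42, 7)])
idx.add(2, [(10, 9)])
print(idx.query([(10, 3)]))  # [1]
```

Multiple Assignment, as mentioned in the abstract, would map each query feature to several nearby words per dimension before voting, trading some precision back for recall.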

    Design, implementation, and evaluation of scalable content-based image retrieval techniques.

    Wong, Yuk Man. Thesis (M.Phil.)--Chinese University of Hong Kong, 2007. Includes bibliographical references (leaves 95-100). Abstracts in English and Chinese.

    Abstract
    Acknowledgement
    Chapter 1: Introduction
        1.1 Overview
        1.2 Contribution
        1.3 Organization of This Work
    Chapter 2: Literature Review
        2.1 Content-based Image Retrieval
            2.1.1 Query Technique
            2.1.2 Relevance Feedback
            2.1.3 Previously Proposed CBIR Systems
        2.2 Invariant Local Feature
        2.3 Invariant Local Feature Detector
            2.3.1 Harris Corner Detector
            2.3.2 DoG Extrema Detector
            2.3.3 Harris-Laplacian Corner Detector
            2.3.4 Harris-Affine Covariant Detector
        2.4 Invariant Local Feature Descriptor
            2.4.1 Scale Invariant Feature Transform (SIFT)
            2.4.2 Shape Context
            2.4.3 PCA-SIFT
            2.4.4 Gradient Location and Orientation Histogram (GLOH)
            2.4.5 Geodesic-Intensity Histogram (GIH)
            2.4.6 Experiment
        2.5 Feature Matching
            2.5.1 Matching Criteria
            2.5.2 Distance Measures
            2.5.3 Searching Techniques
    Chapter 3: A Distributed Scheme for Large-Scale CBIR
        3.1 Overview
        3.2 Related Work
        3.3 Scalable Content-Based Image Retrieval Scheme
            3.3.1 Overview of Our Solution
            3.3.2 Locality-Sensitive Hashing
            3.3.3 Scalable Indexing Solutions
            3.3.4 Disk-Based Multi-Partition Indexing
            3.3.5 Parallel Multi-Partition Indexing
        3.4 Feature Representation
        3.5 Empirical Evaluation
            3.5.1 Experimental Testbed
            3.5.2 Performance Evaluation Metrics
            3.5.3 Experimental Setup
            3.5.4 Experiment I: Disk-Based Multi-Partition Indexing Approach
            3.5.5 Experiment II: Parallel-Based Multi-Partition Indexing Approach
        3.6 Application to WWW Image Retrieval
        3.7 Summary
    Chapter 4: Image Retrieval System for IND Detection
        4.1 Overview
            4.1.1 Motivation
            4.1.2 Related Work
            4.1.3 Objective
            4.1.4 Contribution
        4.2 Database Construction
            4.2.1 Image Representations
            4.2.2 Index Construction
            4.2.3 Keypoint and Image Lookup Tables
        4.3 Database Query
            4.3.1 Matching Strategies
            4.3.2 Verification Processes
            4.3.3 Image Voting
        4.4 Performance Evaluation
            4.4.1 Evaluation Metrics
            4.4.2 Results
            4.4.3 Summary
    Chapter 5: Shape-SIFT Feature Descriptor
        5.1 Overview
        5.2 Related Work
        5.3 SHAPE-SIFT Descriptors
            5.3.1 Orientation Assignment
            5.3.2 Canonical Orientation Determination
            5.3.3 Keypoint Descriptor
        5.4 Performance Evaluation
        5.5 Summary
    Chapter 6: Conclusions and Future Work
        6.1 Conclusions
        6.2 Future Work
    Appendix A: Publication
    Bibliography
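The thesis outline names Locality-Sensitive Hashing as the basis of its scalable indexing scheme. As a rough illustration of how LSH buckets feature vectors so a query probes only a small partition of the database, here is a minimal random-hyperplane sketch; it is a generic textbook construction under assumed parameters, not the thesis's actual disk-based or parallel implementation.

```python
import random

def make_hash(dim, bits, rng):
    """Build a random-hyperplane hash: each bit is the sign of a dot product."""
    planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(bits)]
    def h(vec):
        return tuple(int(sum(p * x for p, x in zip(plane, vec)) >= 0)
                     for plane in planes)
    return h

class LSHIndex:
    """Toy single-table LSH index over dense feature vectors."""

    def __init__(self, dim, bits=8, seed=0):
        self.h = make_hash(dim, bits, random.Random(seed))
        self.buckets = {}  # hash code -> list of (key, vector)

    def add(self, key, vec):
        self.buckets.setdefault(self.h(vec), []).append((key, vec))

    def query(self, vec):
        # candidates are only the items sharing the query's bucket;
        # a real system would rerank them by exact distance
        return [k for k, _ in self.buckets.get(self.h(vec), [])]
```

A production scheme, as the outline suggests, would keep multiple hash tables and spread buckets across disk partitions or worker machines, querying them in parallel.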

    Real-time near replica detection over massive streams of shared photos

    This work addresses the real-time detection of image replicas in distributed environments, based on the indexing of local feature vectors.

    Registration and categorization of camera captured documents

    Camera-captured document image analysis concerns the processing of documents captured with hand-held sensors, smartphones, or other capturing devices using advanced image processing, computer vision, pattern recognition, and machine learning techniques. Because real-world capture is unconstrained, the captured documents suffer from illumination variation, viewpoint variation, highly variable scale/resolution, background clutter, occlusion, and non-rigid deformations such as folds and crumples. Document registration is the problem of registering the image of a template document, whose layout is known, with a test document image. The literature on camera-captured document mosaicing has addressed registration under the assumption of a considerable amount of single-chunk overlapping content. Those methods cannot be applied directly to the registration of forms, bills, and other commercial documents, where the fixed content is distributed in tiny portions across the document. Most existing document image registration methods, on the other hand, work with scanned documents under affine transformation. The literature on document image retrieval has addressed categorization of documents based on text, figures, etc.; however, the scalability of existing logo-based categorization methodologies is very limited. This dissertation focuses on two problems: (i) registration of captured documents where the overlapping content is distributed in tiny portions across the documents, and (ii) categorization of captured documents into predefined logo classes that scales to large datasets using local invariant features. A novel methodology is proposed for the registration of user-defined Regions Of Interest (ROI) using corresponding local features from their neighborhood.
The methodology enhances prior point-pattern-based registration approaches, such as RANdom SAmple Consensus (RANSAC) and Thin Plate Spline-Robust Point Matching (TPS-RPM), to enable registration of cell-phone and camera-captured documents under non-rigid transformations. Three novel aspects are embedded in the methodology: (i) histogram-based, uniformly transformed correspondence estimation; (ii) clustering of points located near the ROI to select only nearby regions for matching; and (iii) validation of the registration in the RANSAC and TPS-RPM algorithms. Experimental results on a dataset of 480 images captured using an iPhone 3GS and a Logitech Webcam Pro 9000 showed an average registration accuracy of 92.75% using the Scale Invariant Feature Transform (SIFT). Robust local features for logo identification are determined empirically by comparison among SIFT, Speeded-Up Robust Features (SURF), Hessian-Affine, Harris-Affine, and Maximally Stable Extremal Regions (MSER). Two matching methods are presented for categorization: matching all features extracted from the query document as a single set, and segment-wise matching of query document features using a segmentation obtained by grouping the area under intersecting dense local affine covariant regions. The latter approach not only gives an approximate location of the predicted logo classes in the query document but also helps to increase prediction accuracy. To facilitate scalability to large datasets, inverted indexing of logo-class features is incorporated in both approaches. Experimental results on a dataset of real camera-captured documents showed a peak 13.25% increase in F-measure accuracy for the latter approach compared to the former.
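The inverted indexing of logo-class features that the abstract mentions can be reduced to a posting list from quantized descriptors to the logo classes containing them, with a query document voting per class. The sketch below illustrates that lookup-and-vote pattern under assumed inputs (features already quantized to integer visual words); the class name and data are hypothetical, not the dissertation's system.

```python
from collections import defaultdict

class LogoIndex:
    """Toy inverted index from visual words to logo classes."""

    def __init__(self):
        self.postings = defaultdict(set)  # visual word -> logo classes

    def add_class(self, logo_class, words):
        # index every quantized descriptor extracted from the class's logos
        for w in words:
            self.postings[w].add(logo_class)

    def categorize(self, query_words):
        # each distinct query word votes for all classes it appears in;
        # the top-voted class is the predicted logo category
        votes = defaultdict(int)
        for w in set(query_words):
            for c in self.postings.get(w, ()):
                votes[c] += 1
        return max(votes, key=votes.get) if votes else None

idx = LogoIndex()
idx.add_class('logoA', [1, 2, 3])
idx.add_class('logoB', [3, 4])
print(idx.categorize([1, 2, 3]))  # 'logoA'
```

The segment-wise variant described in the abstract would run this vote once per segmented region of the query document instead of once over the whole feature set, which is what localizes the predicted logo.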