
    Real-Time Feature Descriptor Matching via a Multi-Resolution Exhaustive Search Method

    Feature descriptor matching plays an important role in many computer vision applications. This paper presents a novel fast linear exhaustive search algorithm combined with a multi-resolution candidate elimination technique to address this problem efficiently. The proposed algorithm is inspired by existing multi-resolution image retrieval approaches, but relaxes their requirement of a norm-sorted database with pre-computed multi-resolution tables, which broadens the method's applicability. Moreover, the candidate elimination computations are performed entirely with a simple L1 distance metric, which speeds up the entire search process without loss of accuracy. The result is an accurate feature descriptor matching algorithm with real-time performance, validated in the experiments by matching SURF descriptors.
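The candidate-elimination idea described above can be sketched as follows: summing adjacent groups of descriptor dimensions gives a coarse descriptor, and by the triangle inequality the L1 distance between coarse descriptors lower-bounds the full L1 distance, so candidates can be pruned before the full distance is computed. This is an illustrative sketch under assumed parameters (64-dimensional descriptors as in SURF, a group size of 8), not the authors' implementation:

```python
import numpy as np

def coarse(desc, factor=8):
    # Sum adjacent groups of dimensions; per group,
    # |sum(a) - sum(b)| <= sum|a - b|, so the coarse L1 distance
    # is a valid lower bound on the full L1 distance.
    return desc.reshape(-1, factor).sum(axis=1)

def match_with_elimination(query, database, factor=8):
    """Exhaustive nearest-neighbour search with coarse-level pruning."""
    q_coarse = coarse(query, factor)
    best_dist, best_idx = np.inf, -1
    for i, d in enumerate(database):
        # Cheap lower bound first: if it already exceeds the best
        # distance found so far, skip the full computation.
        lb = np.abs(q_coarse - coarse(d, factor)).sum()
        if lb >= best_dist:
            continue
        dist = np.abs(query - d).sum()
        if dist < best_dist:
            best_dist, best_idx = dist, i
    return best_idx, best_dist
```

Because the bound never underestimates a candidate as closer than it is, pruning cannot change the result, only skip work.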

    Video matching using DC-image and local features

    This paper presents a proposed framework for video matching based on local features extracted from the DC-image of MPEG compressed videos, without decompression. The relevant arguments and supporting evidence are discussed for developing video similarity techniques that work directly on compressed videos, without decompression, especially when utilising small-size images. Two experiments are carried out to support the above. The first compares the DC-image with the full I-frame in terms of matching performance and the corresponding computational complexity. The second compares local features against global features in video matching, especially in the compressed domain and with small-size images. The results confirm that the use of the DC-image, despite its highly reduced size, is promising, as it produces at least similar (if not better) matching precision compared to the full I-frame. Also, using SIFT as a local feature outperforms the precision of most of the standard global features. On the other hand, its computational complexity is relatively higher, but still within the real-time margin, and various optimisations could further reduce it.
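For context, the DC-image is the thumbnail formed by the DC coefficient of each 8x8 DCT block, which (up to a constant scale) equals the block mean; in the compressed domain these coefficients are read directly from the stream without full decoding. The sketch below reproduces the equivalent computation on an already-decoded greyscale frame, purely to illustrate what the DC-image contains; it is not the paper's compressed-domain extraction:

```python
import numpy as np

def dc_image(frame):
    # The DC coefficient of an 8x8 DCT block is proportional to the
    # block mean (DC = 8 * mean for the standard JPEG/MPEG DCT), so
    # the DC-image is equivalent to averaging each 8x8 block,
    # yielding a thumbnail with 1/64 of the original area.
    h, w = frame.shape
    h8, w8 = h // 8, w // 8
    blocks = frame[:h8 * 8, :w8 * 8].reshape(h8, 8, w8, 8)
    return blocks.mean(axis=(1, 3))
```

The attraction, as the abstract argues, is that this tiny image is available almost for free from the compressed stream, yet retains enough structure for feature extraction.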

    CBCD Based on Color Features and Landmark MDS-Assisted Distance Estimation

    Content-Based Copy Detection (CBCD) of digital videos is an important research field that aims at the identification of modified copies of an original clip, e.g., on the Internet. In this application, the video content is uniquely identified by the content itself, by extracting compact features that are robust to a certain set of video transformations. Given the huge amount of data present in online video databases, the computational complexity of feature extraction and comparison is a very important issue. In this paper, a landmark-based multi-dimensional scaling technique is proposed to speed up the detection procedure, which is based on exhaustive search and the MPEG-7 Dominant Color Descriptor. The method is evaluated under the MPEG Video Signature Core Experiment conditions, and simulation results show impressive time savings at the cost of a slightly reduced detection performance.
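The general landmark MDS recipe (in the de Silva-Tenenbaum formulation, shown here as an assumed baseline rather than the paper's exact procedure) is: embed a small set of landmark items with classical MDS, triangulate every other item from its distances to the landmarks only, and then estimate pairwise distances cheaply in the low-dimensional embedding instead of comparing full descriptors:

```python
import numpy as np

def landmark_mds(D_landmarks, k):
    """Classical MDS on the m x m landmark distance matrix."""
    m = D_landmarks.shape[0]
    D2 = D_landmarks ** 2
    J = np.eye(m) - np.ones((m, m)) / m          # centering matrix
    B = -0.5 * J @ D2 @ J                        # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:k]           # top-k eigenpairs
    vals, vecs = vals[order], vecs[:, order]
    coords = vecs * np.sqrt(vals)                # landmark coordinates, m x k
    pseudo = vecs / np.sqrt(vals)                # used to triangulate new points
    mean_col = D2.mean(axis=0)
    return coords, pseudo, mean_col

def embed(d_to_landmarks, pseudo, mean_col):
    # Triangulate a point from its squared distances to the landmarks.
    return -0.5 * pseudo.T @ (d_to_landmarks ** 2 - mean_col)
```

Once every item is embedded, a distance estimate between two items is just a k-dimensional Euclidean distance, which is where the reported time savings come from.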

    Particular object retrieval with integral max-pooling of CNN activations

    Recently, image representations built upon Convolutional Neural Networks (CNNs) have been shown to provide effective descriptors for image search, outperforming pre-CNN features as short-vector representations. Yet such models are not compatible with geometry-aware re-ranking methods and are still outperformed, on some particular-object retrieval benchmarks, by traditional image search systems relying on precise descriptor matching, geometric re-ranking, or query expansion. This work revisits both retrieval stages, namely initial search and re-ranking, by employing the same primitive information derived from the CNN. We build compact feature vectors that encode several image regions without the need to feed multiple inputs to the network. Furthermore, we extend integral images to handle max-pooling on convolutional layer activations, allowing us to efficiently localize matching objects. The resulting bounding box is finally used for image re-ranking. As a result, this paper significantly improves the existing CNN-based recognition pipeline: we report for the first time results competing with traditional methods on the challenging Oxford5k and Paris6k datasets.
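Integral images natively give O(1) box *sums*, not maxima. One common way to make max-pooling integral-image-compatible, assumed here as an illustration of the idea rather than the paper's exact construction, is the generalized-mean approximation: for non-negative (post-ReLU) activations, (sum of x^alpha)^(1/alpha) approaches the maximum as alpha grows, and the sum is a single integral-image lookup over the activation map raised to the power alpha:

```python
import numpy as np

def integral_image(A):
    # Summed-area table with a zero border, so any box sum is 4 lookups.
    I = np.zeros((A.shape[0] + 1, A.shape[1] + 1))
    I[1:, 1:] = A.cumsum(axis=0).cumsum(axis=1)
    return I

def box_sum(I, r0, c0, r1, c1):
    # Sum of A[r0:r1, c0:c1] in constant time.
    return I[r1, c1] - I[r0, c1] - I[r1, c0] + I[r0, c0]

def approx_max_pool(A, r0, c0, r1, c1, alpha=10.0):
    # Generalized-mean trick: (sum x**alpha)**(1/alpha) -> max(x) as
    # alpha -> inf, for non-negative x. The sum comes from an integral
    # image of A**alpha, so pooling over many candidate boxes (e.g. when
    # localizing a matching object) costs O(1) per box after one pass.
    I = integral_image(A ** alpha)
    return box_sum(I, r0, c0, r1, c1) ** (1.0 / alpha)
```

This is what makes exhaustive bounding-box scoring over a convolutional activation map affordable: the expensive part is done once, and each candidate region is evaluated in constant time.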