1,241,959 research outputs found

    Video matching using DC-image and local features

    Get PDF
    This paper presents a suggested framework for video matching based on local features extracted from the DCimage of MPEG compressed videos, without decompression. The relevant arguments and supporting evidences are discussed for developing video similarity techniques that works directly on compressed videos, without decompression, and especially utilising small size images. Two experiments are carried to support the above. The first is comparing between the DC-image and I-frame, in terms of matching performance and the corresponding computation complexity. The second experiment compares between using local features and global features in video matching, especially in the compressed domain and with the small size images. The results confirmed that the use of DC-image, despite its highly reduced size, is promising as it produces at least similar (if not better) matching precision, compared to the full I-frame. Also, using SIFT, as a local feature, outperforms precision of most of the standard global features. On the other hand, its computation complexity is relatively higher, but it is still within the realtime margin. There are also various optimisations that can be done to improve this computation complexity

    Hybrid image representation methods for automatic image annotation: a survey

    Get PDF
    In most automatic image annotation systems, images are represented with low level features using either global methods or local methods. In global methods, the entire image is used as a unit. Local methods divide images into blocks where fixed-size sub-image blocks are adopted as sub-units; or into regions by using segmented regions as sub-units in images. In contrast to typical automatic image annotation methods that use either global or local features exclusively, several recent methods have considered incorporating the two kinds of information, and believe that the combination of the two levels of features is beneficial in annotating images. In this paper, we provide a survey on automatic image annotation techniques according to one aspect: feature extraction, and, in order to complement existing surveys in literature, we focus on the emerging image annotation methods: hybrid methods that combine both global and local features for image representation

    Aggregated Deep Local Features for Remote Sensing Image Retrieval

    Get PDF
    Remote Sensing Image Retrieval remains a challenging topic due to the special nature of Remote Sensing Imagery. Such images contain various different semantic objects, which clearly complicates the retrieval task. In this paper, we present an image retrieval pipeline that uses attentive, local convolutional features and aggregates them using the Vector of Locally Aggregated Descriptors (VLAD) to produce a global descriptor. We study various system parameters such as the multiplicative and additive attention mechanisms and descriptor dimensionality. We propose a query expansion method that requires no external inputs. Experiments demonstrate that even without training, the local convolutional features and global representation outperform other systems. After system tuning, we can achieve state-of-the-art or competitive results. Furthermore, we observe that our query expansion method increases overall system performance by about 3%, using only the top-three retrieved images. Finally, we show how dimensionality reduction produces compact descriptors with increased retrieval performance and fast retrieval computation times, e.g. 50% faster than the current systems.Comment: Published in Remote Sensing. The first two authors have equal contributio

    Automatic discovery of image families: Global vs. local features

    Get PDF
    Gathering a large collection of images has been made quite easy by social and image sharing websites, e.g. flickr.com. However, using such collections faces the problem that they contain a large number of duplicates and highly similar images. This work tackles the problem of how to automatically organize image collections into sets of similar images, called image families hereinafter. We thoroughly compare the performance of two approaches to measure image similarity: global descriptors vs. a set of local descriptors. We assess the performance of these approaches as the problem scales up to thousands of images and hundreds of families. We present our results on a new dataset of CD/DVD game covers

    Exploiting Local Features from Deep Networks for Image Retrieval

    Full text link
    Deep convolutional neural networks have been successfully applied to image classification tasks. When these same networks have been applied to image retrieval, the assumption has been made that the last layers would give the best performance, as they do in classification. We show that for instance-level image retrieval, lower layers often perform better than the last layers in convolutional neural networks. We present an approach for extracting convolutional features from different layers of the networks, and adopt VLAD encoding to encode features into a single vector for each image. We investigate the effect of different layers and scales of input images on the performance of convolutional features using the recent deep networks OxfordNet and GoogLeNet. Experiments demonstrate that intermediate layers or higher layers with finer scales produce better results for image retrieval, compared to the last layer. When using compressed 128-D VLAD descriptors, our method obtains state-of-the-art results and outperforms other VLAD and CNN based approaches on two out of three test datasets. Our work provides guidance for transferring deep networks trained on image classification to image retrieval tasks.Comment: CVPR DeepVision Workshop 201

    DC-image for real time compressed video matching

    Get PDF
    This chapter presents a suggested framework for video matching based on local features extracted from the DC-image of MPEG compressed videos, without full decompression. In addition, the relevant arguments and supporting evidences are discussed. Several local feature detectors will be examined to select the best for matching using the DC-image. Two experiments are carried to support the above. The first is comparing between the DC-image and I-frame, in terms of matching performance and computation complexity. The second experiment compares between using local features and global features regarding compressed video matching with respect to the DC-image. The results confirmed that the use of DC-image, despite its highly reduced size, it is promising as it produces higher matching precision, compared to the full I-frame. Also, SIFT, as a local feature, outperforms most of the standard global features. On the other hand, its computation complexity is relatively higher, but it is still within the real-time margin which leaves a space for further optimizations that can be done to improve this computation complexity
    • 

    corecore