
    Is there anything new to say about SIFT matching?

    SIFT is a classical hand-crafted, histogram-based descriptor that has deeply influenced research on image matching for more than a decade. In this paper, a critical review of the aspects that affect SIFT matching performance is carried out, and novel descriptor design strategies are introduced and individually evaluated. These encompass quantization, binarization and hierarchical cascade filtering as means to reduce data storage and increase matching efficiency with no significant loss of accuracy. An original contextual matching strategy, based on a symmetrical variant of the usual nearest-neighbor ratio, is also discussed; it can increase the discriminative power of any descriptor. The paper then undertakes a comprehensive experimental evaluation of state-of-the-art hand-crafted and data-driven descriptors, including the most recent deep descriptors. Comparisons are carried out according to several performance parameters, among which are accuracy and space-time efficiency. Results are provided for both planar and non-planar scenes, the latter being evaluated with a new benchmark based on the concept of approximated patch overlap. Experimental evidence shows that, despite their age, SIFT and other hand-crafted descriptors, once enhanced through the proposed strategies, are ready to meet future image matching challenges. We also believe that the lessons learned from this work will inspire the design of better hand-crafted and data-driven descriptors.
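
    As a rough illustration of the nearest-neighbor ratio test and a symmetric (mutual) variant of it, here is a minimal Python/OpenCV sketch; the 0.8 ratio, the image file names and the mutual-consistency check are assumptions, and the paper's own symmetric formulation may differ.

        # Illustrative sketch only: Lowe-style ratio matching plus a mutual
        # (symmetric) check; the paper's symmetric NN-ratio variant may be
        # defined differently.
        import cv2 as cv

        def ratio_matches(desc_a, desc_b, ratio=0.8):
            """One-directional nearest-neighbor ratio test (A -> B)."""
            matcher = cv.BFMatcher(cv.NORM_L2)
            good = []
            for pair in matcher.knnMatch(desc_a, desc_b, k=2):
                if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
                    good.append(pair[0])
            return good

        def symmetric_ratio_matches(desc_a, desc_b, ratio=0.8):
            """Keep only matches that pass the ratio test in both directions."""
            ab = ratio_matches(desc_a, desc_b, ratio)
            ba = {(m.trainIdx, m.queryIdx) for m in ratio_matches(desc_b, desc_a, ratio)}
            return [m for m in ab if (m.queryIdx, m.trainIdx) in ba]

        sift = cv.SIFT_create()
        img1 = cv.imread("img1.png", cv.IMREAD_GRAYSCALE)  # hypothetical input images
        img2 = cv.imread("img2.png", cv.IMREAD_GRAYSCALE)
        kp1, d1 = sift.detectAndCompute(img1, None)
        kp2, d2 = sift.detectAndCompute(img2, None)
        good = symmetric_ratio_matches(d1, d2)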

    Efficient large-scale image search with a vocabulary tree

    The task of searching for and recognizing objects in images has become an important research topic in image processing and computer vision. Looking for similar images in large datasets given an input query, and responding as quickly as possible, is a very challenging task. In this work the Bag of Features approach is studied, and an implementation of the visual vocabulary tree method of Nistér and Stewénius is presented. Images are described using local invariant descriptor techniques and then indexed in a database using an inverted index for further queries. The descriptors are quantized according to a visual vocabulary, creating sparse vectors, which makes it possible to compute, very efficiently for each query, a similarity ranking of the indexed images. The performance of the method is analyzed while varying several factors, such as the parameters of the vocabulary tree construction, different local descriptor extraction techniques and dimensionality reduction with PCA. It can be observed that retrieval performance increases with a richer vocabulary and decays very slowly as the size of the dataset grows.
    Affiliations: Uriza, Esteban (Universidad de Buenos Aires, Facultad de Ciencias Exactas y Naturales, Departamento de Computación, Argentina); Gómez Fernández, Francisco Roberto (Consejo Nacional de Investigaciones Científicas y Técnicas, Argentina; Universidad de Buenos Aires, Facultad de Ciencias Exactas y Naturales, Departamento de Computación, Argentina); Rais, Martín (Escuela Normal Superior de Cachan, France)
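
    The quantization and inverted-index idea summarized above can be sketched as follows, assuming a flat k-means vocabulary instead of the hierarchical tree of Nistér and Stewénius; the class name BoFIndex, the n_words parameter and the unnormalized TF-IDF sum are illustrative choices, not the thesis implementation.

        # Simplified sketch: flat k-means vocabulary + inverted index with a
        # plain TF-IDF sum; the thesis itself uses a hierarchical vocabulary
        # tree and a normalised scoring scheme.
        import numpy as np
        from collections import defaultdict
        from sklearn.cluster import KMeans

        class BoFIndex:
            def __init__(self, n_words=1000):
                self.km = KMeans(n_clusters=n_words, n_init=3)
                self.inverted = defaultdict(dict)   # visual word -> {image_id: count}
                self.n_images = 0

            def fit_vocabulary(self, descriptor_sets):
                """Learn the visual vocabulary from descriptors of training images."""
                self.km.fit(np.vstack(descriptor_sets))

            def add_image(self, image_id, descriptors):
                """Quantize descriptors to visual words and update the inverted index."""
                for w in self.km.predict(descriptors):
                    self.inverted[w][image_id] = self.inverted[w].get(image_id, 0) + 1
                self.n_images += 1

            def query(self, descriptors, top_k=5):
                """Rank indexed images by a TF-IDF-weighted count of shared visual words."""
                scores = defaultdict(float)
                for w in set(self.km.predict(descriptors)):
                    postings = self.inverted.get(w, {})
                    if not postings:
                        continue
                    idf = np.log(self.n_images / len(postings))
                    for image_id, count in postings.items():
                        scores[image_id] += idf * count
                return sorted(scores.items(), key=lambda kv: -kv[1])[:top_k]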

    SVS-JOIN : efficient spatial visual similarity join for geo-multimedia

    In the big data era, massive amounts of multimedia data with geo-tags have been generated and collected by smart devices equipped with mobile communication and positioning sensor modules. This trend has placed higher demands on large-scale geo-multimedia retrieval. Spatial similarity join is one of the significant problems in the area of spatial databases. Previous works focused on the spatial textual document search problem rather than geo-multimedia retrieval. In this paper, we investigate a novel geo-multimedia retrieval paradigm named spatial visual similarity join (SVS-JOIN for short), which aims to find similar geo-image pairs with respect to both geo-location and visual content. Firstly, the definition of SVS-JOIN is given, and we then present the geographical and visual similarity measures. Inspired by approaches for textual similarity join, we develop an algorithm named SVS-JOIN B that combines the PPJOIN algorithm with visual similarity. Besides, an extension named SVS-JOIN G is developed, which uses a spatial grid strategy to improve search efficiency. To further speed up the search, a novel approach called SVS-JOIN Q is carefully designed, in which a quadtree and a global inverted index are employed. Comprehensive experiments are conducted on two geo-image datasets, and the results demonstrate that our solution addresses the SVS-JOIN problem effectively and efficiently.
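
    A minimal sketch of the combined geo-visual similarity underlying SVS-JOIN follows, assuming a weighted sum of a normalized geographic distance and a Jaccard score over visual words; the alpha weight, d_max, the threshold and the brute-force join loop are placeholders for the PPJOIN-, grid- and quadtree-based algorithms actually proposed.

        # Illustrative pairwise similarity and brute-force join; the real
        # SVS-JOIN B/G/Q algorithms avoid the quadratic scan with PPJOIN-style
        # filtering, a spatial grid and a quadtree plus a global inverted index.
        import math

        def geo_similarity(p, q, d_max):
            """Map the Euclidean distance between two (x, y) locations into [0, 1]."""
            return max(0.0, 1.0 - math.dist(p, q) / d_max)

        def visual_similarity(words_a, words_b):
            """Jaccard similarity of the two images' visual-word sets."""
            a, b = set(words_a), set(words_b)
            return len(a & b) / len(a | b) if (a | b) else 0.0

        def svs_join(images, alpha=0.5, d_max=1000.0, threshold=0.7):
            """Return pairs whose combined geo + visual similarity passes the threshold.

            `images` is assumed to be a list of dicts: {"id", "loc": (x, y), "words": [...]}.
            """
            pairs = []
            for i in range(len(images)):
                for j in range(i + 1, len(images)):
                    gi, gj = images[i], images[j]
                    sim = (alpha * geo_similarity(gi["loc"], gj["loc"], d_max)
                           + (1 - alpha) * visual_similarity(gi["words"], gj["words"]))
                    if sim >= threshold:
                        pairs.append((gi["id"], gj["id"], sim))
            return pairs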

    Enhancement of the Fynbos Leaf Optical Recognition Application (FLORA-E)

    Object perception, classification and similarity discernment are relatively effortless tasks for humans. The exact method by which the brain achieves them is not yet fully understood. Identification, classification and similarity inference are currently non-trivial tasks for machine learning enabled platforms, even more so for those operating in real-time applications. This dissertation conducted research on the use of machine learning algorithms in object identification and classification by designing and developing an artificially intelligent Fynbos Leaf Optical Recognition Application (FLORA) platform. Previous versions of FLORA (versions A through D) were designed to recognise Proteaceae fynbos leaves by extracting six digital morphological features and then using the k-nearest neighbour (k-NN) algorithm for classification, yielding 86.6% accuracy. The methods utilised in FLORA-A to -D are ineffective when attempting to classify irregularly structured objects with high variability, such as stems and leafy stems. A redesign of the classification algorithms in the latest version, FLORA-E, was therefore necessary to cater for irregular fynbos stems. Numerous algorithms and techniques are available that can be used to achieve this objective. Keypoint matching, moments analysis and image hashing are the three techniques investigated in this thesis for their suitability in achieving fynbos stem and leaf classification. These techniques form active areas of research within the field of image processing and were chosen because of their affine transformation invariance and low computational complexity, making them suitable for real-time classification applications. The resulting classification solution, designed from experimentation on the three techniques under investigation, is a keypoint-matching and Hu-moment hybrid algorithm whose output is a similarity index (SI) score used to return a ranked list of potential matches. The algorithm showed a relatively high degree of match accuracy when run on both regular (leaves) and irregular (stems) objects. It successfully achieved a top-5 match rate of 76% for stems, 86% for leaves and 81% overall when tested on a database of 24 fynbos species (predominantly from the Proteaceae family), where each species had approximately 50 sample images. Experimental results show that Hu moment and keypoint classifiers are ideal for real-time applications because of their fast matching capabilities. This allowed the resulting hybrid algorithm to achieve a nominal computation time of ~0.78 s per sample on the test apparatus set up for this thesis. The scientific objective of this thesis was to build an artificially intelligent platform capable of correctly classifying fynbos flora by conducting research on object identification and classification algorithms. However, the core driving factor is rooted in the need to promote conservation in the Cape Floristic Region (CFR). The FLORA project is an example of how science and technology can be used as effective tools in aiding conservation and environmental awareness efforts. The FLORA platform can also be a useful tool for professional botanists, conservationists and fynbos enthusiasts, giving them access to an indexed and readily available digital catalogue of fynbos species across the CFR.
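
    A hedged sketch of a keypoint-matching and Hu-moment hybrid similarity index is given below; the ORB detector, the 0.6/0.4 weights and the way the two cues are fused are assumptions for illustration, not the SI formula implemented in FLORA-E.

        # Illustrative hybrid score: ORB keypoint agreement plus log-scaled
        # Hu-moment distance on grayscale images; the weights and the exact SI
        # formula in FLORA-E may differ.
        import cv2 as cv
        import numpy as np

        def hu_distance(img_a, img_b):
            """Distance between log-scaled Hu-moment vectors of two grayscale images."""
            def log_hu(img):
                h = cv.HuMoments(cv.moments(img)).flatten()
                return -np.sign(h) * np.log10(np.abs(h) + 1e-30)
            return float(np.linalg.norm(log_hu(img_a) - log_hu(img_b)))

        def keypoint_score(img_a, img_b, ratio=0.75):
            """Fraction of ORB keypoints surviving a ratio test between the two images."""
            orb = cv.ORB_create()
            _, da = orb.detectAndCompute(img_a, None)
            _, db = orb.detectAndCompute(img_b, None)
            if da is None or db is None:
                return 0.0
            good = 0
            for pair in cv.BFMatcher(cv.NORM_HAMMING).knnMatch(da, db, k=2):
                if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
                    good += 1
            return good / max(len(da), 1)

        def similarity_index(img_a, img_b, w_kp=0.6, w_hu=0.4):
            """Hybrid SI: high keypoint agreement and low Hu-moment distance score well."""
            return w_kp * keypoint_score(img_a, img_b) + w_hu / (1.0 + hu_distance(img_a, img_b))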

    Online Structured Learning for Real-Time Computer Vision Gaming Applications

    In recent years computer vision has played an increasingly important role in the development of computer games, and it now features as one of the core technologies for many gaming platforms. The work in this thesis addresses three problems in real-time computer vision, all of which are motivated by their potential application to computer games. We first present an approach for real-time 2D tracking of arbitrary objects. In common with recent research in this area we incorporate online learning to provide an appearance model which is able to adapt to the target object and its surrounding background during tracking. However, our approach moves beyond the standard framework of tracking using binary classification and instead integrates tracking and learning in a more principled way through the use of structured learning. As well as providing a more powerful framework for adaptive visual object tracking, our approach also outperforms state-of-the-art tracking algorithms on standard datasets. Next we consider the task of keypoint-based object tracking. We take the traditional pipeline of matching keypoints followed by geometric verification and show how this can be embedded into a structured learning framework in order to provide principled adaptivity to a given environment. We also propose an approximation method allowing us to take advantage of recently developed binary image descriptors, meaning our approach is suitable for real-time application even on low-powered portable devices. Experimentally, we clearly see the benefit that online adaptation using structured learning can bring to this problem. Finally, we present an approach for approximately recovering the dense 3D structure of a scene which has been mapped by a simultaneous localisation and mapping system. Our approach is guided by the constraints of the low-powered portable hardware we are targeting, and we develop a system which coarsely models the scene using a small number of planes. To achieve this, we frame the task as a structured prediction problem and introduce online learning into our approach to provide adaptivity to a given scene. This allows us to use relatively simple multi-view information coupled with online learning of appearance to efficiently produce coarse reconstructions of a scene.
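
    For the keypoint-tracking part, the traditional matching-plus-geometric-verification pipeline mentioned above can be sketched as follows; the ORB detector, the inlier count and the reprojection tolerance are illustrative, and the structured-learning adaptation described in the thesis is not reproduced.

        # Illustrative baseline pipeline: binary-descriptor matching followed by
        # RANSAC homography verification; the structured-learning adaptation from
        # the thesis is not reproduced here.
        import cv2 as cv
        import numpy as np

        def match_and_verify(img_model, img_frame, min_inliers=15):
            """Return a verified homography from the model image to the frame, or None."""
            orb = cv.ORB_create(1000)
            kp_m, d_m = orb.detectAndCompute(img_model, None)
            kp_f, d_f = orb.detectAndCompute(img_frame, None)
            if d_m is None or d_f is None:
                return None

            # Hamming distance suits binary descriptors such as ORB/BRIEF.
            matches = cv.BFMatcher(cv.NORM_HAMMING, crossCheck=True).match(d_m, d_f)
            if len(matches) < min_inliers:
                return None

            src = np.float32([kp_m[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
            dst = np.float32([kp_f[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

            # Geometric verification: accept only if enough matches agree with one homography.
            H, mask = cv.findHomography(src, dst, cv.RANSAC, 3.0)
            if H is None or int(mask.sum()) < min_inliers:
                return None
            return H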