12,006 research outputs found

    Content-based image retrieval of museum images

    Get PDF
    Content-based image retrieval (CBIR) is becoming more and more important with the advance of multimedia and imaging technology. Among many retrieval features associated with CBIR, texture retrieval is one of the most difficult. This is mainly because no satisfactory quantitative definition of texture exists at this time, and also because of the complex nature of the texture itself. Another difficult problem in CBIR is query by low-quality images, which means attempts to retrieve images using a poor quality image as a query. Not many content-based retrieval systems have addressed the problem of query by low-quality images. Wavelet analysis is a relatively new and promising tool for signal and image analysis. Its time-scale representation provides both spatial and frequency information, thus giving extra information compared to other image representation schemes. This research aims to address some of the problems of query by texture and query by low quality images by exploiting all the advantages that wavelet analysis has to offer, particularly in the context of museum image collections. A novel query by low-quality images algorithm is presented as a solution to the problem of poor retrieval performance using conventional methods. In the query by texture problem, this thesis provides a comprehensive evaluation on wavelet-based texture method as well as comparison with other techniques. A novel automatic texture segmentation algorithm and an improved block oriented decomposition is proposed for use in query by texture. Finally all the proposed techniques are integrated in a content-based image retrieval application for museum image collections

    Query by String word spotting based on character bi-gram indexing

    Full text link
    In this paper we propose a segmentation-free query by string word spotting method. Both the documents and query strings are encoded using a recently proposed word representa- tion that projects images and strings into a common atribute space based on a pyramidal histogram of characters(PHOC). These attribute models are learned using linear SVMs over the Fisher Vector representation of the images along with the PHOC labels of the corresponding strings. In order to search through the whole page, document regions are indexed per character bi- gram using a similar attribute representation. On top of that, we propose an integral image representation of the document using a simplified version of the attribute model for efficient computation. Finally we introduce a re-ranking step in order to boost retrieval performance. We show state-of-the-art results for segmentation-free query by string word spotting in single-writer and multi-writer standard datasetsComment: To be published in ICDAR201

    Physical Primitive Decomposition

    Full text link
    Objects are made of parts, each with distinct geometry, physics, functionality, and affordances. Developing such a distributed, physical, interpretable representation of objects will facilitate intelligent agents to better explore and interact with the world. In this paper, we study physical primitive decomposition---understanding an object through its components, each with physical and geometric attributes. As annotated data for object parts and physics are rare, we propose a novel formulation that learns physical primitives by explaining both an object's appearance and its behaviors in physical events. Our model performs well on block towers and tools in both synthetic and real scenarios; we also demonstrate that visual and physical observations often provide complementary signals. We further present ablation and behavioral studies to better understand our model and contrast it with human performance.Comment: ECCV 2018. Project page: http://ppd.csail.mit.edu

    Structured Knowledge Representation for Image Retrieval

    Full text link
    We propose a structured approach to the problem of retrieval of images by content and present a description logic that has been devised for the semantic indexing and retrieval of images containing complex objects. As other approaches do, we start from low-level features extracted with image analysis to detect and characterize regions in an image. However, in contrast with feature-based approaches, we provide a syntax to describe segmented regions as basic objects and complex objects as compositions of basic ones. Then we introduce a companion extensional semantics for defining reasoning services, such as retrieval, classification, and subsumption. These services can be used for both exact and approximate matching, using similarity measures. Using our logical approach as a formal specification, we implemented a complete client-server image retrieval system, which allows a user to pose both queries by sketch and queries by example. A set of experiments has been carried out on a testbed of images to assess the retrieval capabilities of the system in comparison with expert users ranking. Results are presented adopting a well-established measure of quality borrowed from textual information retrieval

    Gradient-orientation-based PCA subspace for novel face recognition

    Get PDF
    This article has been made available through the Brunel Open Access Publishing Fund.Face recognition is an interesting and a challenging problem that has been widely studied in the field of pattern recognition and computer vision. It has many applications such as biometric authentication, video surveillance, and others. In the past decade, several methods for face recognition were proposed. However, these methods suffer from pose and illumination variations. In order to address these problems, this paper proposes a novel methodology to recognize the face images. Since image gradients are invariant to illumination and pose variations, the proposed approach uses gradient orientation to handle these effects. The Schur decomposition is used for matrix decomposition and then Schurvalues and Schurvectors are extracted for subspace projection. We call this subspace projection of face features as Schurfaces, which is numerically stable and have the ability of handling defective matrices. The Hausdorff distance is used with the nearest neighbor classifier to measure the similarity between different faces. Experiments are conducted with Yale face database and ORL face database. The results show that the proposed approach is highly discriminant and achieves a promising accuracy for face recognition than the state-of-the-art approaches

    Adaptive Edge-Oriented Shot Boundary Detection

    Get PDF
    We study the problem of video shot boundary detection using an adaptive edge-oriented framework. Our approach is distinct in its use of multiple multilevel features in the required processing. Adaptation is provided by a careful analysis of these multilevel features, based on shot variability. We consider three levels of adaptation: at the feature extraction stage using locally-adaptive edge maps, at the video sequence level, and at the individual shot level. We show how to provide adaptive parameters for the multilevel edge-based approach, and how to determine adaptive thresholds for the shot boundaries based on the characteristics of the particular shot being indexed. The result is a fast adaptive scheme that provides a slightly better performance in terms of robustness, and a five fold efficiency improvement in shot characterization and classification. The reported work has applications beyond direct video indexing, and could be used in real-time applications, such as in dynamic monitoring and modeling of video data traffic in multimedia communications, and in real-time video surveillance. Experimental results are included

    Deep Learning Perspectives on Efficient Image Matching in Natural Image Databases

    Get PDF
    With the proliferation of digital content, efficient image matching in natural image databases has become paramount. Traditional image matching techniques, while effective to a certain extent, face challenges in dealing with the high variability inherent in natural images. This research delves into the application of deep learning models, particularly Convolutional Neural Networks (CNNs), Siamese Networks, and Triplet Networks, to address these challenges. We introduce various techniques to enhance efficiency, such as data augmentation, transfer learning, dimensionality reduction, efficient sampling, and the amalgamation of traditional computer vision strategies with deep learning. Our experimental results, garnered from specific dataset, demonstrate significant improvements in image matching efficiency, as quantified by metrics like precision, recall, F1-Score, and matching time. The findings underscore the potential of deep learning as a transformative tool for natural image database matching, setting the stage for further research and optimization in this domain
    • …
    corecore