98,509 research outputs found

    Colour appearance descriptors for image browsing and retrieval

    No full text
    In this paper, we focus on the development of whole-scene colour appearance descriptors for classification to be used in browsing applications. The descriptors can classify a whole-scene image into various categories of semantically-based colour appearance. Colour appearance is an important feature and has been extensively used in image-analysis, retrieval and classification. By using pre-existing global CIELAB colour histograms, firstly, we try to develop metrics for wholescene colour appearance: “colour strength”, “high/low lightness” and “multicoloured”. Secondly we propose methods using these metrics either alone or combined to classify whole-scene images into five categories of appearance: strong, pastel, dark, pale and multicoloured. Experiments show positive results and that the global colour histogram is actually useful and can be used for whole-scene colour appearance classification. We have also conducted a small-scale human evaluation test on whole-scene colour appearance. The results show, with suitable threshold settings, the proposed methods can describe the whole-scene colour appearance of images close to human classification. The descriptors were tested on thousands of images from various scenes: paintings, natural scenes, objects, photographs and documents. The colour appearance classifications are being integrated into an image browsing system which allows them to also be used to refine browsing

    Query generation from multiple media examples

    Get PDF
    This paper exploits an unified media document representation called feature terms for query generation from multiple media examples, e.g. images. A feature term refers to a value interval of a media feature. A media document is therefore represented by a frequency vector about feature term appearance. This approach (1) facilitates feature accumulation from multiple examples; (2) enables the exploration of text-based retrieval models for multimedia retrieval. Three statistical criteria, minimised chi-squared, minimised AC/DC rate and maximised entropy, are proposed to extract feature terms from a given media document collection. Two textual ranking functions, KL divergence and a BM25-like retrieval model, are adapted to estimate media document relevance. Experiments on the Corel photo collection and the TRECVid 2006 collection show the effectiveness of feature term based query in image and video retrieval

    The Parallel Distributed Image Search Engine (ParaDISE)

    Get PDF
    Image retrieval is a complex task that differs according to the context and the user requirements in any specific field, for example in a medical environment. Search by text is often not possible or optimal and retrieval by the visual content does not always succeed in modelling high-level concepts that a user is looking for. Modern image retrieval techniques consists of multiple steps and aim to retrieve information from large–scale datasets and not only based on global image appearance but local features and if possible in a connection between visual features and text or semantics. This paper presents the Parallel Distributed Image Search Engine (ParaDISE), an image retrieval system that combines visual search with text–based retrieval and that is available as open source and free of charge. The main design concepts of ParaDISE are flexibility, expandability, scalability and interoperability. These concepts constitute the system, able to be used both in real–world applications and as an image retrieval research platform. Apart from the architecture and the implementation of the system, two use cases are described, an application of ParaDISE in retrieval of images from the medical literature and a visual feature evaluation for medical image retrieval. Future steps include the creation of an open source community that will contribute and expand this platform based on the existing parts

    An Appearance-Based Framework for 3D Hand Shape Classification and Camera Viewpoint Estimation

    Full text link
    An appearance-based framework for 3D hand shape classification and simultaneous camera viewpoint estimation is presented. Given an input image of a segmented hand, the most similar matches from a large database of synthetic hand images are retrieved. The ground truth labels of those matches, containing hand shape and camera viewpoint information, are returned by the system as estimates for the input image. Database retrieval is done hierarchically, by first quickly rejecting the vast majority of all database views, and then ranking the remaining candidates in order of similarity to the input. Four different similarity measures are employed, based on edge location, edge orientation, finger location and geometric moments.National Science Foundation (IIS-9912573, EIA-9809340

    Trademark image retrieval by local features

    Get PDF
    The challenge of abstract trademark image retrieval as a test of machine vision algorithms has attracted considerable research interest in the past decade. Current operational trademark retrieval systems involve manual annotation of the images (the current ‘gold standard’). Accordingly, current systems require a substantial amount of time and labour to access, and are therefore expensive to operate. This thesis focuses on the development of algorithms that mimic aspects of human visual perception in order to retrieve similar abstract trademark images automatically. A significant category of trademark images are typically highly stylised, comprising a collection of distinctive graphical elements that often include geometric shapes. Therefore, in order to compare the similarity of such images the principal aim of this research has been to develop a method for solving the partial matching and shape perception problem. There are few useful techniques for partial shape matching in the context of trademark retrieval, because those existing techniques tend not to support multicomponent retrieval. When this work was initiated most trademark image retrieval systems represented images by means of global features, which are not suited to solving the partial matching problem. Instead, the author has investigated the use of local image features as a means to finding similarities between trademark images that only partially match in terms of their subcomponents. During the course of this work, it has been established that the Harris and Chabat detectors could potentially perform sufficiently well to serve as the basis for local feature extraction in trademark image retrieval. Early findings in this investigation indicated that the well established SIFT (Scale Invariant Feature Transform) local features, based on the Harris detector, could potentially serve as an adequate underlying local representation for matching trademark images. There are few researchers who have used mechanisms based on human perception for trademark image retrieval, implying that the shape representations utilised in the past to solve this problem do not necessarily reflect the shapes contained in these image, as characterised by human perception. In response, a ii practical approach to trademark image retrieval by perceptual grouping has been developed based on defining meta-features that are calculated from the spatial configurations of SIFT local image features. This new technique measures certain visual properties of the appearance of images containing multiple graphical elements and supports perceptual grouping by exploiting the non-accidental properties of their configuration. Our validation experiments indicated that we were indeed able to capture and quantify the differences in the global arrangement of sub-components evident when comparing stylised images in terms of their visual appearance properties. Such visual appearance properties, measured using 17 of the proposed metafeatures, include relative sub-component proximity, similarity, rotation and symmetry. Similar work on meta-features, based on the above Gestalt proximity, similarity, and simplicity groupings of local features, had not been reported in the current computer vision literature at the time of undertaking this work. We decided to adopted relevance feedback to allow the visual appearance properties of relevant and non-relevant images returned in response to a query to be determined by example. Since limited training data is available when constructing a relevance classifier by means of user supplied relevance feedback, the intrinsically non-parametric machine learning algorithm ID3 (Iterative Dichotomiser 3) was selected to construct decision trees by means of dynamic rule induction. We believe that the above approach to capturing high-level visual concepts, encoded by means of meta-features specified by example through relevance feedback and decision tree classification, to support flexible trademark image retrieval and to be wholly novel. The retrieval performance the above system was compared with two other state-of-the-art image trademark retrieval systems: Artisan developed by Eakins (Eakins et al., 1998) and a system developed by Jiang (Jiang et al., 2006). Using relevance feedback, our system achieves higher average normalised precision than either of the systems developed by Eakins’ or Jiang. However, while our trademark image query and database set is based on an image dataset used by Eakins, we employed different numbers of images. It was not possible to access to the same query set and image database used in the evaluation of Jiang’s trademark iii image retrieval system evaluation. Despite these differences in evaluation methodology, our approach would appear to have the potential to improve retrieval effectiveness

    Image Retrieval Based on Texton Frequency-Inverse Image Frequency

    Get PDF
    In image retrieval, the user hopes to find the desired image by entering another image as a query. In this paper, the approach used to find similarities between images is feature weighting, where between one feature with another feature has a different weight. Likewise, the same features in different images may have different weights. This approach is similar to the term weighting model that usually implemented in document retrieval, where the system will search for keywords from each document and then give different weights to each keyword. In this research, the method of weighting the TF-IIF (Texton Frequency-Inverse Image Frequency) method proposed, this method will extract critical features in an image based on the frequency of the appearance of texton in an image, and the appearance of the texton in another image. That is, the more often a texton appears in an image, and the less texton appears in another image, the higher the weight. The results obtained indicate that the proposed method can increase the value of precision by 7% compared to the previous method
    • …
    corecore