1,495 research outputs found

    A Survey of 2D and 3D Shape Descriptors


    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Based on the information provided by European projects and national initiatives related to multimedia search, as well as domain experts who participated in the CHORUS Think-tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective. The technical perspective includes an up-to-date view on content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines. From a socio-economic perspective, we inventory the impact and legal consequences of these technical advances and point out future directions of research.

    Trademark image retrieval by local features

    The challenge of abstract trademark image retrieval as a test of machine vision algorithms has attracted considerable research interest in the past decade. Current operational trademark retrieval systems involve manual annotation of the images (the current ‘gold standard’). Accordingly, current systems require a substantial amount of time and labour to access, and are therefore expensive to operate. This thesis focuses on the development of algorithms that mimic aspects of human visual perception in order to retrieve similar abstract trademark images automatically. A significant category of trademark images is typically highly stylised, comprising a collection of distinctive graphical elements that often include geometric shapes. Therefore, in order to compare the similarity of such images, the principal aim of this research has been to develop a method for solving the partial matching and shape perception problem. There are few useful techniques for partial shape matching in the context of trademark retrieval, because the existing techniques tend not to support multicomponent retrieval. When this work was initiated, most trademark image retrieval systems represented images by means of global features, which are not suited to solving the partial matching problem. Instead, the author has investigated the use of local image features as a means of finding similarities between trademark images that only partially match in terms of their subcomponents. During the course of this work, it has been established that the Harris and Chabat detectors could potentially perform sufficiently well to serve as the basis for local feature extraction in trademark image retrieval. Early findings in this investigation indicated that the well-established SIFT (Scale Invariant Feature Transform) local features, based on the Harris detector, could potentially serve as an adequate underlying local representation for matching trademark images.
Few researchers have used mechanisms based on human perception for trademark image retrieval, implying that the shape representations utilised in the past to solve this problem do not necessarily reflect the shapes contained in these images, as characterised by human perception. In response, a practical approach to trademark image retrieval by perceptual grouping has been developed, based on defining meta-features that are calculated from the spatial configurations of SIFT local image features. This new technique measures certain visual properties of the appearance of images containing multiple graphical elements and supports perceptual grouping by exploiting the non-accidental properties of their configuration. Our validation experiments indicated that we were indeed able to capture and quantify the differences in the global arrangement of sub-components evident when comparing stylised images in terms of their visual appearance properties. Such visual appearance properties, measured using 17 of the proposed meta-features, include relative sub-component proximity, similarity, rotation and symmetry. Similar work on meta-features, based on the above Gestalt proximity, similarity, and simplicity groupings of local features, had not been reported in the computer vision literature at the time of undertaking this work. We adopted relevance feedback to allow the visual appearance properties of relevant and non-relevant images returned in response to a query to be determined by example. Since limited training data is available when constructing a relevance classifier by means of user-supplied relevance feedback, the intrinsically non-parametric machine learning algorithm ID3 (Iterative Dichotomiser 3) was selected to construct decision trees by means of dynamic rule induction.
We believe the above approach to capturing high-level visual concepts, encoded by means of meta-features specified by example through relevance feedback and decision tree classification to support flexible trademark image retrieval, to be wholly novel. The retrieval performance of the above system was compared with two other state-of-the-art trademark image retrieval systems: Artisan, developed by Eakins (Eakins et al., 1998), and a system developed by Jiang (Jiang et al., 2006). Using relevance feedback, our system achieves higher average normalised precision than either of the systems developed by Eakins or Jiang. However, while our trademark image query and database set is based on an image dataset used by Eakins, we employed different numbers of images. It was not possible to access the same query set and image database used in the evaluation of Jiang’s trademark image retrieval system. Despite these differences in evaluation methodology, our approach would appear to have the potential to improve retrieval effectiveness.

    Statistical/Geometric Techniques for Object Representation and Recognition

    Object modeling and recognition are key areas of research in computer vision and graphics with a wide range of applications. Though research in these areas is not new, traditionally most of it has focused on analyzing problems under controlled environments. The challenges posed by real-life applications demand more general and robust solutions. The wide variety of objects with large intra-class variability makes the task very challenging. The difficulty in modeling and matching objects also varies depending on the input modality. In addition, the easy availability of sensors and storage has resulted in a tremendous increase in the amount of data that needs to be processed, which requires efficient algorithms suitable for large databases. In this dissertation, we address some of the challenges involved in modeling and matching of objects in realistic scenarios. Object matching in images requires accounting for large variability in appearance due to changes in illumination and viewpoint. Any real-world object is characterized by its underlying shape and albedo, which, unlike the image intensity, are insensitive to changes in illumination conditions. We propose a stochastic filtering framework for estimating object albedo from a single intensity image by formulating the albedo estimation as an image estimation problem. We also show how this albedo estimate can be used for illumination-insensitive object matching and for more accurate shape recovery from a single image using a standard shape-from-shading formulation. We start with the simpler problem where the pose of the object is known and only the illumination varies. We then extend the proposed approach to handle unknown pose in addition to illumination variations. We also use the estimated albedo maps for another important application, which is recognizing faces across age progression.
Many approaches that address the problem of modeling and recognizing objects from images assume that the underlying objects have diffuse texture. But most real-world objects exhibit a combination of diffuse and specular properties. We propose an approach for separating the diffuse and specular reflectance from a given color image so that the algorithms proposed for objects of diffuse texture become applicable to a much wider range of real-world objects. Representing and matching the 2D and 3D geometry of objects is also an integral part of object matching, with applications in gesture recognition, activity classification, trademark and logo recognition, etc. The challenge in matching 2D/3D shapes lies in accounting for the different rigid and non-rigid deformations, large intra-class variability, noise and outliers. In addition, since shapes are usually represented as a collection of landmark points, the shape matching algorithm also has to deal with the challenges of missing or unknown correspondence across these data points. We propose an efficient shape indexing approach where the different feature vectors representing the shape are mapped to a hash table. For a query shape, we show how the similar shapes in the database can be efficiently retrieved without the need for establishing correspondence, making the algorithm extremely fast and scalable. We also propose an approach for matching and registration of 3D point cloud data across unknown or missing correspondence using an implicit surface representation. Finally, we discuss possible future directions of this research.

    Image Indexing and Retrieval

    The amount of pictorial data has been growing enormously with the expansion of the WWW. Given the large number of images, it is very important for users to be able to retrieve required images via an efficient and effective mechanism. To solve the image retrieval problem, many techniques have been devised to address the requirements of different applications. Problems with the traditional methods of image indexing have led to a rise of interest in techniques for retrieving images on the basis of automatically derived features such as color, texture and shape, a technology generally referred to as Content-Based Image Retrieval (CBIR). After a decade of intensive research, CBIR technology is now beginning to move out of the laboratory into the marketplace. However, the technology still lacks maturity and is not yet being used on a significant scale.

    Data Management Challenges for Internet-scale 3D Search Engines

    This paper describes the most significant data-related challenges involved in building internet-scale 3D search engines. The discussion centers on the most pressing data management issues in this domain, including model acquisition, support for multiple file formats, asset versioning, data integrity errors, the data lifecycle, intellectual property, and the legality of web crawling. The paper also discusses numerous issues that fall under the rubric of trustworthy computing, including privacy, security, inappropriate content, and copying/remixing of assets. The goal of the paper is to provide an overview of these general issues, illustrated by empirical data drawn from the internet's largest operational search engine. While numerous works have been published on 3D information retrieval, this paper is the first to discuss the real-world challenges that arise in building practical search engines at scale. Comment: Second version, distributed by SIGIR Forum.

    A Generic Software Library for Creating Multimedia Browse/Search Applications

    PhD. This thesis surveys the field of browse/search interactions. The results of this study form the basis of a specification of a representation scheme and a library of access functions which facilitate the creation of information-rich multimedia applications. Evidence is provided for the hypothesis that browsing and searching are the extreme ends of a continuum of data access methods and that many browse/search interactions contain a mixture of both, with the ratio varying as the interaction proceeds. These observations motivate the integration of browsing and search facilities so that applications can be built which exhibit both types of information access. This work is tailored to the area of consumer multimedia and includes a review of the constraints that this imposes on the authoring process and on the applications themselves. The specification of the functionality of the function library, together with its implementation and testing, is described in detail. The library has been evaluated by constructing a number of prototype applications which demonstrate the utility and scope of the library.

    Logo recognition in videos: an automated brand analysis system

    Every year companies spend a sizeable budget on marketing, a large portion of which is spent on advertising their product brands on TV broadcasts. These physical advertising artifacts are usually emblazoned with the company's name, logo, and trademark brand. Given these astronomical numbers, companies are extremely keen to verify that their brand has the level of visibility they expect for such expenditure. In other words, advertisers in particular like to verify that their contracts with broadcasters are fulfilled as promised, since the price of a commercial depends primarily on the popularity of the show it interrupts or sponsors. Such verifications are essential to major companies in order to justify advertising budgets and ensure their brands achieve the desired level of visibility. Currently, the verification of brand visibility is carried out manually by human annotators who view a broadcast and annotate every appearance of a company's trademark in the broadcast. In this thesis, a novel brand logo analysis system which uses shape-based matching and scale invariant feature transform (SIFT) based matching on a graphics processing unit (GPU) is proposed, developed, and tested. The system is described for detection and retrieval of trademark logos appearing in commercial videos. A compact representation of trademark logos and video frame content based on global (shape-based) and local (SIFT) feature points is proposed. These representations can be used to robustly detect, recognize, localize, and retrieve trademarks as they appear in a variety of different commercial video types. Classification of trademarks is performed by using shape-based matching and by matching a set of SIFT feature descriptors for each trademark instance against the set of SIFT features detected in each frame of the video.
Our system can automatically recognize the logos in video frames in order to summarize the logo content of the broadcast with the detected size, position and score. The output of the system can be used to summarize or check the time and duration of commercial video blocks in a broadcast or on a DVD. Experimental results are provided, along with an analysis of the processed frames. Results show that our proposed technique is efficient and effectively recognizes and classifies trademark logos.