Open Set Logo Detection and Retrieval
Current logo retrieval research focuses on closed set scenarios. We argue
that the logo domain is too large for this strategy and requires an open set
approach. To foster research in this direction, a large-scale logo dataset,
called Logos in the Wild, is collected and released to the public. A typical
open set logo retrieval application is, for example, assessing the
effectiveness of advertisement in sports event broadcasts. Given a query sample
in the form of a logo image, the task is to find all further occurrences of this
logo in a set of images or videos. Currently, common logo retrieval approaches
are unsuitable for this task because of their closed world assumption. Thus, an
open set logo retrieval method is proposed in this work which allows searching
for previously unseen logos by a single query sample. A two-stage concept with
separate logo detection and comparison is proposed, where both modules are based
on task-specific CNNs. If trained with the Logos in the Wild data, significant
performance improvements are observed, especially compared with
state-of-the-art closed set approaches. Comment: accepted at VISAPP 201
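The comparison stage described above can be illustrated with a minimal sketch. It assumes each detected logo region has already been mapped to a feature vector by some network, and it ranks detections against a single query vector by cosine similarity. The similarity measure, the threshold value, and the names `cosine_similarity` and `retrieve` are assumptions made for illustration; the abstract does not specify them.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def retrieve(query, detections, threshold=0.5):
    """Return ids of detections similar to the query, best match first.

    `detections` maps a detection id to its feature vector. The threshold
    is a hypothetical value chosen only for this example.
    """
    scored = [
        (cosine_similarity(query, features), det_id)
        for det_id, features in detections.items()
    ]
    hits = [(score, det_id) for score, det_id in scored if score >= threshold]
    hits.sort(reverse=True)
    return [det_id for _, det_id in hits]
```

Because the query is only a feature vector, a logo never seen during training can be searched for without retraining, which is the property the open set setting requires.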
CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines
Based on the information provided by European projects and national initiatives related to multimedia search, as well as domain experts who participated in the CHORUS Think-Tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective.
The technical perspective includes an up-to-date view on content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines.
From a socio-economic perspective, we inventory the impact and legal consequences of these technical advances and point out future directions of research.
Retrieval System for Patent Images
Patent information and images play important roles in describing the novelty of an invention. However, current patent collections do not support image retrieval, and patent images have become almost unsearchable. This paper presents a short review of the existing research work and challenges in the patent image retrieval domain. The review finds that image feature extraction is an important step in matching query and database images successfully. To improve the current feature extraction step in patent image retrieval, we propose a patent image retrieval approach based on the Affine-SIFT technique. Comparisons between the existing feature extraction techniques are presented to assess the potential of the proposed approach.
Trademark image retrieval by local features
The challenge of abstract trademark image retrieval as a test of machine vision algorithms has attracted considerable research interest in the past decade. Current
operational trademark retrieval systems involve manual annotation of the images
(the current ‘gold standard’). Accordingly, current systems require a substantial
amount of time and labour to access, and are therefore expensive to operate. This
thesis focuses on the development of algorithms that mimic aspects of human
visual perception in order to retrieve similar abstract trademark images
automatically. A significant category of trademark images are typically highly
stylised, comprising a collection of distinctive graphical elements that often
include geometric shapes. Therefore, in order to compare the similarity of such
images the principal aim of this research has been to develop a method for solving
the partial matching and shape perception problem.
There are few useful techniques for partial shape matching in the context of
trademark retrieval, because existing techniques tend not to support multi-component
retrieval. When this work was initiated, most trademark image
retrieval systems represented images by means of global features, which are not
suited to solving the partial matching problem. Instead, the author has
investigated the use of local image features as a means to finding similarities
between trademark images that only partially match in terms of their subcomponents.
During the course of this work, it has been established that the
Harris and Chabat detectors could potentially perform sufficiently well to serve as
the basis for local feature extraction in trademark image retrieval. Early findings
in this investigation indicated that the well established SIFT (Scale Invariant
Feature Transform) local features, based on the Harris detector, could potentially
serve as an adequate underlying local representation for matching trademark
images.
There are few researchers who have used mechanisms based on human
perception for trademark image retrieval, implying that the shape representations
utilised in the past to solve this problem do not necessarily reflect the shapes
contained in these images, as characterised by human perception. In response, a
practical approach to trademark image retrieval by perceptual grouping has been
developed based on defining meta-features that are calculated from the spatial
configurations of SIFT local image features. This new technique measures certain
visual properties of the appearance of images containing multiple graphical
elements and supports perceptual grouping by exploiting the non-accidental
properties of their configuration.
Our validation experiments indicated that we were indeed able to capture
and quantify the differences in the global arrangement of sub-components evident
when comparing stylised images in terms of their visual appearance properties.
Such visual appearance properties, measured using 17 of the proposed meta-features,
include relative sub-component proximity, similarity, rotation and
symmetry. Similar work on meta-features, based on the above Gestalt proximity,
similarity, and simplicity groupings of local features, had not been reported in the
current computer vision literature at the time of undertaking this work.
We adopted relevance feedback to allow the visual appearance
properties of relevant and non-relevant images returned in response to a query to
be determined by example. Since limited training data is available when
constructing a relevance classifier by means of user supplied relevance feedback,
the intrinsically non-parametric machine learning algorithm ID3 (Iterative
Dichotomiser 3) was selected to construct decision trees by means of dynamic
rule induction. We believe that the above approach to capturing high-level visual
concepts, encoded by means of meta-features specified by example through
relevance feedback and decision tree classification, supports flexible trademark
image retrieval and is wholly novel.
The retrieval performance of the above system was compared with two other
state-of-the-art image trademark retrieval systems: Artisan developed by Eakins
(Eakins et al., 1998) and a system developed by Jiang (Jiang et al., 2006). Using
relevance feedback, our system achieves higher average normalised precision
than either of the systems developed by Eakins or Jiang. However, while our
trademark image query and database set is based on an image dataset used by
Eakins, we employed different numbers of images. It was not possible to access
the same query set and image database used in the evaluation of Jiang’s trademark
image retrieval system. Despite these differences in evaluation
methodology, our approach would appear to have the potential to improve
retrieval effectiveness.
Multi-label logo recognition and retrieval based on weighted fusion of neural features
Classifying logo images is a challenging task as they contain elements such as text or shapes that can represent anything from known objects to abstract shapes. While the current state of the art for logo classification addresses the problem as a multi-class task focusing on a single characteristic, logos can have several simultaneous labels, such as different colours. This work proposes a method that allows visually similar logos to be classified and searched from a set of data according to their shape, colour, commercial sector, semantics, general characteristics, or a combination of features selected by the user. Unlike previous approaches, the proposal employs a series of multi-label deep neural networks specialized in specific attributes and combines the obtained features to perform the similarity search. To delve into the classification system, different existing logo topologies are compared and some of their problems are analysed, such as the incomplete labelling that trademark registration databases usually contain. The proposal is evaluated considering 76,000 logos (seven times more than previous approaches) from the European Union Trademarks dataset, which is organized hierarchically using the Vienna ontology. Overall, experimentation attains reliable quantitative and qualitative results, reducing the normalized average rank error of the state-of-the-art from 0.040 to 0.018 for the Trademark Image Retrieval task. Finally, given that the semantics of logos can often be subjective, graphic design students and professionals were surveyed. Results show that the proposed methodology provides better labelling than a human expert operator, improving the label ranking average precision from 0.53 to 0.68. This work was supported by the Pattern Recognition and Artificial Intelligence Group (PRAIG) from the University of Alicante and the University Institute for Computing Research (IUII).
The Conselleria d'Innovació, Universitats, Ciència I Societat Digital from Generalitat Valenciana and FEDER provided some of the computing resources used in this project through IDIFEDER/2020/003. This research was partially supported by the Conselleria de Educación, Universidades y Empleo, for the project "clasifIA" of the Escola Superior d'Art i Disseny d'Alacant.
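For readers unfamiliar with the normalized average rank figure quoted in the abstract above, the following sketch computes one commonly used definition from content-based image retrieval evaluation: the sum of the 1-based ranks of the relevant items, minus the sum they would have under perfect retrieval, divided by the product of the collection size and the number of relevant items. Whether the paper uses exactly this definition is an assumption, and the function name is hypothetical.

```python
def normalized_average_rank(relevant_ranks, collection_size):
    """Normalized average rank of one query.

    relevant_ranks: 1-based positions at which the relevant items were
        returned in the ranked result list.
    collection_size: total number of items in the searched collection.

    Returns 0.0 for perfect retrieval (all relevant items ranked first);
    larger values mean relevant items were ranked further down.
    """
    n_rel = len(relevant_ranks)
    if n_rel == 0 or collection_size <= 0:
        raise ValueError("need at least one relevant item and a non-empty collection")
    ideal_rank_sum = n_rel * (n_rel + 1) / 2  # ranks 1..n_rel
    return (sum(relevant_ranks) - ideal_rank_sum) / (collection_size * n_rel)
```

Under this definition, a collection of 10 items with the three relevant ones returned at ranks 1, 2 and 3 scores 0.0, while the same three items returned last (ranks 8, 9 and 10) score 0.7.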
Methods and apparatus for constructing and implementing a universal extension module for processing objects in a database
Methods and apparatus for providing a multi-tier object-relational database architecture are disclosed. In one illustrative embodiment of the present invention, a multi-tier database architecture comprises an object-relational database engine as a top tier, one or more domain-specific extension modules as a bottom tier, and one or more universal extension modules as a middle tier. The individual extension modules of the bottom tier operationally connect with the one or more universal extension modules which, themselves, operationally connect with the database engine. The domain-specific extension modules preferably provide such functions as search, index, and retrieval services of images, video, audio, time series, web pages, text, XML, spatial data, etc. The domain-specific extension modules may include one or more IBM DB2 extenders, Oracle data cartridges and/or Informix datablades, although other domain-specific extension modules may be used.
Multi-Object Shape Retrieval Using Curvature Trees
This work presents a geometry-based image retrieval approach for multi-object images. We commence with developing an effective shape matching method for closed boundaries. Then, a structured representation, called curvature tree (CT), is introduced to extend the shape matching approach to handle images containing multiple objects with possible holes. We also propose an algorithm, based on Gestalt principles, to detect and extract high-level boundaries (or envelopes), which may evolve as a result of the spatial arrangement of a group of image objects.
First, a shape retrieval method using triangle-area representation (TAR) is presented for non-rigid shapes with closed
boundaries. This representation is effective in capturing both local and global characteristics of a shape, invariant to translation, rotation, scaling and shear, and robust against noise and moderate amounts of occlusion. For matching, two algorithms are introduced.
The first algorithm matches concavity maxima points extracted from the TAR image obtained by thresholding the TAR. In the second matching algorithm, dynamic space warping (DSW) is employed to search efficiently for the optimal (least cost) correspondence between the points of two shapes. Experimental results using the MPEG-7 CE-1 database of 1400 shapes show the superiority of our method over other recent methods.
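As a rough illustration of the triangle-area idea, the sketch below computes, for every point of a closed contour, the signed area of the triangle formed with the points `ts` steps before and after it. With a counter-clockwise contour, positive values indicate convex points, negative values concave points, and zero a straight segment. The function name and the plain-list input format are assumptions, and the normalisation, thresholding and matching steps mentioned in the abstract are not reproduced here.

```python
def triangle_area_signature(points, ts):
    """Signed triangle areas along a closed contour.

    points: list of (x, y) tuples ordered counter-clockwise.
    ts: triangle side length in contour steps (hypothetical parameter
        name); indices wrap around because the contour is closed.
    """
    n = len(points)
    areas = []
    for i in range(n):
        x1, y1 = points[(i - ts) % n]
        x2, y2 = points[i]
        x3, y3 = points[(i + ts) % n]
        # Half the determinant of the 3x3 matrix with rows (x, y, 1).
        area = 0.5 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
        areas.append(area)
    return areas
```

Varying `ts` from small to large values moves the measurement from local detail toward the global outline, which is one way to read the abstract's claim that the representation captures both local and global characteristics.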
Then, a geometry-based image retrieval system is developed for multi-object images. We model both shape and topology of image objects including holes using a structured representation called curvature tree (CT). To facilitate shape-based matching, the TAR of each object and hole is stored at the corresponding node in the CT. The similarity between two CTs is measured based on the maximum similarity subtree isomorphism (MSSI) where a one-to-one correspondence is established between the nodes of the two trees. Our matching scheme agrees with many recent findings in psychology about the human perception of multi-object images. Two algorithms are introduced to solve the MSSI problem: one approximate and one exact. Both algorithms have polynomial-time computational complexity and use the DSW as the similarity measure between the attributed nodes. Experiments on a database of 13500 medical images and a database of 1580 logo images have shown the effectiveness of the proposed method.
The purpose of the last part is to allow for high-level shape retrieval in multi-object images by detecting and extracting the envelope of high-level object groupings in the image. Motivated by studies in Gestalt theory, a new algorithm for envelope extraction is proposed that works in two stages. The first stage detects the envelope (if one exists) and groups its objects using hierarchical clustering. In the second stage, each grouping is merged using morphological operations and then further refined using concavity tree reconstruction to eliminate odd concavities in the extracted envelope. An experiment on a set of 110 logo images demonstrates the feasibility of our approach.