RISE: A Robust Image Search Engine

Abstract

In this article, we address the problem of organizing images for effective and efficient retrieval in large image database systems. Specifically, we describe the design and architecture of RISE, a Robust Image Search Engine. RISE is designed to build and search an image repository, with an interface that allows the database to be queried and maintained over the Internet using any browser. RISE is built on the foundation of a CBIR (Content-Based Image Retrieval) system and computes the similarity of images using their color signatures. The signature of an image in the database is computed by systematically dividing the image into a set of small blocks of pixels and then computing the average color of each block. This approach is based on the Discrete Cosine Transform (DCT), which forms the basis of the popular JPEG image file format. The average colors of the pixel blocks form the characters of our image description. Organizing these pixel blocks into a tree structure allows us to create the words, or tokens, for the image. Thus, the tokens represent the spatial distribution of color in the image. The tokens for each image in the database are computed in advance and stored in a relational database as the image's signature. Using a commercial relational database management system (RDBMS) to store and query the signatures of images improves the efficiency of the system. A query image provided by a user is first parsed to build its tokens, which are then compared with the tokens of the images in the database. During the query process, tokenization improves efficiency by quantifying the degree of match between the query image and the images in the database. Content similarity is measured by computing the normalized Euclidean distance between corresponding tokens in the query and stored images, where correspondence is defined by the relative location of those tokens. The location of pixel blocks is maintained in a quad tree structure, which also improves performance through early pruning of the search space.
The distance is computed in a perceptual color space, specifically L*a*b*, and at different levels of detail. The perceptual color space allows RISE to ignore small variations in color, while the different levels of detail allow it to select a set of images for further exploration or to discard a set altogether. RISE compares only the precomputed color signatures of images, which are stored in an RDBMS. This is efficient because there is no need to extract complete information for every image. RISE is implemented using object-oriented design techniques and is deployed as a web browser-based search engine. RISE has a GUI (Graphical User Interface) front end and a Java servlet back end that searches the images stored in the database and returns the results to the web browser. RISE enhances the performance of the system's image operations by using JAI (Java Advanced Imaging) tools, which also removes the dependence on a single image file format. In addition, the use of an RDBMS and Java facilitates the portability of the system.

Goswami, Bhatia, Samal
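
The signature and comparison scheme summarized in the abstract can be illustrated with a minimal sketch. It is written in Python for brevity and is not the Java implementation of RISE. Everything specific in it is an assumption made for illustration: pixels are taken to be already converted to L*a*b* triples, the image is square with a power-of-two side, the quad tree is stored level by level (level k holds 2^k x 2^k blocks), the normalizing constant MAX_DIST is a rough bound on L*a*b* distances, and the names build_signature, level_distance, and matches are hypothetical.

```python
from math import sqrt

# Assumed normalizer: a rough upper bound on the distance between two L*a*b*
# colors (L in 0..100, a and b spanning about 255 units each).
MAX_DIST = sqrt(100.0 ** 2 + 255.0 ** 2 + 255.0 ** 2)


def block_average(img, x, y, size):
    """Average color of the size x size block whose top-left pixel is (x, y)."""
    sums = [0.0, 0.0, 0.0]
    for row in range(y, y + size):
        for col in range(x, x + size):
            for c in range(3):
                sums[c] += img[row][col][c]
    return tuple(s / (size * size) for s in sums)


def build_signature(img, levels):
    """One dict per quad-tree level, mapping (row, col) to a block's average color.

    Level 0 is the whole image; level k splits it into 2**k x 2**k blocks.
    """
    side = len(img)
    signature = []
    for level in range(levels):
        blocks = 2 ** level
        size = side // blocks
        signature.append({
            (r, c): block_average(img, c * size, r * size, size)
            for r in range(blocks)
            for c in range(blocks)
        })
    return signature


def level_distance(sig_a, sig_b, level):
    """Mean normalized Euclidean distance between same-position blocks."""
    blocks_a, blocks_b = sig_a[level], sig_b[level]
    total = 0.0
    for pos, color_a in blocks_a.items():
        color_b = blocks_b[pos]
        total += sqrt(sum((p - q) ** 2 for p, q in zip(color_a, color_b)))
    return total / (len(blocks_a) * MAX_DIST)


def matches(sig_query, sig_stored, threshold):
    """Coarse-to-fine comparison: stop at the first level that is too far apart."""
    for level in range(len(sig_query)):
        if level_distance(sig_query, sig_stored, level) > threshold:
            return False  # pruned early; finer levels are never examined
    return True


# Two 4x4 images that are mirror images of each other: dark on one side,
# light on the other. Their overall average color is identical, so level 0
# cannot tell them apart, but level 1 can.
dark_left = [[(20.0, 0.0, 0.0)] * 2 + [(80.0, 0.0, 0.0)] * 2 for _ in range(4)]
dark_right = [[(80.0, 0.0, 0.0)] * 2 + [(20.0, 0.0, 0.0)] * 2 for _ in range(4)]

sig_left = build_signature(dark_left, levels=3)
sig_right = build_signature(dark_right, levels=3)

print(level_distance(sig_left, sig_right, 0))
print(level_distance(sig_left, sig_right, 1))
print(matches(sig_left, sig_right, threshold=0.05))
```

In this sketch the per-level dictionaries play the role of the stored tokens: a candidate that differs too much at a coarse level is rejected without inspecting its finer blocks, which is one simple way to obtain the early pruning described above. The threshold value is likewise an arbitrary choice for the example.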
