25 research outputs found

    User-interface issues for browsing digital video

    In this paper we examine a suite of systems for content-based indexing and browsing of digital video and we identify a superset of features and functions which are provided by these systems. From our classification of these we have identified that common to all is the fact of being predominantly technology-based, with little attention paid to actual user requirements. As part of our work we are developing an application for content-based browsing of digital video which will incorporate the most desirable but achievable of the functions of other systems. This will be achieved via a series of continuously refined demonstrator systems from Spring 1999 onwards which will be subjected to analysis of performance in terms of user

    Digital Image Access & Retrieval

    The 33rd Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation, with the bulk of the conference focusing on indexing and retrieval.

    Automatic detection of salient objects and spatial relations in videos for a video database system

    Multimedia databases have gained popularity due to rapidly growing quantities of multimedia data and the need to perform efficient indexing, retrieval and analysis of this data. One downside of multimedia databases is the necessity to process the data for feature extraction and labeling prior to storage and querying. The huge amount of data makes it impossible to complete this task manually. We propose a tool for the automatic detection and tracking of salient objects, and derivation of spatio-temporal relations between them in video. Our system aims to significantly reduce the work of manually selecting and labeling objects by detecting and tracking the salient objects, so that the label for each object needs to be entered only once within each shot instead of in every frame in which the object appears. This is also required as a first step in a fully-automatic video database management system in which the labeling should also be done automatically. The proposed framework covers a scalable architecture for video processing and stages of shot boundary detection, salient object detection and tracking, and knowledge-base construction for effective spatio-temporal object querying. (c) 2008 Elsevier B.V. All rights reserved.

    RISE: A ROBUST IMAGE SEARCH ENGINE

    This thesis advances RISE (Robust Image Search Engine), an image database application designed to build and search an image repository. RISE is built on the foundation of a CBIR (Content Based Image Retrieval) system. The basic goal of this system is to compute content similarity of images based on their color signatures. The color signature of an image is computed by systematically dividing the image into a number of small blocks and computing the average color of each block using ideas from the DCT (Discrete Cosine Transform), which forms the basis for the JPEG (Joint Photographic Experts Group) compression format. The average color extracted from each block is used to construct a tree structure and finally, the tree structure is compared with similar structures already stored in the database. During the query process, an image is given to the system as a query image and the system returns a set of images whose content or color distribution is similar to that of the given image. The query image is processed to create its signature, which is then matched against the signatures of images already stored in the database. The content similarity is measured by computing the normalized Euclidean distance between the query image and the images already stored in the database. RISE has a GUI (Graphical User Interface) front end and a Java servlet in the back end that searches the images stored in the database and returns the results to the web browser. RISE enhances the performance of image operations of the system using JAI (Java Advanced Imaging) tools.
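
The block-average signature and normalized distance described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the names `block_signature` and `normalized_distance`, the row-major block order, and the choice to normalize by the largest possible distance for 8-bit channels are all hypothetical, and the tree structure is left out.

```python
from math import sqrt

def block_signature(image, block):
    """Return the average color of each block x block tile, in row-major order.

    image is a list of rows, each row a list of (r, g, b) tuples.
    Assumption: width and height are exact multiples of `block`.
    """
    height, width = len(image), len(image[0])
    signature = []
    for top in range(0, height, block):
        for left in range(0, width, block):
            pixels = [image[y][x]
                      for y in range(top, top + block)
                      for x in range(left, left + block)]
            count = len(pixels)
            signature.append(tuple(sum(p[c] for p in pixels) / count
                                   for c in range(3)))
    return signature

def normalized_distance(sig_a, sig_b, max_value=255.0):
    """Euclidean distance between two signatures, scaled into [0, 1].

    Assumption: the scaling divides by the largest distance two
    signatures of this length could have with channels in [0, max_value].
    """
    squared = sum((a[c] - b[c]) ** 2
                  for a, b in zip(sig_a, sig_b)
                  for c in range(3))
    largest = len(sig_a) * 3 * max_value ** 2
    return sqrt(squared / largest)

# Two tiny 2x2 images: one all black, one all white.
black = [[(0, 0, 0)] * 2 for _ in range(2)]
white = [[(255, 255, 255)] * 2 for _ in range(2)]
same = normalized_distance(block_signature(black, 1), block_signature(black, 1))
far = normalized_distance(block_signature(black, 1), block_signature(white, 1))
```

Here `same` is 0.0 and `far` is 1.0, the two extremes of the scaled distance.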

    Identifying Perceptual Structures In Trademark Images

    In this paper we focus on identifying image structures at different levels in figurative (trademark) images to allow higher level similarity between images to be inferred. To identify image structures at different levels, it is desirable to be able to achieve multiple views of an image at different scales and then extract perceptually-relevant shapes from the different views. The three aims of this work are: to generate multiple views of each image in a principled manner, to identify structures and shapes at different levels within images and to emulate the Gestalt principles to guide shape finding. The proposed integrated approach is able to meet all three aims

    RISE: A Robust Image Search Engine

    In this article we address the problem of organizing images for effective and efficient retrieval in large image database systems. Specifically, we describe the design and architecture of RISE, a Robust Image Search Engine. RISE is designed to build and search an image repository, with an interface that allows for the query and maintenance of the database over the Internet using any browser. RISE is built on the foundation of a CBIR (Content Based Image Retrieval) system and computes the similarity of images using their color signatures. The signature of an image in the database is computed by systematically dividing the image into a set of small blocks of pixels and then computing the average color of each block. This is based on the Discrete Cosine Transform (DCT), which forms the basis for the popular JPEG image file format. The average color in each pixel block forms the characters of our image description. Organizing these pixel blocks into a tree structure allows us to create the words or tokens for the image. Thus the tokens represent the spatial distribution of the color in the image. The tokens for each image in the database are first computed and stored in a relational database as their signatures. Using a commercial relational database management system (RDBMS) to store and query signatures of images improves the efficiency of the system. A query image provided by a user is first parsed to build the tokens, which are then compared with the tokens for images in the database. During the query process, tokenization improves the efficiency by quantifying the degree of match between the query image and images in the database. The content similarity is measured by computing the normalized Euclidean distance between corresponding tokens in the query and stored images, where correspondence is defined by the relative location of those tokens. The location of pixel blocks is maintained by using a quad tree structure that also improves performance by early pruning of the search space.
The distance is computed in a perceptual color space, specifically L*a*b*, and at different levels of detail. The perceptual color space allows RISE to ignore small variations in color, while different levels of detail allow it to select a set of images for further exploration, or discard a set altogether. RISE only compares the precomputed color signatures of images that are stored in an RDBMS. It is very efficient since there is no need to extract complete information for every image. RISE is implemented using object-oriented design techniques and is deployed as a web browser-based search engine. RISE has a GUI (Graphical User Interface) front-end and a Java servlet in the back-end that searches the images stored in the database and returns the results to the web browser. RISE enhances the performance of image operations of the system by using JAI (Java Advanced Imaging) tools, which obviates the dependence on a single image file format. In addition, the use of RDBMS and Java also facilitates the portability of the system (Goswami, Bhatia, Samal).
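
The coarse-to-fine comparison with early pruning that this abstract describes might look roughly like the sketch below. It is an illustration under assumptions, not the RISE code: `quadtree_levels`, `coarse_to_fine_match`, the single-channel input, and the RMS-versus-threshold pruning rule are all hypothetical. Values are assumed to be one channel of an already converted perceptual color space.

```python
from math import sqrt

def quadtree_levels(image, depth):
    """Return block means for levels 0..depth of a quad-tree style pyramid.

    Level k holds 4**k means in row-major order. Assumption: image is a
    square list of lists of floats whose side is divisible by 2**depth.
    """
    side = len(image)
    levels = []
    for k in range(depth + 1):
        cells = 2 ** k
        size = side // cells
        level = []
        for by in range(cells):
            for bx in range(cells):
                values = [image[y][x]
                          for y in range(by * size, (by + 1) * size)
                          for x in range(bx * size, (bx + 1) * size)]
                level.append(sum(values) / len(values))
        levels.append(level)
    return levels

def coarse_to_fine_match(query_levels, stored_levels, threshold):
    """Compare level by level; stop as soon as one level differs too much.

    Returns (matched, levels_compared). The RMS threshold test is an
    assumed pruning rule chosen for illustration.
    """
    for k, (q, s) in enumerate(zip(query_levels, stored_levels)):
        rms = sqrt(sum((a - b) ** 2 for a, b in zip(q, s)) / len(q))
        if rms > threshold:
            return False, k + 1   # pruned early, finer levels never touched
    return True, len(query_levels)

flat = [[10.0] * 4 for _ in range(4)]
bright = [[200.0] * 4 for _ in range(4)]
split = [[0.0, 0.0, 20.0, 20.0] for _ in range(4)]  # same overall mean as flat

flat_levels = quadtree_levels(flat, 2)
result_bright = coarse_to_fine_match(flat_levels, quadtree_levels(bright, 2), 5.0)
result_split = coarse_to_fine_match(flat_levels, quadtree_levels(split, 2), 5.0)
```

In this toy run the bright image is rejected at the coarsest level, while the split image passes level 0 (same overall mean) and is rejected only at level 1, which is the kind of early discard the abstract refers to.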

    An analytical study on image databases

    Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1997. Includes bibliographical references (leaves 87-88). By Francine Ming Fang. M.Eng.

    Object detection and activity recognition in digital image and video libraries

    This thesis is a comprehensive study of object-based image and video retrieval, specifically for car and human detection and activity recognition purposes. The thesis focuses on the problem of connecting low level features to high level semantics by developing relational object and activity presentations. With the rapid growth of multimedia information in the form of digital image and video libraries, there is an increasing need for intelligent database management tools. The traditional text based query systems based on a manual annotation process are impractical for today's large libraries, which require an efficient information retrieval system. For this purpose, a hierarchical information retrieval system is proposed where shape, color and motion characteristics of objects of interest are captured in compressed and uncompressed domains. The proposed retrieval method provides object detection and activity recognition at different resolution levels from low complexity to low false rates. The thesis first examines extraction of low level features from images and videos using intensity, color and motion of pixels and blocks. Local consistency based on these features and geometrical characteristics of the regions is used to group object parts. The problem of managing the segmentation process is solved by a new approach that uses object based knowledge in order to group the regions according to a global consistency. A new model-based segmentation algorithm is introduced that uses feedback from the relational representation of the object. The selected unary and binary attributes are further extended for application specific algorithms. Object detection is achieved by matching the relational graphs of objects with the reference model. The major advantages of the algorithm can be summarized as improving the object extraction by reducing the dependence on the low level segmentation process and combining the boundary and region properties.
The thesis then addresses the problem of object detection and activity recognition in the compressed domain in order to reduce computational complexity. New algorithms for object detection and activity recognition in JPEG images and MPEG videos are developed. It is shown that significant information can be obtained from the compressed domain in order to connect to high level semantics. Since our aim is to retrieve information from images and videos compressed using standard algorithms such as JPEG and MPEG, our approach differs from previous compressed domain object detection techniques where the compression algorithms are governed by characteristics of the object of interest to be retrieved. An algorithm is developed using the principal component analysis of MPEG motion vectors to detect human activities, namely walking, running, and kicking. Object detection in JPEG compressed still images and MPEG I-frames is achieved by using DC-DCT coefficients of the luminance and chrominance values in the graph based object detection algorithm. The thesis finally addresses the problem of object detection in lower resolution and monochrome images. Specifically, it is demonstrated that the structural information of human silhouettes can be captured from AC-DCT coefficients.
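
One reason DC-DCT coefficients are useful in the compressed domain is that, for an 8x8 block under the orthonormal 2D DCT-II, the DC coefficient equals 8 times the block mean, so it acts as a low-resolution copy of the image. The sketch below checks that relationship numerically. It is a plain illustration, not the thesis's algorithm: `dct_coefficient` and `block_mean` are hypothetical helper names, and the level shift by 128 that JPEG applies before the transform is deliberately left out.

```python
from math import cos, pi, sqrt

def dct_coefficient(block, u, v):
    """Coefficient (u, v) of the orthonormal 2D DCT-II of a square block."""
    n = len(block)
    scale_u = sqrt(1.0 / n) if u == 0 else sqrt(2.0 / n)
    scale_v = sqrt(1.0 / n) if v == 0 else sqrt(2.0 / n)
    total = 0.0
    for x in range(n):
        for y in range(n):
            total += (block[x][y]
                      * cos((2 * x + 1) * u * pi / (2 * n))
                      * cos((2 * y + 1) * v * pi / (2 * n)))
    return scale_u * scale_v * total

def block_mean(block):
    """Arithmetic mean of all samples in the block."""
    n = len(block)
    return sum(sum(row) for row in block) / (n * n)

# An 8x8 ramp block with sample values 0..63.
ramp = [[float(x * 8 + y) for y in range(8)] for x in range(8)]
dc = dct_coefficient(ramp, 0, 0)
mean = block_mean(ramp)

# A constant block has no variation, so its AC terms vanish.
constant = [[10.0] * 8 for _ in range(8)]
ac = dct_coefficient(constant, 0, 1)
```

For the ramp block `dc` agrees with `8 * mean` up to floating-point error, and `ac` for the constant block is zero up to rounding, which is why the AC terms carry the structural detail while the DC terms carry the average.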