
    PPMExplorer: Using Information Retrieval, Computer Vision and Transfer Learning Methods to Index and Explore Images of Pompeii

    In this dissertation, we present and analyze the technology used in the making of PPMExplorer: Search, Find, and Explore Pompeii. PPMExplorer is a software tool built from data extracted from the Pompei: Pitture e Mosaici (PPM) volumes, a valuable set of books containing 20,000 annotated historical images of the archaeological site of Pompeii, Italy, accompanied by extensive captions. We transformed the volumes from paper, to digital, to searchable. PPMExplorer enables archaeologists to formulate and check hypotheses about historical findings. We argue that this is possible by leveraging computer-generated correlations between artifacts derived from image data, text data, and a combination of both. The acquisition and interconnection of the data are carried out using image processing, natural language processing, data mining, and machine learning methods.
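    As a rough illustration of the kind of cross-modal correlation the dissertation describes (not the PPMExplorer implementation itself), the sketch below ranks catalog entries by a weighted blend of TF-IDF caption similarity and cosine similarity over image embeddings. The captions, the `rank_artifacts` helper, and the placeholder CNN features are all hypothetical.

```python
# Hedged sketch: blend caption similarity (TF-IDF) with image-embedding
# similarity (standing in for transfer-learning CNN features) to rank
# entries most correlated with a query entry.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def rank_artifacts(captions, image_embeddings, query_idx, alpha=0.5):
    """Return indices of entries most correlated with entry `query_idx`."""
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(captions)
    text_sim = cosine_similarity(tfidf[query_idx], tfidf).ravel()
    img_sim = cosine_similarity(
        image_embeddings[query_idx:query_idx + 1], image_embeddings).ravel()
    combined = alpha * text_sim + (1.0 - alpha) * img_sim
    order = np.argsort(-combined)                 # descending similarity
    return [i for i in order if i != query_idx]

captions = ["fresco of Venus, House of the Vettii",
            "mosaic floor with geometric pattern",
            "wall painting of Venus with cupids"]
embeddings = np.random.rand(3, 512)               # placeholder CNN features
print(rank_artifacts(captions, embeddings, query_idx=0))
```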

    Computer analysis of composite documents with non-uniform background.

    The motivation behind most applications of off-line text recognition is to convert data from conventional media into electronic form; such applications include bank cheque, security document, and form processing. In this dissertation, a document analysis system is presented that transfers gray-level composite documents with complex backgrounds and poor illumination into an electronic format suitable for efficient storage, retrieval, and interpretation. The preprocessing stage converts a paper-based document to a digital bitmap representation by optical scanning, followed by thresholding, skew detection, page segmentation, and Optical Character Recognition (OCR). The system operates in a pipeline fashion, each stage passing its output to the next; the success of each stage ensures that the system as a whole operates without failures that would reduce the character recognition rate. In designing this system, a new local bi-level threshold selection technique was developed for gray-level composite document images with non-uniform backgrounds. The algorithm uses statistical and textural feature measures to obtain a feature vector for each pixel from a window of size (2n + 1) × (2n + 1), where n ≥ 1. These features provide a local understanding of pixels from their neighbourhoods, making it easier to classify each pixel into its proper class. A multi-layer perceptron neural network then classifies each pixel value in the image. The thresholding results are passed to the block segmentation stage, a feature-based method that uses a neural network classifier to automatically segment and classify the image contents into text and halftone images. Finally, the text blocks are passed to a character recognition (CR) system that transfers characters into an editable text format; its recognition results were compared with those of a commercial OCR. The implemented OCR system uses pixel distributions extracted from different zones of the characters as features, with a correlation classifier to recognize the characters. For cheque processing, the system was used to read the special numerals of the optical barcode found on bank cheques; a fuzzy descriptive feature extraction method with a correlation classifier recognizes these numerals, which identify the bank institute and provide personal information about the account holder. The new local thresholding scheme was tested on a variety of composite document images with complex backgrounds, and the results compared very well against commercial OCR software. The proposed thresholding technique is not limited to a specific application: it can be used on a variety of document images with complex backgrounds and can be implemented in any document analysis system, provided that sufficient training is performed.
    Dept. of Electrical and Computer Engineering. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2004 .A445. Source: Dissertation Abstracts International, Volume: 66-02, Section: B, page: 1061. Advisers: Maher Sid-Ahmed; Majid Ahmadi. Thesis (Ph.D.)--University of Windsor (Canada), 2004.
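    A minimal sketch of the thresholding idea described above, assuming simple statistical window features (local mean, standard deviation, and range) rather than the thesis's exact feature set; the helper names and the use of separable filters to build the (2n + 1) window statistics are illustrative only.

```python
# Hedged sketch: per-pixel statistics from a (2n+1) x (2n+1) window feed
# an MLP that labels each pixel foreground (text) or background.
import numpy as np
from scipy.ndimage import uniform_filter, maximum_filter, minimum_filter
from sklearn.neural_network import MLPClassifier

def window_features(img, n=2):
    """Stack local mean, std, and range per pixel (window side 2n + 1)."""
    size = 2 * n + 1
    g = img.astype(float)
    mean = uniform_filter(g, size)
    sq_mean = uniform_filter(g ** 2, size)
    std = np.sqrt(np.maximum(sq_mean - mean ** 2, 0.0))
    rng = maximum_filter(img, size) - minimum_filter(img, size)
    return np.stack([mean, std, rng], axis=-1).reshape(-1, 3)

def threshold_with_mlp(train_img, train_mask, test_img, n=2):
    """Train on one labeled image (mask of 0/1 pixels), binarize another."""
    clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500)
    clf.fit(window_features(train_img, n), train_mask.ravel())
    labels = clf.predict(window_features(test_img, n))
    return labels.reshape(test_img.shape)
```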

    An analytical study on image databases

    Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1997. Includes bibliographical references (leaves 87-88). By Francine Ming Fang, M.Eng.

    Proceedings of the 2nd European conference on disability, virtual reality and associated technologies (ECDVRAT 1998)

    The proceedings of the conference.

    Entropy in Image Analysis II

    Image analysis is a fundamental task for any application where extracting information from images is required. The analysis requires highly sophisticated numerical and analytical methods, particularly for applications in medicine, security, and other fields where the results of the processing consist of data of vital importance. This is evident from the articles composing the Special Issue "Entropy in Image Analysis II", in which the authors used widely tested methods to verify their results. In reading the present volume, the reader will appreciate the richness of the methods and applications, particularly for medical imaging and image security, and the remarkable cross-fertilization among the proposed research areas.
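    For readers unfamiliar with the volume's central quantity, here is a minimal sketch of Shannon entropy computed over a grayscale image's intensity histogram; the `image_entropy` helper is illustrative and not drawn from any article in the Special Issue.

```python
# Shannon entropy (in bits) of an image's intensity distribution: a flat
# histogram gives high entropy, a near-constant image gives low entropy.
import numpy as np

def image_entropy(img, bins=256):
    """Shannon entropy in bits of the pixel-intensity histogram."""
    hist, _ = np.histogram(img, bins=bins, range=(0, bins))
    p = hist / hist.sum()
    p = p[p > 0]                      # ignore empty bins (0 * log 0 = 0)
    return float(-(p * np.log2(p)).sum())

noise = np.random.randint(0, 256, (128, 128))
flat = np.full((128, 128), 42)
print(image_entropy(noise))           # close to 8 bits
print(image_entropy(flat))            # 0 bits
```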

    Data Hiding and Its Applications

    Data hiding techniques have been widely used to provide copyright protection, data integrity, covert communication, non-repudiation, and authentication, among other applications. In the context of the increased dissemination and distribution of multimedia content over the internet, data hiding methods such as digital watermarking and steganography are becoming increasingly relevant to multimedia security. The goal of this book is to focus on the improvement of data hiding algorithms and their different applications (both traditional and emerging), bringing together researchers and practitioners from different fields, including data hiding, signal processing, cryptography, and information theory.
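    As a hedged illustration of one classic technique in this family, the sketch below embeds and recovers a bit string using least-significant-bit (LSB) steganography in a grayscale image; the helper names and the random cover image are hypothetical, and real systems discussed in the book use considerably more robust schemes.

```python
# LSB steganography sketch: hide payload bits in the least significant
# bit of each pixel, then read them back.
import numpy as np

def embed_lsb(cover, payload_bits):
    """Hide a 0/1 uint8 array in the LSBs of the flattened cover image."""
    flat = cover.flatten()                        # copy, cover untouched
    if len(payload_bits) > len(flat):
        raise ValueError("payload too large for cover image")
    n = len(payload_bits)
    flat[:n] = (flat[:n] & 0xFE) | payload_bits   # clear LSB, set payload bit
    return flat.reshape(cover.shape)

def extract_lsb(stego, n_bits):
    """Recover the first n_bits hidden by embed_lsb."""
    return stego.flatten()[:n_bits] & 1

cover = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
bits = np.random.randint(0, 2, 128, dtype=np.uint8)
stego = embed_lsb(cover, bits)
assert np.array_equal(extract_lsb(stego, 128), bits)
```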

    Perceptual data mining : bootstrapping visual intelligence from tracking behavior

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002. Includes bibliographical references (p. 161-166).
    One common characteristic of all intelligent life is continuous perceptual input. A decade ago, simply recording and storing a few minutes of full frame-rate NTSC video required special hardware. Today, an inexpensive personal computer can process video in real time, tracking and recording information about multiple objects for extended periods, which fundamentally enables this research. This thesis is about Perceptual Data Mining (PDM), the primary goal of which is to create a real-time, autonomous perception system that can be introduced into a wide variety of environments and, through experience, learn to model the activity in each environment. The PDM framework infers as much as possible about the presence, type, identity, location, appearance, and activity of each active object in an environment from multiple video sources, without explicit supervision. PDM is a bottom-up, data-driven approach built on a novel, robust attention mechanism that reliably detects moving objects in a wide variety of environments. A correspondence system tracks objects through time and across multiple sensors, producing sets of observations that correspond to the same object in extended environments. Using a co-occurrence modeling technique that exploits the variation exhibited by objects as they move through the environment, the framework models the types of objects, the activities they perform, and the appearance of specific classes of objects. Different applications of this technique are demonstrated, along with a discussion of the corresponding issues. Given the resulting rich description of the active objects in the environment, it is possible to model temporal patterns; an effective method for modeling periodic cycles of activity is demonstrated in multiple environments. The framework can learn to describe concisely the regularities of activity in an environment as well as flag atypical observations. Though this is accomplished without any supervision, a minimal amount of user interaction could be used to produce complex, task-specific perception systems.
    By Christopher P. Stauffer. Ph.D.
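    As a rough stand-in for the attention mechanism the abstract describes, the sketch below uses OpenCV's Gaussian-mixture background subtractor, which descends from the Stauffer-Grimson adaptive background model, to flag moving pixels and group them into candidate object observations; the video path and the area threshold are placeholders, and this is not the thesis's own code.

```python
# Moving-object detection via adaptive background mixture modeling:
# per-pixel foreground decisions, then connected components as candidate
# object observations for downstream tracking.
import cv2

cap = cv2.VideoCapture("video.mp4")               # placeholder path
subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=True)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)                # per-pixel fg/bg decision
    mask = cv2.medianBlur(mask, 5)                # suppress speckle noise
    n, labels, stats, _ = cv2.connectedComponentsWithStats(mask)
    for i in range(1, n):                         # label 0 is background
        x, y, w, h, area = stats[i]
        if area > 200:                            # ignore tiny blobs
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
cap.release()
```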

    Seeing, sensing, and selection: modeling visual perception in complex environments

    The purpose of this thesis is to investigate human visual perception at the level of eye movements by describing the interaction between vision and action during natural, everyday tasks in a real-world environment. The results of the investigation motivate the development of a biologically-based model of selective visual perception that relies on the relative perceptual conspicuity of regions within the field of view. Several experiments were designed and conducted that form the basis for the model. The experiments provide evidence that the visual system is not passive, nor is it general-purpose, but rather active and specific, tightly coupled to the requirements of planned behavior and action. The implication of an active, task-specific visual system is that an explicit representation of the environment can be eschewed in favor of a compact representation, with large potential savings in computational efficiency. The compact representation takes the form of a topographic map of relative perceptual conspicuity values. Other recent attempts at compact scene representations have focused mainly on low-level maps that code certain salient features of the scene, including color, edges, and luminance. This study found that such low-level maps do not correlate well with subjects' fixation locations; therefore, a map of perceptual conspicuity is presented that incorporates high-level information in the form of figure/ground segmentation, potential object detection, and task-specific location bias. The resulting model correlates well with the fixation densities of human viewers of natural scenes and can be used as a pre-processing module for image understanding or intelligent surveillance applications.
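    A minimal sketch of the kind of low-level conspicuity map the thesis evaluates as a baseline (local luminance contrast plus edge energy); the high-level cues the thesis adds, such as figure/ground segmentation, object detection, and task-specific bias, are deliberately not modeled here, and the helper name and window size are assumptions.

```python
# Low-level conspicuity baseline: center-surround luminance contrast
# combined with gradient magnitude, normalized to [0, 1].
import numpy as np
from scipy.ndimage import uniform_filter, sobel

def lowlevel_conspicuity(gray):
    """Topographic map of low-level conspicuity for a grayscale image."""
    g = gray.astype(float)
    mean = uniform_filter(g, 9)                   # local surround average
    contrast = np.abs(g - mean)                   # center-surround contrast
    edges = np.hypot(sobel(g, axis=0), sobel(g, axis=1))
    cmap = (contrast / (contrast.max() + 1e-9)
            + edges / (edges.max() + 1e-9))
    return cmap / cmap.max()
```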