
    A survey of visual preprocessing and shape representation techniques

    Many recent theories and methods proposed for visual preprocessing and shape representation are summarized. The survey brings together research from the fields of biology, psychology, computer science, electrical engineering, and, most recently, neural networks. It was motivated by the need to preprocess images for a sparse distributed memory (SDM), but the techniques presented may also prove useful for applying other associative memories to visual pattern recognition. The material of this survey is divided into three sections: an overview of biological visual processing; methods of preprocessing (extracting parts of shape, texture, motion, and depth); and shape representation and recognition (form invariance, primitives and structural descriptions, and theories of attention).

    Theoretical quantification of shape distortion in fuzzy hough transform

    We present a generalization of the classical Hough transform in a fuzzy set theoretic framework (called the fuzzy Hough transform, or FHT) in order to handle imprecision and ill-definedness in shape description. In addition to identifying the shapes, the methodology can quantify the amount of distortion present in each shape by suitably characterizing the parametric space. We extend FHT to gray level images (gray FHT) in order to handle gray level variation along with shape distortion. The gray FHT gives rise to a scheme for image segmentation based on a priori knowledge about the shapes.
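    As an illustration only (not the paper's exact formulation), the sketch below shows how fuzzy voting in a line-detecting Hough accumulator might look: each edge point spreads a Gaussian-shaped membership over neighbouring rho bins instead of casting a single crisp vote, so moderately distorted shapes still reinforce the correct peak. The accumulator layout, the kernel width sigma, and the 3-bin spread are assumptions made for this sketch.

        import numpy as np

        def fuzzy_hough_lines(edge_points, img_diag, n_theta=180, n_rho=200, sigma=1.5):
            """Accumulate fuzzy votes for lines rho = x*cos(theta) + y*sin(theta).

            Each edge point spreads a Gaussian-shaped membership over nearby rho
            bins, so small shape distortions still contribute to the right peak.
            """
            thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
            rhos = np.linspace(-img_diag, img_diag, n_rho)
            rho_step = rhos[1] - rhos[0]
            acc = np.zeros((n_theta, n_rho))

            for (x, y) in edge_points:
                for ti, theta in enumerate(thetas):
                    rho = x * np.cos(theta) + y * np.sin(theta)
                    centre = int(round((rho - rhos[0]) / rho_step))
                    # fuzzy membership: spread the vote over neighbouring rho bins
                    for ri in range(max(0, centre - 3), min(n_rho, centre + 4)):
                        d = (rhos[ri] - rho) / (sigma * rho_step)
                        acc[ti, ri] += np.exp(-0.5 * d * d)
            return acc, thetas, rhos

    Peaks in the returned accumulator correspond to candidate lines; the spread of membership around a peak is one possible proxy for the kind of shape distortion the paper quantifies.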

    Three Highly Parallel Computer Architectures and Their Suitability for Three Representative Artificial Intelligence Problems

    Virtually all current Artificial Intelligence (AI) applications are designed to run on sequential (von Neumann) computer architectures. As a result, current systems do not scale up: as knowledge is added to these systems, a point is reached where their performance quickly degrades. The performance of a von Neumann machine is limited by the bandwidth between memory and processor (the von Neumann bottleneck). The bottleneck is avoided by distributing the processing power across the memory of the computer; in this scheme the memory becomes the processor (a "smart memory"). This paper highlights the relationship between three representative AI application domains, namely knowledge representation, rule-based expert systems, and vision, and their parallel hardware realizations. Three machines, covering a wide range of fundamental properties of parallel processors, namely module granularity, concurrency control, and communication geometry, are reviewed: the Connection Machine (a fine-grained SIMD hypercube), DADO (a medium-grained MIMD/SIMD/MSIMD tree machine), and the Butterfly (a coarse-grained MIMD butterfly-switch machine).
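    As a toy illustration of the "smart memory" idea (a hypothetical NumPy simulation, not the instruction set of any of the reviewed machines), the sketch below models a fine-grained SIMD array in the spirit of the Connection Machine: every memory cell carries an activity flag, and each broadcast instruction updates all active cells in lockstep instead of funnelling data through a single central processor.

        import numpy as np

        class SmartMemory:
            """Fine-grained SIMD 'smart memory': one notional processor per cell."""

            def __init__(self, values):
                self.cells = np.asarray(values, dtype=float)
                self.active = np.ones_like(self.cells, dtype=bool)

            def select(self, predicate):
                # Activate only the cells whose value satisfies the predicate.
                self.active = predicate(self.cells)

            def broadcast(self, op):
                # Apply one instruction to every active cell simultaneously.
                self.cells = np.where(self.active, op(self.cells), self.cells)

        mem = SmartMemory(np.arange(8))
        mem.select(lambda v: v % 2 == 0)   # all even-valued cells respond
        mem.broadcast(lambda v: v * 10)    # single instruction, many cells
        print(mem.cells)                   # [ 0.  1. 20.  3. 40.  5. 60.  7.]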

    Framework of hierarchy for neural theory


    Historical Document Analysis

    Scanned documents are a rich source of information that can be processed by a document analysis system. Such a system covers the areas of Machine Learning, Computer Vision, and Natural Language Processing. In the thesis, these areas are covered with a focus on common and state-of-the-art approaches applicable to historical document analysis, which remains challenging due to difficulties such as handwritten text. Finally, the current research results and the aims of the future doctoral thesis are presented.

    Detecting Multilingual Lines of Text with Fusion Moves

    This thesis proposes an optimization-based algorithm for detecting lines of text in images taken by hand-held cameras. The majority of existing methods for this problem assume alphabet-based texts (e.g. in Latin or Greek) and use heuristics specific to such texts: proximity between letters within one line, larger distance between separate lines, etc. We are interested in a more challenging problem where images combine alphabetic and logographic characters from multiple languages whose typographic rules vary widely (e.g. English, Korean, and Chinese). The significantly higher complexity of fitting multiple lines of text in different languages calls for an energy-based formulation combining a data fidelity term and a regularization prior. Our data cost combines geometric errors with likelihoods given by a classifier trained on low-level features of each language. Our regularization term encourages sparsity based on label costs. The energy can be efficiently minimized by fusion moves. The algorithm was evaluated on a database of images from the Seoul metropolitan subway and was shown to be robust.
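    A minimal sketch of what such an energy might look like, under the assumption that each character is assigned to one candidate line label, the data cost adds a geometric fit error to a classifier's negative log-likelihood, and a per-label cost penalizes every line actually used (the exact terms and weights in the thesis may differ):

        import numpy as np

        def line_energy(assignments, geom_error, class_nll, label_cost, lam=1.0):
            """Evaluate E = sum_p [ geom_error(p, l_p) + lam * class_nll(p, l_p) ]
                            + label_cost * (number of distinct lines used),
            where assignments[p] is the line label given to character p and
            geom_error, class_nll are arrays of shape (n_chars, n_labels)."""
            assignments = np.asarray(assignments)
            idx = np.arange(len(assignments))
            data_term = geom_error[idx, assignments] + lam * class_nll[idx, assignments]
            n_used = len(np.unique(assignments))
            return data_term.sum() + label_cost * n_used

    A fusion move then combines the current labeling with a proposal labeling by choosing, per character, one of the two labels via a binary optimization step (commonly graph cuts or QPBO), and the fused labeling is kept when it lowers this energy.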