
    Challenges in mixed-signal IC design of CNN chips in submicron CMOS

    Summary form only given. The contrast observed between the performance of artificial vision machines and "natural" vision systems is due to the inherent parallelism of the latter. In particular, the retina combines image sensing and parallel processing to reduce the amount of data transmitted for subsequent processing by the following stages of the human vision system. Industrial applications demand CMOS vision chips capable of flexible operation, with programmable features and standard interfacing to conventional equipment. The CNN Universal Machine (CNN-UM) is a powerful methodological framework for the systematic development of these chips. Basic system-level targets in the design of these chips are to increase the cell density and operation speed. As the technology scales down to submicron, all the lateral dimensions decrease by the scaling factor $\lambda$, and the vertical dimensions scale as $\lambda^{-a}$, where $a$ is typically around 1/2. Ideally, cell density $\propto \lambda^{2}$ and time constant $\propto \lambda^{-2}$. The article explains why this is not strictly true, and addresses the challenges involved in the design of CNN chips in submicron technologies. Comisión Interministerial de Ciencia y Tecnología TIC96-1392-C02-0
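The ideal density figure quoted in this abstract follows from a first-order geometric argument. The sketch below assumes that a cell's area is set purely by its lateral dimensions and that the scaling factor $\lambda$ is greater than 1; the symbols $L$, $A$, and $D$ (cell side, cell area, cell density) are introduced here for illustration and do not appear in the abstract.

```latex
% Lateral dimensions shrink by the scaling factor \lambda:
L' = \frac{L}{\lambda}
\quad\Longrightarrow\quad
A' = (L')^{2} = \frac{A}{\lambda^{2}}
\quad\Longrightarrow\quad
D' = \frac{1}{A'} = \lambda^{2} D .
```

The ideal time-constant figure, $\propto \lambda^{-2}$, is quoted from the abstract rather than derived here, because it depends on how capacitance and conductance are assumed to scale.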

    Playing with Duality: An Overview of Recent Primal-Dual Approaches for Solving Large-Scale Optimization Problems

    Optimization methods are at the core of many problems in signal/image processing, computer vision, and machine learning. For a long time, it has been recognized that looking at the dual of an optimization problem may drastically simplify its solution. Deriving efficient strategies which jointly bring into play the primal and the dual problems is, however, a more recent idea which has generated many important new contributions in recent years. These novel developments are grounded in recent advances in convex analysis, discrete optimization, parallel processing, and non-smooth optimization with emphasis on sparsity issues. In this paper, we aim at presenting the principles of primal-dual approaches, while giving an overview of numerical methods which have been proposed in different contexts. We show the benefits which can be drawn from primal-dual algorithms both for solving large-scale convex optimization problems and discrete ones, and we provide various application examples to illustrate their usefulness.

    A massively parallel SIMD processor for neural network and machine vision applications

    This thesis describes the MM32k, a massively parallel SIMD computer which is easy to program, high in performance, low in cost and effective for implementing highly parallel neural network architectures. The MM32k has 32,768 bit-serial processing elements, each of which has 512 bits of memory, and all of which are interconnected by a switching network. The entire system resides on a single PC-AT compatible card. It is programmed from the host computer using a C++ class library which supports variable-precision vector arithmetic. The MM32k also supports direct video input and output for machine vision applications.
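To illustrate what bit-serial, variable-precision vector arithmetic means in general, here is a minimal Python sketch. It is not the MM32k's class library or instruction set: the function name `bit_serial_add`, its parameters, and the use of plain Python lists to stand in for processing elements are all assumptions made for illustration. Inputs are assumed to be non-negative integers.

```python
from typing import List


def bit_serial_add(a: List[int], b: List[int], bits: int) -> List[int]:
    """Element-wise add of two vectors, processed one bit position at a time.

    Each list index plays the role of one processing element (PE).
    The result is truncated to `bits` bits, i.e. computed modulo 2**bits.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")

    n = len(a)
    carry = [0] * n   # one carry bit held per PE
    result = [0] * n

    for k in range(bits):        # one broadcast step per bit position
        for pe in range(n):      # conceptually simultaneous on every PE
            x = (a[pe] >> k) & 1
            y = (b[pe] >> k) & 1
            s = x ^ y ^ carry[pe]                          # full-adder sum bit
            carry[pe] = (x & y) | (carry[pe] & (x ^ y))    # full-adder carry out
            result[pe] |= s << k
    return result


# The precision is chosen per call: the same routine serves 8-bit or 12-bit operands.
print(bit_serial_add([3, 5, 255], [4, 7, 1], bits=8))
print(bit_serial_add([255], [1], bits=12))
```

In a sketch like this the cost of an operation grows with the number of bits rather than with the number of elements, which is the usual trade-off behind bit-serial SIMD designs: narrow operands finish sooner, and the precision can be picked to suit each computation.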

    A modified ART 1 algorithm more suitable for VLSI implementations

    This paper presents a modification to the original ART 1 algorithm (Carpenter and Grossberg, 1987a, A massively parallel architecture for a self-organizing neural pattern recognition machine, Computer Vision, Graphics, and Image Processing, 37, 54–115) that is conceptually similar, can be implemented in hardware with less sophisticated building blocks, and maintains the computational capabilities of the originally proposed algorithm. This modified ART 1 algorithm (which we call ART 1m here) is the result of hardware-motivated simplifications investigated during the design of an actual ART 1 chip [Serrano-Gotarredona et al., 1994, Proc. 1994 IEEE Int. Conf. Neural Networks (Vol. 3, pp. 1912–1916); Serrano-Gotarredona and Linares-Barranco, 1996, IEEE Trans. VLSI Systems, (in press)]. The purpose of this paper is simply to justify theoretically that the modified algorithm preserves the computational properties of the original one and to study the difference in behavior between the two approaches.

    Three Highly Parallel Computer Architectures and Their Suitability for Three Representative Artificial Intelligence Problems

    Virtually all current Artificial Intelligence (AI) applications are designed to run on sequential (von Neumann) computer architectures. As a result, current systems do not scale up. As knowledge is added to these systems, a point is reached where their performance quickly degrades. The performance of a von Neumann machine is limited by the bandwidth between memory and processor (the von Neumann bottleneck). The bottleneck is avoided by distributing the processing power across the memory of the computer. In this scheme the memory becomes the processor (a smart memory). This paper highlights the relationship between three representative AI application domains, namely knowledge representation, rule-based expert systems, and vision, and their parallel hardware realizations. Three machines, covering a wide range of fundamental properties of parallel processors, namely module granularity, concurrency control, and communication geometry, are reviewed: the Connection Machine (a fine-grained SIMD hypercube), DADO (a medium-grained MIMD/SIMD/MSIMD tree machine), and the Butterfly (a coarse-grained MIMD Butterfly-switch machine).

    FPGA Implementation of An Event-driven Saliency-based Selective Attention Model

    Artificial vision systems of autonomous agents face very difficult challenges, as their vision sensors are required to transmit vast amounts of information to the processing stages, and to process it in real time. A first approach to reducing data transmission is to use event-based vision sensors, whose pixels produce events only when there are changes in the input. However, even for event-based vision, transmission and processing of visual data can be quite onerous. Currently, these challenges are solved by using high-speed communication links and powerful machine vision processing hardware. But if resources are limited, instead of processing all the sensory information in parallel, an effective strategy is to divide the visual field into several small sub-regions, choose the region of highest saliency, process it, and serially shift the focus of attention to regions of decreasing saliency. This strategy, also commonly used by the visual system of many animals, is typically referred to as "selective attention". Here we present a digital architecture implementing a saliency-based selective visual attention model for processing asynchronous event-based sensory information received from a DVS. For ease of prototyping, we use a standard digital design flow and map the architecture on an FPGA. We describe the architecture block diagram, highlighting the efficient use of the available hardware resources, demonstrated through experimental results from a hardware setup in which the FPGA is interfaced with the DVS camera. Comment: 5 pages, 5 figures
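The selective-attention strategy described in this abstract (split the visual field into sub-regions, attend to the most salient one, then move on in order of decreasing saliency) can be sketched in a few lines of Python. This is an illustration only, not the paper's FPGA architecture: using the raw event count of a region as its saliency is an assumption, as are the function name `attention_order`, the square `region_size` tiling, and the reduction of each event to an `(x, y)` pixel address.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Event = Tuple[int, int]    # (x, y) pixel address; timestamp and polarity omitted
Region = Tuple[int, int]   # (column, row) index of a square sub-region


def attention_order(
    events: Iterable[Event], width: int, height: int, region_size: int
) -> List[Tuple[Region, int]]:
    """Rank sub-regions by decreasing saliency (here: number of events)."""
    counts: Counter = Counter()
    for x, y in events:
        if 0 <= x < width and 0 <= y < height:   # ignore out-of-range addresses
            counts[(x // region_size, y // region_size)] += 1
    # Highest count first; ties broken by region index for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


events = [(1, 1), (2, 3), (10, 10), (11, 9), (12, 12), (30, 2)]
for region, saliency in attention_order(events, width=32, height=32, region_size=8):
    # A real system would process the region here before shifting attention.
    print(region, saliency)
```

Only regions that received at least one event appear in the ranking, so the serial scan in this sketch skips inactive parts of the visual field entirely.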

    The SP theory of intelligence: benefits and applications

    This article describes existing and expected benefits of the "SP theory of intelligence", and some potential applications. The theory aims to simplify and integrate ideas across artificial intelligence, mainstream computing, and human perception and cognition, with information compression as a unifying theme. It combines conceptual simplicity with descriptive and explanatory power across several areas of computing and cognition. In the "SP machine", an expression of the SP theory which is currently realized in the form of a computer model, there is potential for an overall simplification of computing systems, including software. The SP theory promises deeper insights and better solutions in several areas of application including, most notably, unsupervised learning, natural language processing, autonomous robots, computer vision, intelligent databases, software engineering, information compression, medical diagnosis and big data. There is also potential in areas such as the semantic web, bioinformatics, structuring of documents, the detection of computer viruses, data fusion, new kinds of computer, and the development of scientific theories. The theory promises seamless integration of structures and functions within and between different areas of application. The potential value, worldwide, of these benefits and applications is at least $190 billion each year. Further development would be facilitated by the creation of a high-parallel, open-source version of the SP machine, available to researchers everywhere. Comment: arXiv admin note: substantial text overlap with arXiv:1212.022

    Object Recognition Using Convolutional Neural Networks

    This chapter presents the main techniques for detecting objects within images. In recent years there have been remarkable advances in areas such as machine learning and pattern recognition, both using convolutional neural networks (CNNs). This progress is mainly due to the increased parallel processing power provided by graphics processing units (GPUs). In this chapter, the reader will learn the details of the state-of-the-art algorithms for object detection in images, namely, faster region convolutional neural network (Faster RCNN), you only look once (YOLO), and single shot multibox detector (SSD). We will present the advantages and disadvantages of each technique based on a series of comparative tests. For this, we will use metrics such as accuracy, training difficulty, and the characteristics required to implement the algorithms. In this chapter, we intend to contribute to a better understanding of the state of the art in machine learning and convolutional networks for solving problems involving computational vision and object detection.