6,525 research outputs found

    Hydra: An Accelerator for Real-Time Edge-Aware Permeability Filtering in 65nm CMOS

    Full text link
    Many modern video processing pipelines rely on edge-aware (EA) filtering methods. However, recent high-quality methods are challenging to run in real-time on embedded hardware due to their computational load. To this end, we propose an area-efficient and real-time capable hardware implementation of a high quality EA method. In particular, we focus on the recently proposed permeability filter (PF) that delivers promising quality and performance in the domains of HDR tone mapping, disparity and optical flow estimation. We present an efficient hardware accelerator that implements a tiled variant of the PF with low on-chip memory requirements and a significantly reduced external memory bandwidth (6.4x w.r.t. the non-tiled PF). The design has been taped out in 65 nm CMOS technology, is able to filter 720p grayscale video at 24.8 Hz and achieves a high compute density of 6.7 GFLOPS/mm2 (12x higher than embedded GPUs when scaled to the same technology node). The low area and bandwidth requirements make the accelerator highly suitable for integration into SoCs where silicon area budget is constrained and external memory is typically a heavily contended resource

    Self-organized learning in multi-layer networks

    Get PDF
    We present a framework for the self-organized formation of high level learning by a statistical preprocessing of features. The paper focuses first on the formation of the features in the context of layers of feature processing units as a kind of resource-restricted associative multiresolution learning We clame that such an architecture must reach maturity by basic statistical proportions, optimizing the information processing capabilities of each layer. The final symbolic output is learned by pure association of features of different levels and kind of sensorial input. Finally, we also show that common error-correction learning for motor skills can be accomplished also by non-specific associative learning. Keywords: feedforward network layers, maximal information gain, restricted Hebbian learning, cellular neural nets, evolutionary associative learnin

    Mixed-signal CNN array chips for image processing

    Get PDF
    Due to their local connectivity and wide functional capabilities, cellular nonlinear networks (CNN) are excellent candidates for the implementation of image processing algorithms using VLSI analog parallel arrays. However, the design of general purpose, programmable CNN chips with dimensions required for practical applications raises many challenging problems to analog designers. This is basically due to the fact that large silicon area means large development cost, large spatial deviations of design parameters and low production yield. CNN designers must face different issues to keep reasonable enough accuracy level and production yield together with reasonably low development cost in their design of large CNN chips. This paper outlines some of these major issues and their solutions

    SIRENA: A CAD environment for behavioural modelling and simulation of VLSI cellular neural network chips

    Get PDF
    This paper presents SIRENA, a CAD environment for the simulation and modelling of mixed-signal VLSI parallel processing chips based on cellular neural networks. SIRENA includes capabilities for: (a) the description of nominal and non-ideal operation of CNN analogue circuitry at the behavioural level; (b) performing realistic simulations of the transient evolution of physical CNNs including deviations due to second-order effects of the hardware; and, (c) evaluating sensitivity figures, and realize noise and Monte Carlo simulations in the time domain. These capabilities portray SIRENA as better suited for CNN chip development than algorithmic simulation packages (such as OpenSimulator, Sesame) or conventional neural networks simulators (RCS, GENESIS, SFINX), which are not oriented to the evaluation of hardware non-idealities. As compared to conventional electrical simulators (such as HSPICE or ELDO-FAS), SIRENA provides easier modelling of the hardware parasitics, a significant reduction in computation time, and similar accuracy levels. Consequently, iteration during the design procedure becomes possible, supporting decision making regarding design strategies and dimensioning. SIRENA has been developed using object-oriented programming techniques in C, and currently runs under the UNIX operating system and X-Windows framework. It employs a dedicated high-level hardware description language: DECEL, fitted to the description of non-idealities arising in CNN hardware. This language has been developed aiming generality, in the sense of making no restrictions on the network models that can be implemented. SIRENA is highly modular and composed of independent tools. This simplifies future expansions and improvements.Comisión Interministerial de Ciencia y Tecnología TIC96-1392-C02-0

    Real-Time Restoration of Images Degraded by Uniform Motion Blur in Foveal Active Vision Systems

    Full text link
    Foveated, log-polar, or space-variant image architectures provide a high resolution and wide field workspace, while providing a small pixel computation load. These characteristics are ideal for mobile robotic and active vision applications. Recently we have described a generalization of the Fourier Transform (the fast exponential chirp transform) which allows frame-rate computation of full-field 2D frequency transforms on a log-polar image format. In the present work, we use Wiener filtering, performed using the Exponential Chirp Transform, on log-polar (fovcated) image formats to de-blur images which have been degraded by uniform camera motion.Defense Advanced Research Projects Agency and Office of Naval Research (N00014-96-C-0178); Office of Naval Research Multidisciplinary University Research Initiative (N00014-95-1-0409

    Neuro-memristive Circuits for Edge Computing: A review

    Full text link
    The volume, veracity, variability, and velocity of data produced from the ever-increasing network of sensors connected to Internet pose challenges for power management, scalability, and sustainability of cloud computing infrastructure. Increasing the data processing capability of edge computing devices at lower power requirements can reduce several overheads for cloud computing solutions. This paper provides the review of neuromorphic CMOS-memristive architectures that can be integrated into edge computing devices. We discuss why the neuromorphic architectures are useful for edge devices and show the advantages, drawbacks and open problems in the field of neuro-memristive circuits for edge computing

    Image-based quantitative analysis of gold immunochromatographic strip via cellular neural network approach

    Get PDF
    "(c) 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other users, including reprinting/ republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works."Gold immunochromatographic strip assay provides a rapid, simple, single-copy and on-site way to detect the presence or absence of the target analyte. This paper aims to develop a method for accurately segmenting the test line and control line of the gold immunochromatographic strip (GICS) image for quantitatively determining the trace concentrations in the specimen, which can lead to more functional information than the traditional qualitative or semi-quantitative strip assay. The canny operator as well as the mathematical morphology method is used to detect and extract the GICS reading-window. Then, the test line and control line of the GICS reading-window are segmented by the cellular neural network (CNN) algorithm, where the template parameters of the CNN are designed by the switching particle swarm optimization (SPSO) algorithm for improving the performance of the CNN. It is shown that the SPSO-based CNN offers a robust method for accurately segmenting the test and control lines, and therefore serves as a novel image methodology for the interpretation of GICS. Furthermore, quantitative comparison is carried out among four algorithms in terms of the peak signal-to-noise ratio. It is concluded that the proposed CNN algorithm gives higher accuracy and the CNN is capable of parallelism and analog very-large-scale integration implementation within a remarkably efficient time
    corecore