129 research outputs found

    Review of CMOS implementations of the CNN universal machine-type visual microprocessors

    While in most application areas digital processors can solve problems initially, in some fields their capabilities are very limited. A typical example is vision. Simple animals outperform super-computers in the realization of basic vision tasks. In order to overcome the limitations of these conventional systems, a fundamentally different array architecture is needed. This architecture is based on the new paradigm of analogic cellular (CNN) computing whose most advanced implementation is the so-called CNN universal machine (CNN-UM). Its main components are: a) parallel architecture consisting of an array of locally-connected analog processors; b) a means of storing, locally, pixel-by-pixel, the intermediate computation results, and c) stored on-chip programmability. When implemented as a mixed-signal VLSI chip, the CNN-UM is capable of image processing at rates of trillions of operations per second with very small size and low power consumption. On the other hand, when integrating the adaptive multi-sensor array in the CNN-UM, the resulting sensor+computer array offers unprecedented capabilities. This paper reviews the latest results on CNN-UM chips and systems, and outlines the envisaged roadmap for these computers. Funding: European Union IST-1999-19007; Comisión Interministerial de Ciencia y Tecnología TIC99-082

    Perspectives for Monte Carlo simulations on the CNN Universal Machine

    Possibilities for performing stochastic simulations on the analog and fully parallelized Cellular Neural Network Universal Machine (CNN-UM) are investigated. By using a chaotic cellular automaton perturbed with the natural noise of the CNN-UM chip, a realistic binary random number generator is built. As a specific example for Monte Carlo type simulations, we use this random number generator and a CNN template to study the classical site-percolation problem on the ACE16K chip. The study reveals that the analog and parallel architecture of the CNN-UM is very appropriate for stochastic simulations on lattice models. The natural trend of increasing the number of cells and local memories on the CNN-UM chip will definitely favor the CNN-UM architecture for such problems in the near future. Comment: 14 pages, 6 figures
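    The site-percolation experiment mentioned in this abstract can be illustrated in ordinary software. The following is a minimal sketch only: it assumes a square lattice with 4-neighbour connectivity and top-to-bottom spanning, and it swaps in Python's pseudo-random generator for the chip's chaotic cellular automaton. The function names (`random_lattice`, `percolates`, `percolation_probability`) are hypothetical and do not come from the paper, and nothing here models the CNN template or the ACE16K hardware.

```python
import random
from collections import deque


def random_lattice(n, p, rng):
    """Occupy each site of an n x n lattice independently with probability p."""
    return [[rng.random() < p for _ in range(n)] for _ in range(n)]


def percolates(lattice):
    """Return True if occupied sites connect the top row to the bottom row.

    Assumption: sites are connected through their 4 nearest neighbours.
    """
    n = len(lattice)
    seen = [[False] * n for _ in range(n)]
    queue = deque()
    # Seed the breadth-first search with every occupied site in the top row.
    for c in range(n):
        if lattice[0][c]:
            seen[0][c] = True
            queue.append((0, c))
    while queue:
        r, c = queue.popleft()
        if r == n - 1:
            return True
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < n and 0 <= cc < n and lattice[rr][cc] and not seen[rr][cc]:
                seen[rr][cc] = True
                queue.append((rr, cc))
    return False


def percolation_probability(n, p, trials, seed=0):
    """Monte Carlo estimate of the spanning probability at occupation p."""
    rng = random.Random(seed)
    hits = sum(percolates(random_lattice(n, p, rng)) for _ in range(trials))
    return hits / trials
```

    Sweeping `p` over a range of values and plotting `percolation_probability` against it gives the usual sigmoid-shaped spanning curve; on a cellular array, the lattice generation and the connectivity test would each run in parallel across all cells rather than site by site as here.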

    Split and Shift Methodology: Overcoming Hardware Limitations on Cellular Processor Arrays for Image Processing

    In the multimedia era, image processing has become an element of singular importance in electronic devices. From communications (e.g. telemedicine), security (e.g. retinal recognition) and quality and industrial process control (e.g. orientation of articulated arms, detection of product defects), through research (e.g. tracking of elementary particles) and medical diagnosis (e.g. detection of foreign cells, identification of retinal veins), there is an endless number of applications in which automatic image processing and interpretation is fundamental. The ultimate goal is the design of vision systems with decision-making capability. Current trends also require these capabilities to be combined in small, portable devices with real-time response. This poses new challenges in both hardware and software design for image processing, calling for new structures or architectures with the smallest possible area and energy consumption that do not compromise functionality or performance.

    Aspects of algorithms and dynamics of cellular paradigms

    Cellular paradigms, like Cellular Neural Networks (CNNs) and Cellular Automata (CA), are an excellent tool to perform computation, since they are equivalent to a Universal Turing machine. The introduction of the Cellular Neural Network - Universal Machine (CNN-UM) allowed us to develop hardware whose computational core works according to the principles of cellular paradigms; such hardware has found application in a number of fields throughout the last decade. Nevertheless, there are still many open questions about how to define algorithms for a CNN-UM, and how to study the dynamics of Cellular Automata. In this dissertation both problems are tackled: first, we prove that it is possible to bound the space of all algorithms of the CNN-UM and explore it through genetic techniques; second, we explain the fundamentals of the nonlinear perspective of CA (according to Chua's definition), and we illustrate how this technique has allowed us to find novel results.

    Dynamically reconfigurable architecture for embedded computer vision systems

    The objective of this research work is to design, develop and implement a new architecture which integrates on the same chip all the processing levels of a complete Computer Vision system, so that the execution is efficient without compromising the power consumption while keeping a reduced cost. For this purpose, an analysis and classification of different mathematical operations and algorithms commonly used in Computer Vision are carried out, as well as an in-depth review of the image processing capabilities of current-generation hardware devices. This makes it possible to determine the requirements and the key aspects for an efficient architecture. A representative set of algorithms is employed as a benchmark to evaluate the proposed architecture, which is implemented on an FPGA-based system-on-chip. Finally, the prototype is compared to other related approaches in order to determine its advantages and weaknesses.

    A Decade of Neural Networks: Practical Applications and Prospects

    The Jet Propulsion Laboratory Neural Network Workshop, sponsored by NASA and DOD, brings together sponsoring agencies, active researchers, and the user community to formulate a vision for the next decade of neural network research and application prospects. While the speed and computing power of microprocessors continue to grow at an ever-increasing pace, the demand to intelligently and adaptively deal with the complex, fuzzy, and often ill-defined world around us remains to a large extent unaddressed. Powerful, highly parallel computing paradigms such as neural networks promise to have a major impact in addressing these needs. Papers in the workshop proceedings highlight benefits of neural networks in real-world applications compared to conventional computing techniques. Topics include fault diagnosis, pattern recognition, and multiparameter optimization

    An Optoelectronic Stimulator for Retinal Prosthesis

    Retinal prostheses require the presence of a viable population of cells in the inner retina. Evaluations of retinas with Age-Related Macular Degeneration (AMD) and Retinitis Pigmentosa (RP) have shown that a large number of cells remain in the inner retina compared with the outer retina. Therefore, vision loss caused by AMD and RP is potentially treatable with retinal prostheses. Photostimulation-based retinal prostheses have shown many advantages compared with retinal implants. In contrast to electrode-based stimulation, light does not require mechanical contact. Therefore, the system can be completely external and does not have the power and degradation problems of implanted devices. In addition, the stimulating point is flexible and does not require a prior decision on the stimulation location. Furthermore, a beam of light can be projected on tissue with both temporal and spatial precision. This thesis aims at finding a feasible solution to such a system. Firstly, a prototype of an optoelectronic stimulator was proposed and implemented by using the Xilinx Virtex-4 FPGA evaluation board. The platform was used to demonstrate the possibility of photostimulation of the photosensitized neurons. Meanwhile, with the aim of developing a portable retinal prosthesis, a system on chip (SoC) architecture was proposed, and a wide tuning range sinusoidal voltage-controlled oscillator (VCO), which is the pivotal component of the system, was designed. The VCO is based on a newly designed Complementary Metal Oxide Semiconductor (CMOS) Operational Transconductance Amplifier (OTA) which achieves good linearity over a wide tuning range. Both the OTA and the VCO were fabricated in the AMS 0.35 µm CMOS process. Finally, a 9×9 CMOS image sensor with spiking pixels was designed. Each pixel acts as an independent oscillator whose frequency is controlled by the incident light intensity. The sensor was fabricated in the AMS 0.35 µm CMOS Opto Process. Experimental validation and measured results are provided.

    Efficient hardware implementations of bio-inspired networks

    The human brain, with its massive computational capability and power efficiency in small form factor, continues to inspire the ultimate goal of building machines that can perform tasks without being explicitly programmed. In an effort to mimic the natural information processing paradigms observed in the brain, several neural network generations have been proposed over the years. Among the neural networks inspired by biology, second-generation Artificial or Deep Neural Networks (ANNs/DNNs) use memoryless neuron models and have shown unprecedented success surpassing humans in a wide variety of tasks. Unlike ANNs, third-generation Spiking Neural Networks (SNNs) closely mimic biological neurons by operating on discrete and sparse events in time called spikes, which are obtained by the time integration of previous inputs. Implementation of data-intensive neural network models on computers based on the von Neumann architecture is mainly limited by the continuous data transfer between the physically separated memory and processing units. Hence, non-von Neumann architectural solutions are essential for processing these memory-intensive bio-inspired neural networks in an energy-efficient manner. Among the non-von Neumann architectures, implementations employing non-volatile memory (NVM) devices are most promising due to their compact size and low operating power. However, it is non-trivial to integrate these nanoscale devices on conventional computational substrates due to their non-idealities, such as limited dynamic range, finite bit resolution, programming variability, etc. This dissertation demonstrates the architectural and algorithmic optimizations of implementing bio-inspired neural networks using emerging nanoscale devices. The first half of the dissertation focuses on the hardware acceleration of DNN implementations. A 4-layer stochastic DNN in a crossbar architecture with memristive devices at the cross point is analyzed for accelerating DNN training. This network is then used as a baseline to explore the impact of experimental memristive device behavior on network performance. Programming variability is found to have a critical role in determining network performance compared to other non-ideal characteristics of the devices. In addition, noise-resilient inference engines are demonstrated using stochastic memristive DNNs with 100 bits for stochastic encoding during inference and 10 bits for the expensive training. The second half of the dissertation focuses on a novel probabilistic framework for SNNs using the Generalized Linear Model (GLM) neurons for capturing neuronal behavior. This work demonstrates that probabilistic SNNs have comparable performance against equivalent ANNs on two popular benchmarks: handwritten-digit classification and human activity recognition. Considering the potential of SNNs in energy-efficient implementations, a hardware accelerator for inference is proposed, termed Spintronic Accelerator for Probabilistic SNNs (SpinAPS). The learning algorithm is optimized for a hardware-friendly implementation and uses a first-to-spike decoding scheme for low-latency inference. With binary spintronic synapses and digital CMOS logic neurons for computations, SpinAPS achieves a performance improvement of 4x in terms of GSOPS/W/mm² when compared to a conventional SRAM-based design. Collectively, this work demonstrates the potential of emerging memory technologies in building energy-efficient hardware architectures for deep and spiking neural networks. The design strategies adopted in this work can be extended to other spike and non-spike based systems for building embedded solutions having power/energy constraints.
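    The first-to-spike decoding scheme named in this abstract can be illustrated with a small sketch. It assumes each output neuron's activity is given as a list of 0/1 values over discrete time steps, and that ties within a time step go to the lowest neuron index; both are illustrative assumptions, and the function name `first_to_spike` is hypothetical rather than taken from the dissertation.

```python
def first_to_spike(spike_trains):
    """Return (class_index, time_step) of the earliest output spike.

    spike_trains: one list of 0/1 values per output neuron (assumed format).
    Ties within a time step go to the lowest neuron index (an assumption).
    Returns (None, None) if no neuron spikes at all.
    """
    n_steps = max((len(train) for train in spike_trains), default=0)
    for t in range(n_steps):
        for k, train in enumerate(spike_trains):
            if t < len(train) and train[t]:
                # Inference can stop here; later time steps are never examined.
                return k, t
    return None, None
```

    Because the decision is taken at the first output spike, the number of time steps processed depends on the input, which is the intuition behind the low-latency claim.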