60 research outputs found
Challenges in mixed-signal IC design of CNN chips in submicron CMOS
Summary form only given. The contrast observed between the performance of artificial vision machines and "natural" vision systems is due to the inherent parallelism of the latter. In particular, the retina combines image sensing and parallel processing to reduce the amount of data transmitted for subsequent processing by the following stages of the human vision system. Industrial applications demand CMOS vision chips capable of flexible operation, with programmable features and standard interfacing to conventional equipment. The CNN Universal Machine (CNN-UM) is a powerful methodological framework for the systematic development of these chips. Basic system-level targets in the design of these chips are to increase the cell density and operation speed. As the technology scales down to submicron, all the lateral dimensions decrease by the scaling factor λ, and the vertical dimensions scale as λ^(-a), where a is typically around 1/2. Ideally, cell density ∝ λ^2 and time constant ∝ λ^(-2). The article explains why this is not strictly true, and addresses the challenges involved in the design of CNN chips in submicron technologies. Comisión Interministerial de Ciencia y Tecnología TIC96-1392-C02-0
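The ideal scaling relations quoted in the abstract above can be illustrated with a short calculation. This is a minimal sketch, assuming that λ > 1 denotes the factor by which lateral dimensions shrink; the function name and the 0.7 μm to 0.35 μm example nodes are hypothetical and are not taken from the article, which itself argues that real designs deviate from these ideal figures.

```python
def ideal_scaling(old_feature_um, new_feature_um):
    """Return (lambda, density gain, time-constant factor) under ideal scaling.

    Assumption: lambda > 1 is the lateral shrink factor, so that
    cell density scales as lambda^2 and the time constant as lambda^(-2).
    """
    lam = old_feature_um / new_feature_um  # lateral shrink factor
    density_gain = lam ** 2                # cell density is proportional to lambda^2
    tau_factor = lam ** -2                 # time constant is proportional to lambda^(-2)
    return lam, density_gain, tau_factor


# Hypothetical example: moving a design from a 0.7 um to a 0.35 um process.
lam, density_gain, tau_factor = ideal_scaling(0.7, 0.35)
print(f"lambda = {lam:.2f}")
print(f"ideal cell density gain = x{density_gain:.2f}")
print(f"ideal time constant factor = x{tau_factor:.2f}")
```

Under these ideal assumptions, halving the feature size would quadruple the cell density and cut the time constant to a quarter.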
A mixed-signal early vision chip with embedded image and programming memories and digital I/O
From a system-level perspective, this paper presents a 128 × 128 flexible and reconfigurable Focal-Plane Analog Programmable Array Processor, which has been designed as a single chip in a 0.35 μm standard digital 1P-5M CMOS technology. The core processing array has been designed to achieve high operation speed and sufficient accuracy (∼7 bit) with low power consumption. The chip includes on-chip program memory to allow for the execution of complex, sequential and/or bifurcation-flow image processing algorithms. It also includes the structures and circuits needed to guarantee its embedding into conventional digital hosting systems: external data interchange and control are completely digital. The chip contains close to four million transistors, 90% of them working in analog mode. The chip features up to 330 GOPS (giga operations per second), and uses the power supply (180 GOP/Joule) and the silicon area (3.8 GOPS/mm2) efficiently, as it is able to maintain VGA processing throughputs of 100 frames/s with about 15 basic image processing tasks on each frame.
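The three headline figures in the abstract above can be combined into a rough consistency check. This is only a sketch: it assumes that the peak throughput, the energy efficiency and the area efficiency all refer to the same operating point, which the abstract does not state, so the derived power and area are implied estimates and not reported values. The variable names are illustrative.

```python
# Figures quoted in the abstract.
PEAK_GOPS = 330.0      # peak throughput, giga operations per second
GOP_PER_JOULE = 180.0  # energy efficiency
GOPS_PER_MM2 = 3.8     # area efficiency

# (GOP/s) / (GOP/J) = J/s = W
implied_power_w = PEAK_GOPS / GOP_PER_JOULE
# (GOP/s) / (GOP/s per mm^2) = mm^2
implied_area_mm2 = PEAK_GOPS / GOPS_PER_MM2

print(f"implied power at peak throughput: {implied_power_w:.2f} W")
print(f"implied silicon area: {implied_area_mm2:.1f} mm^2")
```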
A one-transistor-synapse strategy for electrically-programmable massively-parallel analog array processors
This paper presents a linear, four-quadrant, electrically programmable, one-transistor synapse strategy applicable to the implementation of general massively parallel analog processors in CMOS technology. It is especially suited for translationally invariant processing arrays with local connectivity, and results in a significant reduction in the area occupation and power dissipation of the basic processing units. This allows higher integration densities and therefore permits the integration of larger arrays on a single chip. Comisión Interministerial de Ciencia y Tecnología TIC96-1392-C02-0
CMOS optical-sensor array with high output current levels and automatic signal-range centring
A CMOS-compatible photosensor with high output current levels, and an area-efficient scheme for automatic signal-range centring according to illumination conditions, are presented. The high output current levels allow the use of these devices in continuous-time asynchronous imagers, as well as in high-sampling-frequency applications.
Hybrid-control of synapse circuits for programmable cellular neural networks
This paper describes a hybrid weight-control strategy for VLSI realizations of programmable Cellular Neural Networks (CNNs), based on auto-tuning of analog control signals to digitally specified values. The approach merges the advantages of digital and analog programmability: it achieves low area and a reduced number of control lines, simplifies the control and storage of weight values, and eliminates their dependency on global process-parameter variations.
Four-quadrant one-transistor-synapse for high-density CNN implementations
Presents a linear, four-quadrant, electrically programmable, one-transistor synapse strategy applicable to the implementation of general massively parallel analog processors in CMOS technology. It is especially suited for translationally invariant processing arrays with local connectivity, and results in a significant reduction in the area occupation and power dissipation of the basic processing units. This allows higher integration densities and therefore permits the integration of larger arrays on a single chip. Comisión Interministerial de Ciencia y Tecnología TIC96-1392-C02-0
Weight-control strategy for programmable CNN chips
This paper describes a hybrid weight-control strategy for the VLSI realization of programmable CNNs, based on automatic adaptation of analog control signals to levels specified by digital words. This approach merges the advantages of digital and analog programmability: it achieves low area and a reduced number of control lines, simplifies the control and storage of the weight values, and eliminates their dependency on global process-parameter variations.
ACE16K: A 128×128 focal plane analog processor with digital I/O
This paper presents, from a system-level perspective, a new-generation 128×128 focal-plane analog programmable array processor (FPAPAP), which has been manufactured in a 0.35 μm standard digital 1P-5M CMOS technology. The chip has been designed to achieve the high-speed and moderate-accuracy (8b) requirements of most real-time early-vision processing applications. It is easily embedded in conventional digital hosting systems: external data interchange and control are completely digital. The chip contains close to four million transistors, 90% of them working in analog mode, and exhibits a relatively low power consumption (<4 W, i.e. less than 1 μW per transistor). Computing vs. power peak values are on the order of 1 TeraOPS/W, while maintained VGA processing throughputs of 100 frames/s are possible with about 10-20 basic image processing tasks on each frame.
A processing element architecture for high-density focal plane analog programmable array processors
The architecture of the elementary Processing Element (PE) used in a recently designed 128×128 Focal Plane Analog Programmable Array Processor is presented. The PE architecture contains the required building blocks to implement bifurcated data flow vision algorithms based on the execution of 3 × 3 convolution masks. The vision chip has been implemented in a standard 0.35 μm CMOS technology. The main PE-related figures are: 180 cells/mm2, 18 MOPS/cell, and 180 μW/cell. Office of Naval Research (USA) N68171-98-C-9004; European Union IST-1999-19007; Comisión Interministerial de Ciencia y Tecnología TIC1999-082
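The per-cell figures in the abstract above can be scaled up to the full 128×128 array. This is a rough sketch, assuming that every cell runs at the quoted per-cell rate at the same time and that the density figure covers the cell array only (no pads, memories or I/O circuitry); neither assumption is stated in the abstract, and the variable names are illustrative.

```python
# Per-cell figures quoted in the abstract.
ROWS, COLS = 128, 128
CELLS_PER_MM2 = 180.0
OPS_PER_CELL = 18e6         # 18 MOPS per cell
POWER_PER_CELL_W = 180e-6   # 180 uW per cell

cells = ROWS * COLS
array_area_mm2 = cells / CELLS_PER_MM2
array_gops = cells * OPS_PER_CELL / 1e9
array_power_w = cells * POWER_PER_CELL_W
# Energy efficiency does not depend on the array size.
gop_per_joule = OPS_PER_CELL / POWER_PER_CELL_W / 1e9

print(f"cells: {cells}")
print(f"array area: {array_area_mm2:.1f} mm^2")
print(f"aggregate throughput: {array_gops:.1f} GOPS")
print(f"aggregate power: {array_power_w:.2f} W")
print(f"energy efficiency: {gop_per_joule:.0f} GOP/J")
```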
Current-Mode Techniques for the Implementation of Continuous- and Discrete-Time Cellular Neural Networks
This paper presents a unified, comprehensive approach to the design of continuous-time (CT) and discrete-time (DT) cellular neural networks (CNN) using CMOS current-mode analog techniques. The net input signals are currents instead of voltages as presented in previous approaches, thus avoiding the need for current-to-voltage dedicated interfaces in image processing tasks with photosensor devices. Outputs may be either currents or voltages. Cell design relies on exploitation of current mirror properties for the efficient implementation of both linear and nonlinear analog operators. These cells are simpler and easier to design than those found in previously reported CT and DT-CNN devices. Basic design issues are covered, together with discussions on the influence of nonidealities and advanced circuit design issues as well as design for manufacturability considerations associated with statistical analysis. Three prototypes have been designed for 1.6-μm n-well CMOS technologies. One is discrete-time and can be reconfigured via local logic for noise removal, feature extraction (borders and edges), shadow detection, hole filling, and connected component detection (CCD) on a rectangular grid with unity neighborhood radius. The other two prototypes are continuous-time and fixed template: one for CCD and the other for noise removal. Experimental results are given illustrating the performance of these prototypes.
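To make the idea of a discrete-time CNN on a rectangular grid with unity neighborhood radius more concrete, the following is a minimal behavioural sketch in software. It models only the standard DT-CNN update rule (a 3×3 feedback template A, a 3×3 control template B, a bias I and a hard-limiting output) and says nothing about the current-mode circuits of the paper. The function names, the fixed white boundary and the edge-extraction template values are assumptions chosen for illustration, not values taken from the paper.

```python
def sign(x):
    """Hard-limiting output: +1 (black) for x >= 0, otherwise -1 (white)."""
    return 1 if x >= 0 else -1


def dtcnn_step(y, u, A, B, I, boundary=-1):
    """One synchronous DT-CNN update with neighborhood radius 1.

    y: current outputs, u: inputs (both grids of +1/-1 values),
    A: 3x3 feedback template, B: 3x3 control template, I: bias.
    """
    rows, cols = len(u), len(u[0])

    def at(grid, r, c):
        # Cells outside the grid take a fixed boundary value (assumption).
        if 0 <= r < rows and 0 <= c < cols:
            return grid[r][c]
        return boundary

    new_y = []
    for r in range(rows):
        new_row = []
        for c in range(cols):
            x = I
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    x += A[dr + 1][dc + 1] * at(y, r + dr, c + dc)
                    x += B[dr + 1][dc + 1] * at(u, r + dr, c + dc)
            new_row.append(sign(x))
        new_y.append(new_row)
    return new_y


# Illustrative edge-extraction template (assumed values, feedforward only).
A_EDGE = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
B_EDGE = [[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]]
I_EDGE = -1

# 5x5 test image: a 3x3 black square (+1) on a white background (-1).
u = [
    [-1, -1, -1, -1, -1],
    [-1, 1, 1, 1, -1],
    [-1, 1, 1, 1, -1],
    [-1, 1, 1, 1, -1],
    [-1, -1, -1, -1, -1],
]

y = dtcnn_step(u, u, A_EDGE, B_EDGE, I_EDGE)
for row in y:
    print("".join("#" if v == 1 else "." for v in row))
```

With this template a black pixel stays black only if at least one of its eight neighbors is white, so the interior pixel of the square is cleared and its outline is kept. Tasks such as hole filling or connected component detection would use a nonzero feedback template A and several iterations of the same update.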
- …