11,955 research outputs found

    Extending systems-on-chip to the third dimension : performance, cost and technological tradeoffs.

    Get PDF
    Because of the today's market demand for high-performance, high-density portable hand-held applications, electronic system design technology has shifted the focus from 2-D planar SoC single-chip solutions to different alternative options as tiled silicon and single-level embedded modules as well as 3-D integration. Among the various choices, finding an optimal solution for system implementation dealt usually with cost, performance and other technological trade-off analysis at the system conceptual level. It has been identified that the decisions made within the first 20% of the total design cycle time will ultimately result up to 80% of the final product cost. In this paper, we discuss appropriate and realistic metric for performance and cost trade-off analysis both at system conceptual level (up-front in the design phase) and at implementation phase for verification in the three-dimensional integration. In order to validate the methodology, two ubiquitous electronic systems are analyzed under various implementation schemes and discuss the pros and cons of each of them

    An ART1 microchip and its use in multi-ART1 systems

    Get PDF
    Recently, a real-time clustering microchip neural engine based on the ART1 architecture has been reported. Such chip is able to cluster 100-b patterns into up to 18 categories at a speed of 1.8 μs per pattern. However, that chip rendered an extremely high silicon area consumption of 1 cm2, and consequently an extremely low yield of 6%. Redundant circuit techniques can be introduced to improve yield performance at the cost of further increasing chip size. In this paper we present an improved ART1 chip prototype based on a different approach to implement the most area consuming circuit elements of the first prototype: an array of several thousand current sources which have to match within a precision of around 1%. Such achievement was possible after a careful transistor mismatch characterization of the fabrication process (ES2-1.0 μm CMOS). A new prototype chip has been fabricated which can cluster 50-b input patterns into up to ten categories. The chip has 15 times less area, shows a yield performance of 98%, and presents the same precision and speed than the previous prototype. Due to its higher robustness multichip systems are easily assembled. As a demonstration we show results of a two-chip ART1 system, and of an ARTMAP system made of two ART1 chips and an extra interfacing chip

    Analog VLSI-Based Modeling of the Primate Oculomotor System

    Get PDF
    One way to understand a neurobiological system is by building a simulacrum that replicates its behavior in real time using similar constraints. Analog very large-scale integrated (VLSI) electronic circuit technology provides such an enabling technology. We here describe a neuromorphic system that is part of a long-term effort to understand the primate oculomotor system. It requires both fast sensory processing and fast motor control to interact with the world. A one-dimensional hardware model of the primate eye has been built that simulates the physical dynamics of the biological system. It is driven by two different analog VLSI chips, one mimicking cortical visual processing for target selection and tracking and another modeling brain stem circuits that drive the eye muscles. Our oculomotor plant demonstrates both smooth pursuit movements, driven by a retinal velocity error signal, and saccadic eye movements, controlled by retinal position error, and can reproduce several behavioral, stimulation, lesion, and adaptation experiments performed on primates

    Accurate a priori signal integrity estimation using a multilevel dynamic interconnect model for deep submicron VLSI design.

    Get PDF
    A multilevel dynamic interconnect model was derived for accurate a priori signal integrity estimates. Cross-talk and delay estimations over interconnects in deep submicron technology were analyzed systematically using this model. Good accuracy and excellent time-efficiency were found compared with electromagnetic simulations. We aim to build a dynamic interconnect library with this model to facilitate the interconnect issues for future VLSI design

    Memristor MOS Content Addressable Memory (MCAM): Hybrid Architecture for Future High Performance Search Engines

    Full text link
    Large-capacity Content Addressable Memory (CAM) is a key element in a wide variety of applications. The inevitable complexities of scaling MOS transistors introduce a major challenge in the realization of such systems. Convergence of disparate technologies, which are compatible with CMOS processing, may allow extension of Moore's Law for a few more years. This paper provides a new approach towards the design and modeling of Memristor (Memory resistor) based Content Addressable Memory (MCAM) using a combination of memristor MOS devices to form the core of a memory/compare logic cell that forms the building block of the CAM architecture. The non-volatile characteristic and the nanoscale geometry together with compatibility of the memristor with CMOS processing technology increases the packing density, provides for new approaches towards power management through disabling CAM blocks without loss of stored data, reduces power dissipation, and has scope for speed improvement as the technology matures.Comment: 10 pages, 11 figure

    Hardware-Amenable Structural Learning for Spike-based Pattern Classification using a Simple Model of Active Dendrites

    Full text link
    This paper presents a spike-based model which employs neurons with functionally distinct dendritic compartments for classifying high dimensional binary patterns. The synaptic inputs arriving on each dendritic subunit are nonlinearly processed before being linearly integrated at the soma, giving the neuron a capacity to perform a large number of input-output mappings. The model utilizes sparse synaptic connectivity; where each synapse takes a binary value. The optimal connection pattern of a neuron is learned by using a simple hardware-friendly, margin enhancing learning algorithm inspired by the mechanism of structural plasticity in biological neurons. The learning algorithm groups correlated synaptic inputs on the same dendritic branch. Since the learning results in modified connection patterns, it can be incorporated into current event-based neuromorphic systems with little overhead. This work also presents a branch-specific spike-based version of this structural plasticity rule. The proposed model is evaluated on benchmark binary classification problems and its performance is compared against that achieved using Support Vector Machine (SVM) and Extreme Learning Machine (ELM) techniques. Our proposed method attains comparable performance while utilizing 10 to 50% less computational resources than the other reported techniques.Comment: Accepted for publication in Neural Computatio

    Photonics design tool for advanced CMOS nodes

    Full text link
    Recently, the authors have demonstrated large-scale integrated systems with several million transistors and hundreds of photonic elements. Yielding such large-scale integrated systems requires a design-for-manufacture rigour that is embodied in the 10 000 to 50 000 design rules that these designs must comply within advanced complementary metal-oxide semiconductor manufacturing. Here, the authors present a photonic design automation tool which allows automatic generation of layouts without design-rule violations. This tool is written in SKILL, the native language of the mainstream electric design automation software, Cadence. This allows seamless integration of photonic and electronic design in a single environment. The tool leverages intuitive photonic layer definitions, allowing the designer to focus on the physical properties rather than on technology-dependent details. For the first time the authors present an algorithm for removal of design-rule violations from photonic layouts based on Manhattan discretisation, Boolean and sizing operations. This algorithm is not limited to the implementation in SKILL, and can in principle be implemented in any scripting language. Connectivity is achieved with software-defined waveguide ports and low-level procedures that enable auto-routing of waveguide connections.Comment: 5 pages, 10 figure

    A 64-point Fourier transform chip for high-speed wireless LAN application using OFDM

    No full text
    In this article, we present a novel fixed-point 16-bit word-width 64-point FFT/IFFT processor developed primarily for the application in the OFDM based IEEE 802.11a Wireless LAN (WLAN) baseband processor. The 64-point FFT is realized by decomposing it into a 2-D structure of 8-point FFTs. This approach reduces the number of required complex multiplications compared to the conventional radix-2 64-point FFT algorithm. The complex multiplication operations are realized using shift-and-add operations. Thus, the processor does not use any 2-input digital multiplier. It also does not need any RAM or ROM for internal storage of coefficients. The proposed 64-point FFT/IFFT processor has been fabricated and tested successfully using our in-house 0.25 ?m BiCMOS technology. The core area of this chip is 6.8 mm2. The average dynamic power consumption is 41 mW @ 20 MHz operating frequency and 1.8 V supply voltage. The processor completes one parallel-to-parallel (i. e., when all input data are available in parallel and all output data are generated in parallel) 64-point FFT computation in 23 cycles. These features show that though it has been developed primarily for application in the IEEE 802.11a standard, it can be used for any application that requires fast operation as well as low power consumption
    corecore