1,446 research outputs found

Array-based architecture for FET-based, nanoscale electronics

Author: DeHon André
Publication date: 01/03/2003

Advances in our basic scientific understanding at the molecular and atomic level place us on the verge of engineering designer structures with key features at the single nanometer scale. This offers us the opportunity to design computing systems at what may be the ultimate limits on device size. At this scale, we are faced with new challenges and a new cost structure which motivates different computing architectures than we found efficient and appropriate in conventional very large scale integration (VLSI). We sketch a basic architecture for nanoscale electronics based on carbon nanotubes, silicon nanowires, and nano-scale FETs. This architecture can provide universal logic functionality with all logic and signal restoration operating at the nanoscale. The key properties of this architecture are its minimalism, defect tolerance, and compatibility with emerging bottom-up nanoscale fabrication techniques. The architecture further supports micro-to-nanoscale interfacing for communication with conventional integrated circuits and bootstrap loading

Caltech Authors

Unifying mesh- and tree-based programmable interconnect

Author: DeHon André
Publication date: 01/10/2004

We examine the traditional, symmetric, Manhattan mesh design for field-programmable gate-array (FPGA) routing along with tree-of-meshes (ToM) and mesh-of-trees (MoT) based designs. All three networks can provide general routing for limited bisection designs (Rent's rule with p<1) and allow locality exploitation. They differ in their detailed topology and use of hierarchy. We show that all three have the same asymptotic wiring requirements. We bound this tightly by providing constructive mappings between routes in one network and routes in another. For example, we show that a (c,p) MoT design can be mapped to a (2c,p) linear population ToM and introduce a corner turn scheme which will make it possible to perform the reverse mapping from any (c,p) linear population ToM to a (2c,p) MoT augmented with a particular set of corner turn switches. One consequence of this latter mapping is a multilayer layout strategy for N-node, linear population ToM designs that requires only Θ(N) two-dimensional area for any p when given sufficient wiring layers. We further show upper and lower bounds for global mesh routes based on recursive bisection width and show these are within a constant factor of each other and within a constant factor of MoT and ToM layout area. In the process we identify the parameters and characteristics which make the networks different, making it clear there is a unified design continuum in which these networks are simply particular regions.
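
As a rough illustration of the (c,p) parameters in this abstract, the standard form of Rent's rule estimates the number of signals crossing the boundary of an N-node region as c·N^p. The sketch below is not taken from the paper: the function name `rent_io` and the sample values of c and p are assumptions chosen only to show how p<1 limits bisection growth, and how a (2c,p) design carries twice the wires of a (c,p) design at every size.

```python
def rent_io(n_nodes: int, c: float, p: float) -> float:
    """Rent's rule estimate of signals crossing the boundary of an n_nodes region."""
    if n_nodes < 1:
        raise ValueError("n_nodes must be at least 1")
    return c * n_nodes ** p


# Illustrative values only; the abstract does not give specific c or p.
c, p = 4.0, 0.5

for n in (1, 16, 256, 4096):
    base = rent_io(n, c, p)         # (c, p) design
    doubled = rent_io(n, 2 * c, p)  # (2c, p) design: same exponent, twice the wires
    print(f"N={n:5d}  (c,p) IO={base:8.1f}  (2c,p) IO={doubled:8.1f}")
```

With p<1 the boundary traffic grows more slowly than the node count, which is the locality that all three networks exploit.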

Caltech Authors

Deterministic Addressing of Nanoscale Devices Assembled at Sublithographic Pitches

Author: DeHon André
Publication date: 01/11/2005

Multiple techniques have now been proposed using random addressing to build demultiplexers which interface between the large pitch of lithographically patterned features and the smaller pitch of self-assembled sublithographic nanowires. At the same time, the relatively high defect rates expected for molecular-sized devices and wires dictate that we design architectures with spare components so we can map around defective elements. To accommodate and mask both of these effects, we introduce a programmable addressing scheme which can be used to provide deterministic addresses for decoders built with random nanoscale addressing and potentially defective wires. We describe how this programmable addressing scheme can be implemented with emerging, nanoscale building blocks and show how to build deterministically addressable memory banks. We characterize the area required for this programmable addressing scheme. For 2048 x 2048 memory banks, the area overhead for address correction is less than 33%, delivering net memory densities around 10^11 b/cm^2

Caltech Authors

Fault-tolerant sub-lithographic design with rollback recovery

Authors: DeHon André, Naeimi Helia
Publication venue: AIP Publishing
Publication date: 19/03/2008

Shrinking feature sizes and energy levels coupled with high clock rates and decreasing node capacitance lead us into a regime where transient errors in logic cannot be ignored. Consequently, several recent studies have focused on feed-forward spatial redundancy techniques to combat these high transient fault rates. To complement these studies, we analyze fine-grained rollback techniques and show that they can offer lower spatial redundancy factors with no significant impact on system performance for fault rates up to one fault per device per ten million cycles of operation (Pf = 10^-7) in systems with 10^12 susceptible devices. Further, we concretely demonstrate these claims on nanowire-based programmable logic arrays. Despite expensive rollback buffers and general-purpose, conservative analysis, we show the area overhead factor of our technique is roughly an order of magnitude lower than a gate level feed-forward redundancy scheme
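
To give a sense of scale for the fault rate quoted in this abstract, the following sketch works through the arithmetic: with 10^12 susceptible devices and Pf = 10^-7 per device per cycle, the expected number of transient faults is about 10^5 per cycle. The function names, the 1000-device block size, and the assumption that faults are independent across devices are illustrative choices, not details from the paper.

```python
import math


def expected_faults_per_cycle(devices: int, p_fault: float) -> float:
    """Expected number of faulty devices in one cycle (linearity of expectation)."""
    return devices * p_fault


def prob_block_fault_free(block_devices: int, p_fault: float) -> float:
    """Probability that a block sees no fault in one cycle.

    Assumes faults are independent across devices (an assumption of this sketch).
    log1p keeps precision when p_fault is tiny.
    """
    return math.exp(block_devices * math.log1p(-p_fault))


system_devices = 10**12  # from the abstract
p_fault = 1e-7           # from the abstract: one fault per device per ten million cycles

print("expected faults per cycle:", expected_faults_per_cycle(system_devices, p_fault))

block = 1000  # hypothetical rollback-block size, for illustration only
print("P(block fault-free in a cycle):", prob_block_fault_free(block, p_fault))
```

The whole system almost never sees a fault-free cycle at this rate, while a small block is fault-free in the vast majority of cycles, which is the kind of gap that makes fine-grained recovery worth analyzing.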

Caltech Authors

ScholarlyCommons@Penn

Fault Secure Encoder and Decoder for NanoMemory Applications

Authors: DeHon André, Naeimi Helia
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/04/2009

Memory cells have been protected from soft errors for more than a decade; due to the increase in soft error rate in logic circuits, the encoder and decoder circuitry around the memory blocks have become susceptible to soft errors as well and must also be protected. We introduce a new approach to design fault-secure encoder and decoder circuitry for memory designs. The key novel contribution of this paper is identifying and defining a new class of error-correcting codes whose redundancy makes the design of fault-secure detectors (FSD) particularly simple. We further quantify the importance of protecting encoder and decoder circuitry against transient errors, illustrating a scenario where the system failure rate (FIT) is dominated by the failure rate of the encoder and decoder. We prove that Euclidean geometry low-density parity-check (EG-LDPC) codes have the fault-secure detector capability. Using some of the smaller EG-LDPC codes, we can tolerate bit or nanowire defect rates of 10% and fault rates of 10^(-18) upsets/device/cycle, achieving a FIT rate at or below one for the entire memory system and a memory density of 10^(11) bit/cm^2 with nanowire pitch of 10 nm for memory blocks of 10 Mb or larger. Larger EG-LDPC codes can achieve even higher reliability and lower area overhead

Caltech Authors

Performance comparison of single-precision SPICE Model-Evaluation on FPGA, GPU, Cell, and multi-core processors

Authors: DeHon André, Kapre Nachiket
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2009

Automated code generation and performance tuning techniques for concurrent architectures such as GPUs, Cell and FPGAs can provide integer factor speedups over multi-core processor organizations for data-parallel, floating-point computation in SPICE model-evaluation. Our Verilog AMS compiler produces code for parallel evaluation of non-linear circuit models suitable for use in SPICE simulations where the same model is evaluated several times for all the devices in the circuit. Our compiler uses architecture specific parallelization strategies (OpenMP for multi-core, PThreads for Cell, CUDA for GPU, statically scheduled VLIW for FPGA) when producing code for these different architectures. We automatically explore different implementation configurations (e.g. unroll factor, vector length) using our performance-tuner to identify the best possible configuration for each architecture. We demonstrate speedups of 3-182× for a Xilinx Virtex5 LX 330T, 1.3-33× for an IBM Cell, and 3-131× for an NVIDIA 9600 GT GPU over a 3 GHz Intel Xeon 5160 implementation for a variety of single-precision device models.

Crossref

Caltech Authors

DR-NTU (Digital Repository of NTU)

Flood routing of the Maja outflow across Xanthe Terra

Author: Dehon R. A.

The object is to trace a single flood crest through the Maja outflow system and to evaluate the effects of topography on ponding and multiple channel routing. Maja Valles provides a good model because it has a single source and a well defined channel system. The 1500 km long Maja Valles originates in Juventae Chasma. The outflow system stretches 1100 km northward along the Lunae Planum/Xanthe Terra boundary, then eastward across the Xanthe Terra highlands. It descends to Chryse Planitia where it extends northeastward toward the middle of the basin. It is concluded that flood routing through multiple channels and retardation in local impoundments are responsible for breakup of the initial flood crest and the formation of multiple flood crests. Recombined flow near the mouths of these canyons results in an extended flow regime and multiple flood surges. As a result of ponding along the flood course, depositional sites are localized and renewed erosion downstream (from ponded sites) results in sediment source areas not greatly removed from depositional sites

NASA Technical Reports Server

Optimistic Parallelization of Floating-Point Accumulation

Authors: DeHon André, Kapre Nachiket
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2007

Floating-point arithmetic is notoriously non-associative due to the limited precision representation which demands intermediate values be rounded to fit in the available precision. The resulting cyclic dependency in floating-point accumulation inhibits parallelization of the computation, including efficient use of pipelining. In practice, however, we observe that floating-point operations are "mostly" associative. This observation can be exploited to parallelize floating-point accumulation using a form of optimistic concurrency. In this scheme, we first compute an optimistic associative approximation to the sum and then relax the computation by iteratively propagating errors until the correct sum is obtained. We map this computation to a network of 16 statically-scheduled, pipelined, double-precision floating-point adders on the Virtex-4 LX160 (-12) device where each floating-point adder runs at 296 MHz and has a pipeline depth of 10. On this 16 PE design, we demonstrate an average speedup of 6× with randomly generated data and 3-7× with summations extracted from Conjugate Gradient benchmarks
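
The non-associativity described in this abstract is easy to observe directly. The sketch below is not the authors' optimistic scheme; it only compares a sequential accumulation with a pairwise (tree-ordered) reduction, the association order a parallel adder tree would naturally use, and measures both against `math.fsum` as an accurately rounded reference. The helper names, the random data, and the seed are assumptions made for illustration.

```python
import math
import random


def sequential_sum(values):
    """Left-to-right accumulation: each add depends on the previous one."""
    total = 0.0
    for v in values:
        total += v
    return total


def tree_sum(values):
    """Pairwise reduction: adds within a level are independent of each other."""
    level = list(values)
    if not level:
        return 0.0
    while len(level) > 1:
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:  # carry an unpaired last element to the next level
            nxt.append(level[-1])
        level = nxt
    return level[0]


# Smallest demonstration: regrouping three values changes the rounded result.
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False

rng = random.Random(0)  # fixed seed so the run is repeatable
data = [rng.uniform(-1e6, 1e6) for _ in range(1 << 12)]

seq = sequential_sum(data)
tree = tree_sum(data)
ref = math.fsum(data)  # accurately rounded reference sum

print("sequential vs tree difference:", abs(seq - tree))
print("sequential error:", abs(seq - ref))
print("tree error:", abs(tree - ref))
```

The two orderings typically agree to many digits but not bit for bit, which matches the abstract's observation that floating-point addition is "mostly" associative.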

CiteSeerX

Crossref

Caltech Authors

ScholarlyCommons@Penn

Seven strategies for tolerating highly defective fabrication

Authors: DeHon André, Naeimi Helia
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/07/2005

In this article we present an architecture that supports fine-grained sparing and resource matching. The base logic structure is a set of interconnected PLAs. The PLAs and their interconnections consist of large arrays of interchangeable nanowires, which serve as programmable product and sum terms and as programmable interconnect links. Each nanowire can have several defective programmable junctions. We can test nanowires for functionality and use only the subset that provides appropriate conductivity and electrical characteristics. We then perform a matching between nanowire junction programmability and application logic needs to use almost all the nanowires even though most of them have defective junctions. We employ seven high-level strategies to achieve this level of defect tolerance

Caltech Authors

Photogeologic mapping of Meridiani Sinus region from Mariner 6 and 7 imagery

Author: Dehon R. A.

Photogeologic mapping of Mariner photographs characterizes major stratigraphic units of the Martian equatorial region. The low resolution photomosaic strip across Meridiani Sinus Deucalionus Regio is divided into regional map units based on albedo and crater density. High resolution frames reveal map units defined by varying surface texture, crater densities, and degree of crater sharpness. Mariner photographs provide clear evidence of eolian action and channelization by fluid flow.

NASA Technical Reports Server