404 research outputs found
Low power signal processing research at Stanford
This paper gives an overview of the research being conducted at Stanford University's Space, Telecommunications, and Radioscience Laboratory in the area of low energy computation. It discusses the work we are doing in large scale digital VLSI neural networks, interleaved processor and pipelined memory architectures, energy estimation and optimization, multichip module packaging, and low voltage digital logic.
Pulse propagation, graph cover, and packet forwarding
We study distributed systems, with a particular focus on graph problems and fault tolerance. Fault tolerance in a microprocessor or even a System-on-Chip (SoC) can be improved by using a fault-tolerant pulse propagation design. The existing design TRIX achieves this goal by being a distributed system consisting of very simple nodes. We show that even in the typical mode of operation without faults, TRIX performs significantly better than a regular wire or clock tree: statistical evaluation of our simulated experiments shows that we achieve a skew with a standard deviation of O(log log H), where H is the height of the TRIX grid.
The distance-r generalization of classic graph problems can give us insights into how distance affects the hardness of a problem. For the distance-r dominating set problem, we present both an algorithmic upper bound and an unconditional lower bound for any graph class with certain high-girth and sparseness criteria. In particular, our algorithm achieves an O(r·f(r))-approximation in time O(r), where f is the expansion function, which correlates with density. For constant r, this implies a constant approximation factor in constant time. We also show that no algorithm can achieve a (2r + 1 − δ)-approximation for any δ > 0 in time O(r), not even on the class of cycles of girth at least 5r. Furthermore, we extend the algorithm to related graph cover problems and even to a different execution model.
Finally, we investigate the problem of packet forwarding, which addresses the question of how and when best to forward packets in a distributed system. These packets are injected by an adversary. We build on the existing algorithm OED to handle more than a single destination. In particular, we show that buffers of size O(log n) are sufficient for this algorithm, in contrast to O(n) for the naive approach.
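As a small illustration of the distance-r dominating set problem mentioned in this abstract, the following Python sketch checks whether a vertex set dominates a graph within distance r, using a cycle as the example graph. It is not the thesis's algorithm; the helper names (`cycle_graph`, `within_distance`, `is_distance_r_dominating`) and the example values n = 30 and r = 2 are assumptions chosen for illustration. On a cycle, one vertex covers 2r + 1 vertices, so picking every (2r + 1)-th vertex gives a dominating set of size ⌈n / (2r + 1)⌉.

```python
from collections import deque
from math import ceil


def cycle_graph(n):
    """Adjacency lists of a cycle on vertices 0..n-1 (hypothetical helper)."""
    return {v: [(v - 1) % n, (v + 1) % n] for v in range(n)}


def within_distance(adj, source, r):
    """Return all vertices at hop distance at most r from source (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == r:
            continue  # do not expand beyond radius r
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return set(dist)


def is_distance_r_dominating(adj, candidates, r):
    """True if every vertex is within distance r of some candidate."""
    covered = set()
    for c in candidates:
        covered |= within_distance(adj, c, r)
    return covered == set(adj)


# Example values (assumptions): a 30-cycle has girth 30, which is >= 5r for r = 2.
n, r = 30, 2
adj = cycle_graph(n)
spaced = list(range(0, n, 2 * r + 1))  # every (2r+1)-th vertex

print(is_distance_r_dominating(adj, spaced, r))
print(len(spaced), ceil(n / (2 * r + 1)))
```

Dropping any one vertex from `spaced` leaves part of the cycle uncovered, which is why the count ⌈n / (2r + 1)⌉ cannot be improved on this example.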
Investigation of Molecular FCN for Beyond-CMOS: Technology, design, and modeling for nanocomputing
The abstract is in the attachment.
Surpassing Fundamental Limits through Time Varying Electromagnetics
Surpassing the fundamental limits that govern all electromagnetic structures, such as reciprocity and the delay-bandwidth-size limit, will have a transformative impact on all applications based on electromagnetic circuits and systems. For instance, violating principles of reciprocity enables non-reciprocal components such as isolators and circulators, which find application in full-duplex wireless radios, radar, biomedical imaging, and quantum computing systems. Overcoming the delay-bandwidth-size limit enables ultra-broadband yet extremely-compact devices whose size is not fundamentally related to the wavelength at the operating frequency. The focus of this dissertation is on using time-variance as a new toolbox to overcome these fundamental limits and re-imagine circuit and system design.
Traditional non-reciprocal components are realized using ferrite materials that lose their reciprocity under the application of an external magnetic bias. However, the sheer volume, cost, and weight of these magnet-based non-reciprocal components, coupled with their inability to be fabricated in conventional semiconductor processes, have limited their application to bulky and large-scale systems. Other approaches, such as active-biased and non-linearity-based non-reciprocity, are compatible with semiconductor processes, but they suffer from poor linearity and noise performance. In this dissertation, using a passive transistor switch as the modulating element, we have proposed the concept of spatio-temporal conductivity modulation and have demonstrated a gamut of non-reciprocal devices ranging from gyrators to isolators and circulators. Through novel circuit topologies, we have demonstrated, for the first time, on-chip circulators with multi-watt input power handling, operation at high millimeter-wave frequencies, and tailor-made circulators for emerging technologies such as simultaneous-transmit-and-receive MRI and quantum computing.
The delay-bandwidth-size trade-off is another fundamental electromagnetic limit, which constrains the delay imparted by a medium or a device within a fixed footprint to be inversely proportional to the signal bandwidth. It is this limit that makes the size of any microwave passive device inversely proportional to its operating frequency. As a part of this dissertation, through intelligent clocking of switched-capacitor networks, we overcame the delay-bandwidth-size limit, resulting in infinitesimal yet broadband microwave devices. Here we proposed a new paradigm in wave propagation in which properties such as the propagation delay and characteristic impedance do not depend on the constituent elements or materials of the medium, but instead rely heavily on the user-defined modulation scheme, thereby opening huge opportunities for realizing highly reconfigurable passives. Leveraging these concepts, we demonstrated a wide range of reciprocal and non-reciprocal devices, including ultra-compact delay elements, highly reconfigurable microwave passives, ultra-wideband circulators with infinitesimal form factors, and dispersion-free chip-scale Floquet topological insulators. Applications of these devices have also been evaluated in real-world systems through our demonstrations of wideband, full-duplex receivers leveraging switched-capacitor-based true-time-delay interference cancelers, and Floquet-topological-insulator-based antenna interfaces for full-duplex phased arrays and ultra-wideband beamformers.
Furthermore, to cater to the growing RF and microwave needs of future large-scale quantum computing systems, we demonstrated a low-cryogenic, wideband circulator based on time modulation of superconducting devices. This superconducting circulator is expected to operate alongside the superconducting qubits, inside a dilution refrigerator at 10 mK to 100 mK, thus enabling a tightly integrated quantum system. We also presented the design and implementation of a cryogenic-CMOS clock driver chip that will generate the clocks required by the superconducting circulator. Finally, we demonstrated the design and implementation of a low-noise, low-power, 6 GHz to 8 GHz cryogenic downconversion receiver at 4 K for cryogenic qubit readout.
Utilizing Magnetic Tunnel Junction Devices in Digital Systems
The research described in this dissertation is motivated by the desire to effectively utilize magnetic tunnel junctions (MTJs) in digital systems. We explore two aspects of this: (1) a read circuit useful for global clocking and magnetologic, and (2) hardware virtualization that utilizes the deeply-pipelined nature of magnetologic.
In the first aspect, a read circuit is used to sense the state of an MTJ (low or high resistance) and produce a logic output that represents this state. With global clocking, an external magnetic field combined with on-chip MTJs is used as an alternative mechanism for distributing the clock signal across the chip. With magnetologic, logic is evaluated with MTJs that must be sensed by a read circuit and used to drive downstream logic. For these two uses, we develop a resistance-to-voltage (R2V) read circuit to sense MTJ resistance and produce a logic voltage output. We design and fabricate a prototype test chip in a 3-metal, 2-poly 0.5 µm process for testing the R2V read circuit and experimentally validating its correctness. Using a clocked low/high resistor pair, we show that the read circuit can correctly detect the input resistance and produce the desired square wave output. The read circuit is measured to operate correctly at up to 48 MHz. The input node is relatively insensitive to node capacitance and can handle up to tens of pF of capacitance without changing the bandwidth of the circuit.
In the second aspect, hardware virtualization is a technique by which deeply-pipelined circuits that have feedback can be utilized. MTJs have the potential to act as state in a magnetologic circuit, which may result in a deep pipeline. Streams of computation are then context switched into the hardware logic, allowing them to share hardware resources and more fully utilize the pipeline stages of the logic. While applicable to magnetologic using MTJs, virtualization is also applicable to traditional logic technologies like CMOS. Our investigation targets MTJs, FPGAs, and ASICs. We develop M/D/1 and M/G/1 queueing models of the performance of virtualized hardware with secondary memory using a fixed, hierarchical, round-robin schedule; the models predict average throughput, latency, and queue occupancy in the system. We develop three C-slow applications and calibrate them to a clock and resource model for FPGA and ASIC technologies. Last, using the M/G/1 model, we predict throughput, latency, and resource usage for MTJ, FPGA, and ASIC technologies. We show three design scenarios illustrating ways in which to use the model.
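For readers unfamiliar with the M/D/1 and M/G/1 models named in this abstract, the sketch below evaluates the textbook Pollaczek-Khinchine mean-value formulas in Python. It is not the dissertation's model, which also accounts for secondary memory and a hierarchical round-robin schedule; the function names and the example rates are assumptions for illustration only. With arrival rate λ, mean service time E[S], and second moment E[S²], the utilization is ρ = λE[S], the mean wait in queue is λE[S²] / (2(1 − ρ)), and M/D/1 is the special case E[S²] = E[S]².

```python
def mg1_metrics(arrival_rate, mean_service, second_moment_service):
    """Mean-value metrics of an M/G/1 queue (Pollaczek-Khinchine formula)."""
    rho = arrival_rate * mean_service  # server utilization
    if not 0 <= rho < 1:
        raise ValueError("queue is stable only when utilization is below 1")
    wait_in_queue = arrival_rate * second_moment_service / (2 * (1 - rho))
    return {
        "utilization": rho,
        "wait_in_queue": wait_in_queue,
        "latency": wait_in_queue + mean_service,       # queueing plus service
        "queue_length": arrival_rate * wait_in_queue,  # Little's law
    }


def md1_metrics(arrival_rate, service_time):
    """M/D/1 special case: deterministic service, so E[S^2] = E[S]^2."""
    return mg1_metrics(arrival_rate, service_time, service_time ** 2)


# Example values (assumptions): 0.5 jobs per time unit, 1 time unit of service.
deterministic = md1_metrics(0.5, 1.0)
exponential = mg1_metrics(0.5, 1.0, 2.0)  # exponential service: E[S^2] = 2 E[S]^2
print(deterministic)
print(exponential)
```

At the same utilization, the deterministic-service queue waits half as long as the exponential-service one, which shows how service-time variability enters the predicted latency.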
Center for Space Microelectronics Technology
The 1990 technical report of the Jet Propulsion Laboratory Center for Space Microelectronics Technology summarizes the technical accomplishments, publications, presentations, and patents of the center during 1990. The report lists 130 publications, 226 presentations, and 87 new technology reports and patents.
The connection machine
Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1988. Bibliography: leaves 134-157. By William Daniel Hillis.