404 research outputs found

    Low power signal processing research at Stanford

    This paper gives an overview of the research being conducted at Stanford University's Space, Telecommunications, and Radioscience Laboratory in the area of low-energy computation. It discusses the work we are doing in large-scale digital VLSI neural networks, interleaved processor and pipelined memory architectures, energy estimation and optimization, multichip module packaging, and low-voltage digital logic.

    Pulse propagation, graph cover, and packet forwarding

    We study distributed systems, with a particular focus on graph problems and fault tolerance. Fault tolerance in a microprocessor or even a System-on-Chip can be improved by using a fault-tolerant pulse propagation design. The existing design TRIX achieves this goal by being a distributed system consisting of very simple nodes. We show that even in the typical mode of operation without faults, TRIX performs significantly better than a regular wire or clock tree: statistical evaluation of our simulated experiments shows that we achieve a skew with a standard deviation of O(log log H), where H is the height of the TRIX grid. The distance-r generalization of classic graph problems can give us insights into how distance affects the hardness of a problem. For the distance-r dominating set problem, we present both an algorithmic upper bound and an unconditional lower bound for any graph class satisfying certain high-girth and sparseness criteria. In particular, our algorithm achieves an O(r·f(r))-approximation in time O(r), where f is the expansion function, which correlates with density. For constant r, this implies a constant approximation factor in constant time. We also show that no algorithm can achieve a (2r + 1 − δ)-approximation for any δ > 0 in time O(r), not even on the class of cycles of girth at least 5r. We extend the algorithm to related graph cover problems and even to a different execution model. Finally, we investigate the problem of packet forwarding, which addresses the question of how and when best to forward packets in a distributed system. These packets are injected by an adversary. We build on the existing algorithm OED to handle more than a single destination. In particular, we show that buffers of size O(log n) are sufficient for this algorithm, in contrast to O(n) for the naive approach.
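To make the distance-r dominating set notion in this abstract concrete: a vertex set D is distance-r dominating if every vertex of the graph lies within r hops of some member of D. The sketch below only checks that property with a depth-limited multi-source breadth-first search; it is not the approximation algorithm from the thesis. The function names and the adjacency-dictionary representation are assumptions made for illustration.

```python
from collections import deque


def is_distance_r_dominating(adj, dominators, r):
    """Return True if every vertex in adj is within r hops of a dominator.

    adj: dict mapping each vertex to a list of its neighbours (assumed format).
    dominators: iterable of vertices that must all be keys of adj.
    """
    dist = {v: 0 for v in dominators}
    queue = deque(dist)
    while queue:
        u = queue.popleft()
        if dist[u] == r:
            continue  # do not explore beyond distance r
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    # Every vertex reached by the search is dominated; compare with all vertices.
    return len(dist) == len(adj)


def cycle(n):
    """Hypothetical helper: adjacency dict of the cycle on n vertices."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


if __name__ == "__main__":
    ring = cycle(10)
    print(is_distance_r_dominating(ring, [0, 5], 2))
    print(is_distance_r_dominating(ring, [0], 2))
```

On a cycle, a single vertex covers itself plus r vertices on each side, that is 2r + 1 vertices in total, which is why two dominators are enough for the 10-vertex cycle with r = 2 and one is not.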

    Investigation of Molecular FCN for Beyond-CMOS: Technology, design, and modeling for nanocomputing

    The abstract is in the attachment.

    Utilizing Magnetic Tunnel Junction Devices in Digital Systems

    The research described in this dissertation is motivated by the desire to effectively utilize magnetic tunnel junctions (MTJs) in digital systems. We explore two aspects of this: (1) a read circuit useful for global clocking and magnetologic, and (2) hardware virtualization that utilizes the deeply pipelined nature of magnetologic. In the first aspect, a read circuit is used to sense the state of an MTJ (low or high resistance) and produce a logic output that represents this state. With global clocking, an external magnetic field combined with on-chip MTJs is used as an alternative mechanism for distributing the clock signal across the chip. With magnetologic, logic is evaluated with MTJs that must be sensed by a read circuit and used to drive downstream logic. For these two uses, we develop a resistance-to-voltage (R2V) read circuit to sense MTJ resistance and produce a logic voltage output. We design and fabricate a prototype test chip in a 3-metal, 2-poly 0.5 µm process for testing the R2V read circuit and experimentally validating its correctness. Using a clocked low/high resistor pair, we show that the read circuit can correctly detect the input resistance and produce the desired square wave output. The read circuit is measured to operate correctly at up to 48 MHz. The input node is relatively insensitive to node capacitance and can handle up to tens of pF of capacitance without changing the bandwidth of the circuit. In the second aspect, hardware virtualization is a technique by which deeply pipelined circuits that have feedback can be utilized. MTJs have the potential to act as state in a magnetologic circuit, which may result in a deep pipeline. Streams of computation are then context switched into the hardware logic, allowing them to share hardware resources and more fully utilize the pipeline stages of the logic. While applicable to magnetologic using MTJs, virtualization is also applicable to traditional logic technologies like CMOS. Our investigation targets MTJs, FPGAs, and ASICs. We develop M/D/1 and M/G/1 queueing models of the performance of virtualized hardware with secondary memory using a fixed, hierarchical, round-robin schedule; the models predict average throughput, latency, and queue occupancy in the system. We develop three C-slow applications and calibrate them to a clock and resource model for FPGA and ASIC technologies. Last, using the M/G/1 model, we predict throughput, latency, and resource usage for MTJ, FPGA, and ASIC technologies. We show three design scenarios illustrating ways in which to use the model.
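As background for the M/D/1 and M/G/1 models named in this abstract, the sketch below evaluates the textbook Pollaczek-Khinchine mean-value formulas for a single-server queue with Poisson arrivals. It is not the dissertation's model, which additionally accounts for secondary memory and a hierarchical round-robin schedule. The function names and the example rates are assumptions chosen for illustration.

```python
def mg1_metrics(arrival_rate, mean_service, second_moment_service):
    """Mean-value metrics of an M/G/1 queue (Pollaczek-Khinchine formula).

    arrival_rate: Poisson arrival rate (lambda).
    mean_service: E[S], the mean service time.
    second_moment_service: E[S^2], the second moment of the service time.
    """
    rho = arrival_rate * mean_service  # server utilization
    if not 0 <= rho < 1:
        raise ValueError("queue is unstable unless utilization is below 1")
    mean_wait = arrival_rate * second_moment_service / (2 * (1 - rho))
    return {
        "utilization": rho,
        "mean_wait": mean_wait,                        # time spent queued
        "mean_latency": mean_wait + mean_service,      # queued + in service
        "mean_queue_length": arrival_rate * mean_wait, # Little's law
    }


def md1_metrics(arrival_rate, service_time):
    """M/D/1 special case: deterministic service, so E[S^2] = E[S]^2."""
    return mg1_metrics(arrival_rate, service_time, service_time ** 2)


if __name__ == "__main__":
    # Illustrative numbers only: one job every 2 time units, 1 unit of service.
    print(md1_metrics(arrival_rate=0.5, service_time=1.0))
    # Exponential service with mean 1 has E[S^2] = 2 (the M/M/1 case).
    print(mg1_metrics(arrival_rate=0.5, mean_service=1.0,
                      second_moment_service=2.0))
```

At the same utilization, the deterministic-service queue waits half as long on average as the exponential-service one, which shows how service-time variability enters the M/G/1 prediction.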

    Center for Space Microelectronics Technology

    The 1990 technical report of the Jet Propulsion Laboratory Center for Space Microelectronics Technology summarizes the technical accomplishments, publications, presentations, and patents of the center during 1990. The report lists 130 publications, 226 presentations, and 87 new technology reports and patents.

    The connection machine

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1988. Bibliography: leaves 134-157. By William Daniel Hillis. Ph.D.