97 research outputs found

    Limits on Fundamental Limits to Computation

    Full text link
    An indispensable part of our lives, computing has also become essential to industries and governments. Steady improvements in computer hardware have been supported by periodic doubling of transistor densities in integrated circuits over the last fifty years. Such Moore scaling now requires increasingly heroic efforts, stimulating research in alternative hardware and stirring controversy. To help evaluate emerging technologies and enrich our understanding of integrated-circuit scaling, we review fundamental limits to computation: in manufacturing, energy, physical space, design and verification effort, and algorithms. To outline what is achievable in principle and in practice, we recall how some limits were circumvented, compare loose and tight limits. We also point out that engineering difficulties encountered by emerging technologies may indicate yet-unknown limits.Comment: 15 pages, 4 figures, 1 tabl

    Using Physical Compilation to Implement a System on Chip Platform

    Get PDF
    The goal of this thesis was to setup a complete design flow involving physical synthesis. The design chosen for this purpose was a system-on-chip (SoC) platform developed at the University of Tennessee. It involves a Leon Processor with a minimal cache configuration, an AMBA on-chip bus and an Advanced Encryption Standard module which performs decryption. As transistor size has entered the deep submicron level, iterations involved in the design cycle have increased due to the domination of interconnect delays over cell delays. Traditionally, interconnect delay has been estimated through the use of wire-load models. However, since there is no physical placement information, the delay estimation may be ineffective and result in increased iterations. Hence, placement-based synthesis has recently been introduced to provide better interconnect delay estimation. The tool used in this thesis to implement the system-on-chip design using physical synthesis is Synopsys Physical Compiler. The flow has been setup through the use of the Galaxy Reference Flow scripts obtained from Synopsys. As part of the thesis, an analysis of the differences between a physically synthesized design and a logically synthesized one in terms of area and delay is presented

    Study and analysis of innovative network protocols and architectures

    Get PDF
    In the last years, some new paradigms are emerging in the networking area as inspiring models for the definition of future communications networks. A key example is certainly the Content Centric Networking (CCN) protocol suite, namely a novel network architecture that aims to supersede the current TCP/IP stack in favor of a name based routing algorithm, also introducing in-network caching capabilities. On the other hand, much interest has been placed on Software Defined Networking (SDN), namely the set of protocols and architectures designed to make network devices more dynamic and programmable. Given this complex arena, the thesis focuses on the analysis of these innovative network protocols, with the aim of exploring possible design flaws and hence guaranteeing their proper operation when actually deployed in the network. Particular emphasis is given to the security of these protocols, for its essential role in every wide scale application. Some work has been done in this direction, but all these solutions are far to be considered fully investigated. In the CCN case, a closer investigation on problems related to possible DDoS attacks due to the stateful nature of the protocol, is presented along with a full-fledged proposal to support scalable PUSH application on top of CCN. Concerning SDN, instead, we present a tool for the verification of network policies in complex graphs containing dynamic network functions. In order to obtain significant results, we leverage different tools and methodologies: on the one hand, we assess simulation software as very useful tools for representing the most common use cases for the various technologies. On the other hand, we exploit more sophisticated formal methods to ensure a higher level of confidence for the obtained results

    Proceedings of the NSSDC Conference on Mass Storage Systems and Technologies for Space and Earth Science Applications

    Get PDF
    The proceedings of the National Space Science Data Center Conference on Mass Storage Systems and Technologies for Space and Earth Science Applications held July 23 through 25, 1991 at the NASA/Goddard Space Flight Center are presented. The program includes a keynote address, invited technical papers, and selected technical presentations to provide a broad forum for the discussion of a number of important issues in the field of mass storage systems. Topics include magnetic disk and tape technologies, optical disk and tape, software storage and file management systems, and experiences with the use of a large, distributed storage system. The technical presentations describe integrated mass storage systems that are expected to be available commercially. Also included is a series of presentations from Federal Government organizations and research institutions covering their mass storage requirements for the 1990's

    A configurable vector processor for accelerating speech coding algorithms

    Get PDF
    The growing demand for voice-over-packer (VoIP) services and multimedia-rich applications has made increasingly important the efficient, real-time implementation of low-bit rates speech coders on embedded VLSI platforms. Such speech coders are designed to substantially reduce the bandwidth requirements thus enabling dense multichannel gateways in small form factor. This however comes at a high computational cost which mandates the use of very high performance embedded processors. This thesis investigates the potential acceleration of two major ITU-T speech coding algorithms, namely G.729A and G.723.1, through their efficient implementation on a configurable extensible vector embedded CPU architecture. New scalar and vector ISAs were introduced which resulted in up to 80% reduction in the dynamic instruction count of both workloads. These instructions were subsequently encapsulated into a parametric, hybrid SISD (scalar processor)–SIMD (vector) processor. This work presents the research and implementation of the vector datapath of this vector coprocessor which is tightly-coupled to a Sparc-V8 compliant CPU, the optimization and simulation methodologies employed and the use of Electronic System Level (ESL) techniques to rapidly design SIMD datapaths

    Timing model derivation : pipeline analyzer generation from hardware description languages

    Get PDF
    Safety-critical systems are forced to finish their execution within strict deadlines so that worst-case execution time (WCET) guarantees are a crucial part of their verification. Timing models of the analyzed hardware form the basis for static analysis-based approaches like the aiT WCET analyzer. Currently, timing models are hand-crafted based on frequently incorrect documentation causing the process to be error-prone and time-consuming. This thesis bridges the gap between automatic hardware synthesis and WCET analysis development by introducing a process for the derivation of timing models from VHDL specifications. We propose a set of transformations and abstractions to reduce the hardware design\u27s complexity enabling the generation of efficient and provably correct WCET analyzers. They employ an abstract interpretation-based simulation of program executions based on a defined abstract simulation semantics. We have defined workflow patterns showing how to gradually apply the derivation process to VHDL models, thereby removing timing-irrelevant constructs. Interval property checking is used to validate the transformations. A further contribution of this thesis is the implementation of a tool set that realizes the introduced derivation process and shows its applicability to non-trivial industrial designs in experimental evaluations. Influences on design choices to the quality of the derived timing model are presented building an informal predictability notion for VHDL.Sicherheits-kritische Systeme unterliegen oft der Einhaltung strikter Laufzeitschranken, weshalb zur Verifikation sichere Obergrenzen der Laufzeit im schlimmsten Fall (WCET) bestimmt werden. Zeitmodelle der analysierten Hardware sind hierbei die Grundlage für auf statischen Analysen basierende Verfahren. Aktuell werden solche Modelle händisch aus Handbüchern extrahiert, ein sehr zeitaufwändiger und fehleranfälliger Prozess. Diese Arbeit schlägt eine Brücke zwischen automatischer Hardware-Synthese und der Entwicklung von WCET-Analysen durch die Einführung eines Ableitungsprozesses von Zeitmodellen aus VHDL-Spezifikationen. Transformationen und Abstraktionen werden zur Komplexitätsreduktion eingesetzt, um die Erzeugung von effizienten und beweisbar korrekten Analysatoren zu ermöglichen. Selbige bedienen sich abstrakter Interpretation von Programmausführungen basierend auf einer Simulations-Semantik. Definierte Arbeitsabläufe zeigen, wie man die Ableitung schrittweise auf VHDL-Modellen umsetzt und dadurch für das Zeitverhalten irrelevante Teile des Modells entfernt. Interval Property Checking gewährleistet hierbei, dass die Transformationen semantik-erhaltend sind. Eine Tool-Implementierung realisiert den vorgestellen Ableitungsprozess und unterstreicht seine Anwendbarkeit auf komplexe industrielle Designs durch experimentelle untersuchungen. Außerdem werden VHDL-Designentscheidungen hinsicht ihres Einflusses auf die Qualität des abgeleiteten Zeitmodells betrachtet

    Improving the Scalability of High Performance Computer Systems

    Full text link
    Improving the performance of future computing systems will be based upon the ability of increasing the scalability of current technology. New paths need to be explored, as operating principles that were applied up to now are becoming irrelevant for upcoming computer architectures. It appears that scaling the number of cores, processors and nodes within an system represents the only feasible alternative to achieve Exascale performance. To accomplish this goal, we propose three novel techniques addressing different layers of computer systems. The Tightly Coupled Cluster technique significantly improves the communication for inter node communication within compute clusters. By improving the latency by an order of magnitude over existing solutions the cost of communication is considerably reduced. This enables to exploit fine grain parallelism within applications, thereby, extending the scalability considerably. The mechanism virtually moves the network interconnect into the processor, bypassing the latency of the I/O interface and rendering protocol conversions unnecessary. The technique is implemented entirely through firmware and kernel layer software utilizing off-the-shelf AMD processors. We present a proof-of-concept implementation and real world benchmarks to demonstrate the superior performance of our technique. In particular, our approach achieves a software-to-software communication latency of 240 ns between two remote compute nodes. The second part of the dissertation introduces a new framework for scalable Networks-on-Chip. A novel rapid prototyping methodology is proposed, that accelerates the design and implementation substantially. Due to its flexibility and modularity a large application space is covered ranging from Systems-on-chip, to high performance many-core processors. The Network-on-Chip compiler enables to generate complex networks in the form of synthesizable register transfer level code from an abstract design description. Our engine supports different target technologies including Field Programmable Gate Arrays and Application Specific Integrated Circuits. The framework enables to build large designs while minimizing development and verification efforts. Many topologies and routing algorithms are supported by partitioning the tasks into several layers and by the introduction of a protocol agnostic architecture. We provide a thorough evaluation of the design that shows excellent results regarding performance and scalability. The third part of the dissertation addresses the Processor-Memory Interface within computer architectures. The increasing compute power of many-core processors, leads to an equally growing demand for more memory bandwidth and capacity. Current processor designs exhibit physical limitations that restrict the scalability of main memory. To address this issue we propose a memory extension technique that attaches large amounts of DRAM memory to the processor via a low pin count interface using high speed serial transceivers. Our technique transparently integrates the extension memory into the system architecture by providing full cache coherency. Therefore, applications can utilize the memory extension by applying regular shared memory programming techniques. By supporting daisy chained memory extension devices and by introducing the asymmetric probing approach, the proposed mechanism ensures high scalability. We furthermore propose a DMA offloading technique to improve the performance of the processor memory interface. The design has been implemented in a Field Programmable Gate Array based prototype. Driver software and firmware modifications have been developed to bring up the prototype in a Linux based system. We show microbenchmarks that prove the feasibility of our design

    Many-core architectures with time predictable execution Support for hard real-time applications

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2013.Cataloged from PDF version of thesis.Includes bibliographical references (p. 183-193).Hybrid control systems are a growing domain of application. They are pervasive and their complexity is increasing rapidly. Distributed control systems for future "Intelligent Grid" and renewable energy generation systems are demanding high-performance, hard real-time computation, and more programmability. General-purpose computer systems are primarily designed to process data and not to interact with physical processes as required by these systems. Generic general-purpose architectures even with the use of real-time operating systems fail to meet the hard realtime constraints of hybrid system dynamics. ASIC, FPGA, or traditional embedded design approaches to these systems often result in expensive, complicated systems that are hard to program, reuse, or maintain. In this thesis, we propose a domain-specific architecture template targeting hybrid control system applications. Using power electronics control applications, we present new modeling techniques, synthesis methodologies, and a parameterizable computer architecture for these large distributed control systems. We propose a new system modeling approach, called Adaptive Hybrid Automaton, based on previous work in control system theory, that uses a mixed-model abstractions and lends itself well to digital processing. We develop a domain-specific architecture based on this modeling that uses heterogeneous processing units and predictable execution, called MARTHA. We develop a hard real-time aware router architecture to enable deterministic on-chip interconnect network communication. We present several algorithms for scheduling task-based applications onto these types of heterogeneous architectures. We create Heracles, an open-source, functional, parameterized, synthesizable many-core system design toolkit, that can be used to explore future multi/many-core processors with different topologies, routing schemes, processing elements or cores, and memory system organizations. Using the Heracles design tool we build a prototype of the proposed architecture using a state-of-the-art FPGA-based platform, and deploy and test it in actual physical power electronics systems. We develop and release an open-source, small representative set of power electronics system applications that can be used for hard real-time application benchmarking.by Michel A. Kinsy.Ph.D

    Cloud migration of legacy applications

    Get PDF

    ポータビリティを意識したCMOSミックスドシグナルVLSI回路設計手法に関する研究

    Get PDF
    本研究は、半導体上に集積されたアナログ・ディジタル・メモリ回路から構成されるミクストシグナルシステムを別の製造プロセスへ移行することをポーティングとして定義し、効率的なポーティングを行うための設計方式と自動回路合成アルゴリズムを提案し、いくつかの典型的な回路に対する設計事例を示し、提案手法の妥当性を立証している。北九州市立大
    corecore