751 research outputs found

    The "MIND" Scalable PIM Architecture

    Get PDF
    MIND (Memory, Intelligence, and Network Device) is an advanced parallel computer architecture for high performance computing and scalable embedded processing. It is a Processor-in-Memory (PIM) architecture integrating both DRAM bit cells and CMOS logic devices on the same silicon die. MIND is multicore with multiple memory/processor nodes on each chip and supports global shared memory across systems of MIND components. MIND is distinguished from other PIM architectures in that it incorporates mechanisms for efficient support of a global parallel execution model based on the semantics of message-driven multithreaded split-transaction processing. MIND is designed to operate either in conjunction with other conventional microprocessors or in standalone arrays of like devices. It also incorporates mechanisms for fault tolerance, real time execution, and active power management. This paper describes the major elements and operational methods of the MIND architecture

    Channel Characterization for Chip-scale Wireless Communications within Computing Packages

    Get PDF
    Wireless Network-on-Chip (WNoC) appears as a promising alternative to conventional interconnect fabrics for chip-scale communications. WNoC takes advantage of an overlaid network composed by a set of millimeter-wave antennas to reduce latency and increase throughput in the communication between cores. Similarly, wireless inter-chip communication has been also proposed to improve the information transfer between processors, memory, and accelerators in multi-chip settings. However, the wireless channel remains largely unknown in both scenarios, especially in the presence of realistic chip packages. This work addresses the issue by accurately modeling flip-chip packages and investigating the propagation both its interior and its surroundings. Through parametric studies, package configurations that minimize path loss are obtained and the trade-offs observed when applying such optimizations are discussed. Single-chip and multi-chip architectures are compared in terms of the path loss exponent, confirming that the amount of bulk silicon found in the pathway between transmitter and receiver is the main determinant of losses.Comment: To be presented 12th IEEE/ACM International Symposium on Networks-on-Chip (NOCS 2018); Torino, Italy; October 201

    Wavelength-multiplexed duplex transceiver based on III-V/Si hybrid integration for off-chip and on-chip optical interconnects

    Get PDF
    A six-channel wavelength-division-multiplexed optical transceiver with a compact footprint of 1.5 x 0.65 mm(2) for off-chip and on-chip interconnects is demonstrated on a single silicon-on-insulator chip. An arrayed waveguide grating is used as the (de)multiplexer, and III-V electroabsorption sections fabricated by hybrid integration technology are used as both modulators and detectors, which also enable duplex links. The 30-Gb/s capacity for each of the six wavelength channels for the off-chip transceiver is demonstrated. For the on-chip interconnect, an electrical-to-electrical 3-dB bandwidth of 13 GHz and a data rate of 30 Gb/s per wavelength are achieved

    A Performance Comparison Using HPC Benchmarks: Windows HPC Server 2008 and Red Hat Enterprise Linux 5

    Get PDF
    This document was developed with support from the National Science Foundation (NSF) under Grant No. 0910812 to Indiana University for ”FutureGrid: An Experimental, High-Performance Grid Test-bed.” Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the NSF.A collection of performance benchmarks have been run on an IBM System X iDataPlex cluster using two different operating systems. Windows HPC Server 2008 (WinHPC) and Red Hat Enterprise Linux v5.4 (RHEL5) are compared using SPEC MPI2007 v1.1, the High Performance Computing Challenge (HPCC) and National Science Foundation (NSF) acceptance test benchmark suites. Overall, we find the performance of WinHPC and RHEL5 to be equivalent but significant performance differences exist when analyzing specific applications. We focus on presenting the results from the application benchmarks and include the results of the HPCC microbenchmark for completeness

    Survey and Analysis of Production Distributed Computing Infrastructures

    Full text link
    This report has two objectives. First, we describe a set of the production distributed infrastructures currently available, so that the reader has a basic understanding of them. This includes explaining why each infrastructure was created and made available and how it has succeeded and failed. The set is not complete, but we believe it is representative. Second, we describe the infrastructures in terms of their use, which is a combination of how they were designed to be used and how users have found ways to use them. Applications are often designed and created with specific infrastructures in mind, with both an appreciation of the existing capabilities provided by those infrastructures and an anticipation of their future capabilities. Here, the infrastructures we discuss were often designed and created with specific applications in mind, or at least specific types of applications. The reader should understand how the interplay between the infrastructure providers and the users leads to such usages, which we call usage modalities. These usage modalities are really abstractions that exist between the infrastructures and the applications; they influence the infrastructures by representing the applications, and they influence the ap- plications by representing the infrastructures
    corecore