5 research outputs found

    Hi-Rise: A high-radix switch for 3D integration with single-cycle arbitration

    Abstract: This paper proposes a novel 3D switch, called 'Hi-Rise', that employs high-radix switches to efficiently route data across multiple stacked layers of dies. The proposed interconnect is hierarchical and composed of two switches per silicon layer and a set of dedicated layer-to-layer channels. However, a hierarchical 3D switch can lead to unfair arbitration across different layers. To address this, the paper proposes a unique class-based arbitration scheme that is fully integrated into the switching fabric and is easy to implement. It makes the 3D hierarchical switch's fairness comparable to that of a flat 2D switch with least recently granted arbitration. The 3D switch is evaluated for different radices, numbers of stacked layers, and 3D integration technologies. A 64-radix, 128-bit-wide, 4-layer Hi-Rise evaluated in a 32nm technology has a throughput of 10.65 Tbps for uniform random traffic. Compared to a 2D design, this corresponds to a 15% improvement in throughput, a 33% area reduction, a 20% latency reduction, and a 38% reduction in energy per transaction.
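The abstract uses a flat 2D switch with least recently granted (LRG) arbitration as its fairness baseline. As a rough illustration of that baseline only, the sketch below models an LRG arbiter in software. It is not the paper's class-based scheme or its circuit; the class name `LeastRecentlyGrantedArbiter`, the method `arbitrate`, and the 4-port example are all assumptions made for this example.

```python
from typing import List, Optional, Sequence


class LeastRecentlyGrantedArbiter:
    """Software model of a least-recently-granted (LRG) arbiter.

    Hypothetical illustration; not the Hi-Rise arbitration circuit.
    """

    def __init__(self, num_ports: int) -> None:
        # Front of the list = granted longest ago = highest priority.
        self.priority: List[int] = list(range(num_ports))

    def arbitrate(self, requests: Sequence[bool]) -> Optional[int]:
        """Grant the requesting port that was granted longest ago."""
        for port in self.priority:
            if requests[port]:
                # The winner moves to the back, i.e. lowest priority.
                self.priority.remove(port)
                self.priority.append(port)
                return port
        return None  # no port requested in this cycle


# Assumed example: 4 input ports that all request on every cycle.
arbiter = LeastRecentlyGrantedArbiter(num_ports=4)
all_requesting = [True, True, True, True]
grants = [arbiter.arbitrate(all_requesting) for _ in range(8)]
```

Under constant contention each port in this model is granted once every four cycles, which is the kind of even service the abstract's fairness comparison refers to.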

    Cross-point Circuits for Computation, Interconnects, Security and Storage

    Limited supply-voltage scaling in newer semiconductor technology nodes has led to an increase in power density and stagnation of the clock frequency of microprocessors. To overcome this challenge, the trend for performance-intensive parts has been to replace the power-hungry, high-speed cores with a number of simpler, power-efficient cores to form a many-core system. For many-core systems, the major challenge is the need to share data and handshaking signals between the cores working in parallel. On the other hand, battery-operated mobile systems have a much lower power budget but still need high performance during active mode. For mobile systems, despite the use of aggressive voltage-frequency scaling techniques for energy efficiency, the conventional architecture still requires a large amount of energy for data movement between the core and the memory compared to the actual computation. Finally, at the lowest end of the energy spectrum are systems for the Internet of Things (IoT), which mainly consist of sensor nodes that intermittently wake up and log sensed data. Reducing data-logging energy is a major challenge for IoT systems. This dissertation presents cross-point circuit-based solutions for improved energy efficiency for each of the three design types: many-core systems, mobile systems, and IoT systems. Cross-point circuits are array-based circuits composed of small unit blocks. These unit blocks can be easily optimized for both area and energy, and then tiled together to form large scalable circuits. First, this dissertation presents a cross-point-based interconnect, Hi-Rise, which is a low-latency, area- and energy-efficient 3D switch for efficient communication among a large number of cores in a many-core system. Second, it presents a configurable SRAM circuit that can perform search (content-addressable memory) and logical functions within the memory using standard SRAM bit-cells to reduce data movement. It can also be repurposed as a Physically Unclonable Function (PUF) for hardware authentication. Finally, this dissertation presents circuit designs for two non-volatile memory solutions optimized for ultra-low-energy data logging in IoT systems. The proposed flash memory design has an ultra-wide 1Kb/program cycle, enabled by charge sharing and charge recycling, whereas the adiabatic Ferroelectric-RAM uses a resonating sine wave for low-energy memory access.
    PhD, Electrical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
    https://deepblue.lib.umich.edu/bitstream/2027.42/137048/1/sjeloka_1.pd