Skybridge: A New Nanoscale 3-D Computing Framework for Future Integrated Circuits
Continuous scaling of CMOS has been the major catalyst in the miniaturization of integrated circuits (ICs) and crucial for global socio-economic progress. However, continuing the traditional way of scaling to sub-20nm technologies is proving to be very difficult, as MOSFETs are reaching their fundamental performance limits [1] and the interconnection bottleneck is dominating IC operational power and performance [2]. Migrating to 3-D, as a way to advance scaling, has been elusive due to inherent customization and manufacturing requirements in CMOS architecture that are incompatible with 3-D organization. Partial attempts with die-die [3] and layer-layer [4] stacking have their own limitations [5]. We propose a new 3-D IC fabric technology, Skybridge [6], which offers a paradigm shift in technology scaling as well as design. We co-architect Skybridge’s core aspects, from device to circuit style, connectivity, thermal management, and manufacturing pathway, in a 3-D fabric-centric manner, building on a uniform 3-D template. Our extensive bottom-up simulations, accounting for detailed material system structures, manufacturing process, device, and circuit parasitics, carried through for several designs including a designed microprocessor, reveal 30-60x density and 3.5x performance/watt benefits, and a 10x reduction in interconnect lengths vs. scaled 16-nm CMOS [6]. Fabric-level heat extraction features are found to be effective in managing IC thermal profiles in 3-D. This 3-D integrated fabric proposal overcomes the current impasse of CMOS in a manner that can be immediately adopted, and offers a unique solution to continue technology scaling in the 21st century.
Skybridge: 3-D Integrated Circuit Technology Alternative to CMOS
Continuous scaling of CMOS has been the major catalyst in miniaturization of
integrated circuits (ICs) and crucial for global socio-economic progress.
However, scaling to sub-20nm technologies is proving to be challenging as
MOSFETs are reaching their fundamental limits and the interconnection bottleneck is
dominating IC operational power and performance. Migrating to 3-D, as a way to
advance scaling, has eluded us due to inherent customization and manufacturing
requirements in CMOS that are incompatible with 3-D organization. Partial
attempts with die-die and layer-layer stacking have their own limitations. We
propose a 3-D IC fabric technology, Skybridge[TM], which offers a paradigm shift
in technology scaling as well as design. We co-architect Skybridge's core
aspects, from device to circuit style, connectivity, thermal management, and
manufacturing pathway in a 3-D fabric-centric manner, building on a uniform 3-D
template. Our extensive bottom-up simulations, accounting for detailed material
system structures, manufacturing process, device, and circuit parasitics,
carried through for several designs including a designed microprocessor, reveal
30-60x density and 3.5x performance-per-watt benefits, and a 10x reduction in
interconnect lengths vs. scaled 16-nm CMOS. Fabric-level heat extraction
features are shown to successfully manage IC thermal profiles in 3-D. Skybridge
can provide continuous scaling of integrated circuits beyond CMOS in the 21st
century.
Comment: 53 pages
FPGA BASED PARALLEL IMPLEMENTATION OF STACKED ERROR DIFFUSION ALGORITHM
Digital halftoning is a crucial technique used in digital printers to convert a continuous-tone image into a pattern of black and white dots. Halftoning is used because printers have a limited set of inks and cannot reproduce all the color intensities in a continuous-tone image. Error Diffusion is a halftoning algorithm that iteratively quantizes pixels in a neighborhood-dependent fashion. This thesis focuses on the development and design of a parallel, scalable hardware architecture for high-performance implementation of a high-quality Stacked Error Diffusion algorithm. The algorithm is described in C and requires significant processing time when implemented on a conventional CPU. Thus, a new hardware processor architecture is developed to implement the algorithm, and it is implemented and tested on a Xilinx Virtex 5 FPGA chip. The run time of the algorithm decreases dramatically when it runs on the newly proposed parallel architecture implemented in FPGA technology, compared to execution on a single CPU. The new parallel architecture is described using the Verilog Hardware Description Language (HDL). It is validated through post-synthesis and post-implementation, performance-based HDL simulation using the ModelSim CAD simulation tool.
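To make the "neighborhood-dependent" quantization concrete, here is a minimal Python sketch of classic Floyd-Steinberg error diffusion. It is a stand-in chosen for illustration: the abstract does not specify the Stacked Error Diffusion kernel, and the function name `floyd_steinberg`, the list-of-rows image format, and the threshold of 128 are all assumptions, not details from the thesis.

```python
def floyd_steinberg(image, threshold=128):
    """Halftone a grayscale image (list of rows, values 0..255) to 0/255 dots.

    Illustrative Floyd-Steinberg kernel, not the thesis's Stacked Error
    Diffusion algorithm (which the abstract does not describe in detail).
    """
    height, width = len(image), len(image[0])
    # Work on a float copy so accumulated error is not truncated.
    work = [[float(v) for v in row] for row in image]
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            old = work[y][x]
            new = 255 if old >= threshold else 0  # quantize to one bit
            out[y][x] = new
            err = old - new
            # Push the quantization error onto not-yet-visited neighbors.
            if x + 1 < width:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < width:
                    work[y + 1][x + 1] += err * 1 / 16
    return out


if __name__ == "__main__":
    gray = [[127] * 8 for _ in range(8)]  # hypothetical mid-gray test patch
    for row in floyd_steinberg(gray):
        print("".join("#" if v else "." for v in row))
```

In this sketch each pixel depends on error pushed from its left and upper neighbors, so a plain raster scan is inherently serial. That data dependency is the kind of constraint a parallel hardware architecture for error diffusion has to work around.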
Energy challenges for ICT
The energy consumption from the expanding use of information and communications technology (ICT) is unsustainable with present drivers, and it will impact heavily on the future climate change. However, ICT devices have the potential to contribute significantly to the reduction of CO2 emission and enhance resource efficiency in other sectors, e.g., transportation (through intelligent transportation and advanced driver assistance systems and self-driving vehicles), heating (through smart building control), and manufacturing (through digital automation based on smart autonomous sensors). To address the energy sustainability of ICT and capture the full potential of ICT in resource efficiency, a multidisciplinary ICT-energy community needs to be brought together covering devices, microarchitectures, ultra large-scale integration (ULSI), high-performance computing (HPC), energy harvesting, energy storage, system design, embedded systems, efficient electronics, static analysis, and computation. In this chapter, we introduce challenges and opportunities in this emerging field and a common framework to strive towards energy-sustainable ICT.
Architecting NP-Dynamic Skybridge
With the scaling of technology nodes, modern CMOS integrated circuits face severe fundamental challenges that stem from device scaling limitations, interconnection bottlenecks and increasing manufacturing complexities. These challenges drive researchers to look for revolutionary technologies beyond the end of the CMOS roadmap. Towards this end, a new nanoscale 3-D computing fabric for future integrated circuits, Skybridge, has been proposed [1]. In this new fabric, core aspects from device to circuit style, connectivity, thermal management and manufacturing pathway are co-architected in a 3-D fabric-centric manner.
However, the Skybridge fabric uses only n-type transistors in a dynamic circuit style for logic and memory implementations. Therefore, it requires complicated clocking schemes to overcome the signal monotonicity problem associated with cascading dynamic logic gates. For Skybridge’s large-scale circuits, the dynamic circuit style requires cascaded stages to be micro-pipelined, which results in a large number of buffers used for storing minterms, causing significant overhead in terms of area and power. Moreover, implementation of logic is limited to NAND or AND-of-NAND based logic expressions, which does not always result in compact circuits. In this work, we propose an extension of the original Skybridge fabric, called NP-Dynamic-Skybridge, to solve these challenges by using both n- and p-type transistors in an innovative circuit style. Here, every stage in a given circuit is implemented by either n-type or p-type dynamic logic.
Cascading n- and p-type dynamic logic effectively avoids the signal monotonicity problem and allows combinational-like circuit implementation. This simplifies the clocking scheme for cascaded logic, which requires only one set of global precharge and evaluate clock signals. It also broadens the range of logic expressions that can be implemented, enabling NOR and OR-of-NORs in addition to those previously mentioned. Furthermore, the number of pipeline stages is significantly reduced for a given logic function, and fewer buffers are required than in the Skybridge 3-D fabric, which improves area and power metrics. Initial evaluation of NP-Dynamic-Skybridge’s 4-bit carry look-ahead adder shows up to 2x density benefits over the Skybridge 3-D fabric and at least 17% power/throughput benefit.
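The signal monotonicity problem can be illustrated with a toy behavioral model in Python. This is a sketch under stated assumptions, not the paper's circuits: `n_stage`, `p_stage`, and `evaluate` are hypothetical names, each stage is reduced to a one-input gate, and the update order is deliberately chosen so that the second stage samples the first stage's output before it has settled. The model assumes the usual dynamic-logic behavior: an n-type stage is precharged high and can only discharge during evaluation, while a p-type stage is predischarged low and can only charge.

```python
def n_stage(out, inputs):
    # n-type dynamic NAND: precharged to 1, discharges once all inputs are 1
    # and cannot recover until the next precharge.
    return 0 if all(inputs) else out


def p_stage(out, inputs):
    # p-type dynamic NOR: predischarged to 0, charges once all inputs are 0
    # and cannot fall back until the next predischarge.
    return 1 if not any(inputs) else out


def evaluate(a, second):
    """Two cascaded one-input stages: stage 1 is n-type, stage 2 is `second`.

    Returns (stage1_out, stage2_out) after the evaluate phase. The logically
    correct result is stage1_out = NOT a and stage2_out = NOT stage1_out.
    """
    s1 = 1                            # n-type stage starts precharged high
    s2 = 1 if second == "n" else 0    # p-type stage starts predischarged low
    for _ in range(2):
        # Stage 2 reacts to the old value of s1 (worst-case timing).
        step = n_stage if second == "n" else p_stage
        new_s2 = step(s2, [s1])
        new_s1 = n_stage(s1, [a])
        s1, s2 = new_s1, new_s2
    return s1, s2


if __name__ == "__main__":
    for a in (0, 1):
        print("a =", a, "n->n:", evaluate(a, "n"), "n->p:", evaluate(a, "p"))
```

In this model the n-to-n cascade fails for `a = 1`: the second stage sees the precharged high value, discharges early, and cannot recover when the first stage later falls. The n-to-p cascade is immune because the falling output of an n-type stage is exactly the transition a p-type stage waits for.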
Main memory in HPC: do we need more, or could we live with less?
An important aspect of High-Performance Computing (HPC) system design is the choice of main memory capacity. This choice becomes increasingly important now that 3D-stacked memories are entering the market. Compared with conventional Dual In-line Memory Modules (DIMMs), 3D memory chiplets provide better performance and energy efficiency but lower memory capacities. Therefore, the adoption of 3D-stacked memories in the HPC domain depends on whether we can find use cases that require much less memory than is available now.
This study analyzes the memory capacity requirements of important HPC benchmarks and applications. We find that the High-Performance Conjugate Gradients (HPCG) benchmark could be an important success story for 3D-stacked memories in HPC, but High-Performance Linpack (HPL) is likely to be constrained by 3D memory capacity. The study also emphasizes that the analysis of memory footprints of production HPC applications is complex and that it requires an understanding of application scalability and target category, i.e., whether the users target capability or capacity computing. The results show that most of the HPC applications under study have per-core memory footprints in the range of hundreds of megabytes, but we also detect applications and use cases that require gigabytes per core. Overall, the study identifies the HPC applications and use cases with memory footprints that could be provided by 3D-stacked memory chiplets, making a first step toward adoption of this novel technology in the HPC domain.
This work was supported by the Collaboration Agreement between Samsung Electronics Co., Ltd. and BSC, Spanish Government through Severo Ochoa programme (SEV-2015-0493), by the Spanish Ministry of Science and Technology through TIN2015-65316-P project and by the Generalitat de Catalunya (contracts 2014-SGR-1051 and 2014-SGR-1272). This work has also received funding from the European Union’s Horizon 2020 research and innovation programme under ExaNoDe project (grant agreement No 671578). Darko Zivanovic holds the Severo Ochoa grant (SVP-2014-068501) of the Ministry of Economy and Competitiveness of Spain. The authors thank Harald Servat from BSC and Vladimir Marjanović from High Performance Computing Center Stuttgart for their technical support.