10 research outputs found

    Two-Layer Error Control Codes Combining Rectangular and Hamming Product Codes for Cache Error

    Get PDF
    We propose a novel two-layer error control code, combining error detection capability of rectangular codes and error correction capability of Hamming product codes in an efficient way, in order to increase cache error resilience for many core systems, while maintaining low power, area and latency overhead. Based on the fact of low latency and overhead of rectangular codes and high error control capability of Hamming product codes, two-layer error control codes employ simple rectangular codes for each cache line to detect cache errors, while loading the extra Hamming product code checks bits in the case of error detection; thus enabling reliable large-scale cache operations. Analysis and experiments are conducted to evaluate the cache fault-tolerant capability of various existing solutions and the proposed approach. The results show that the proposed approach can significantly increase Mean-Error-To-Failure (METF) and Mean-Time-To-failure (MTTF) up to 2.8Ă—, reduce storage overhead by over 57%, and increase instruction per-cycle (IPC) up to 7%, compared to complex four-way 4EC5ED; and it increases METF and MTTF up to 133Ă—, reduces storage overhead by over 11%, and achieves a similar IPC compared to simple eight-way single-error correcting double-error detecting (SECDED). The cost of the proposed approach is no more than 4% external memory access overhead

    Miniaturized Transistors

    Get PDF
    What is the future of CMOS? Sustaining increased transistor densities along the path of Moore's Law has become increasingly challenging with limited power budgets, interconnect bandwidths, and fabrication capabilities. In the last decade alone, transistors have undergone significant design makeovers; from planar transistors of ten years ago, technological advancements have accelerated to today's FinFETs, which hardly resemble their bulky ancestors. FinFETs could potentially take us to the 5-nm node, but what comes after it? From gate-all-around devices to single electron transistors and two-dimensional semiconductors, a torrent of research is being carried out in order to design the next transistor generation, engineer the optimal materials, improve the fabrication technology, and properly model future devices. We invite insight from investigators and scientists in the field to showcase their work in this Special Issue with research papers, short communications, and review articles that focus on trends in micro- and nanotechnology from fundamental research to applications

    Miniaturized Transistors, Volume II

    Get PDF
    In this book, we aim to address the ever-advancing progress in microelectronic device scaling. Complementary Metal-Oxide-Semiconductor (CMOS) devices continue to endure miniaturization, irrespective of the seeming physical limitations, helped by advancing fabrication techniques. We observe that miniaturization does not always refer to the latest technology node for digital transistors. Rather, by applying novel materials and device geometries, a significant reduction in the size of microelectronic devices for a broad set of applications can be achieved. The achievements made in the scaling of devices for applications beyond digital logic (e.g., high power, optoelectronics, and sensors) are taking the forefront in microelectronic miniaturization. Furthermore, all these achievements are assisted by improvements in the simulation and modeling of the involved materials and device structures. In particular, process and device technology computer-aided design (TCAD) has become indispensable in the design cycle of novel devices and technologies. It is our sincere hope that the results provided in this Special Issue prove useful to scientists and engineers who find themselves at the forefront of this rapidly evolving and broadening field. Now, more than ever, it is essential to look for solutions to find the next disrupting technologies which will allow for transistor miniaturization well beyond silicon’s physical limits and the current state-of-the-art. This requires a broad attack, including studies of novel and innovative designs as well as emerging materials which are becoming more application-specific than ever before

    Evaluating Techniques for Wireless Interconnected 3D Processor Arrays

    Get PDF
    In this thesis the viability of a wireless interconnect network for a highly parallel computer is investigated. The main theme of this thesis is to project the performance of a wireless network used to connect the processors in a parallel machine of such design. This thesis is going to investigate new design opportunities a wireless interconnect network can offer for parallel computing. A simulation environment is designed and implemented to carry out the tests. The results have shown that if the available radio spectrum is shared effectively between building blocks of the parallel machine, there are substantial chances to achieve high processor utilisation. The results show that some factors play a major role in the performance of such a machine. The size of the machine, the size of the problem and the communication and computation capabilities of each element of the machine are among those factors. The results show these factors set a limit on the number of nodes engaged in some classes of tasks. They have shown promising potential for further expansion and evolution of our idea to new architectural opportunities, which is discussed by the end of this thesis. To build a real machine of this type the architects would need to solve a number of challenging problems including heat dissipation, delivering electric power and Chip/board design; however, these issues are not part of this thesis and will be tackled in future

    Characterization and optimization of network traffic in cortical simulation

    Get PDF
    Considering the great variety of obstacles the Exascale systems have to face in the next future, a deeper attention will be given in this thesis to the interconnect and the power consumption. The data movement challenge involves the whole hierarchical organization of components in HPC systems — i.e. registers, cache, memory, disks. Running scientific applications needs to provide the most effective methods of data transport among the levels of hierarchy. On current petaflop systems, memory access at all the levels is the limiting factor in almost all applications. This drives the requirement for an interconnect achieving adequate rates of data transfer, or throughput, and reducing time delays, or latency, between the levels. Power consumption is identified as the largest hardware research challenge. The annual power cost to operate the system would be above 2.5 B$ per year for an Exascale system using current technology. The research for alternative power-efficient computing device is mandatory for the procurement of the future HPC systems. In this thesis, a preliminary approach will be offered to the critical process of co-design. Co-desing is defined as the simultaneos design of both hardware and software, to implement a desired function. This process both integrates all components of the Exascale initiative and illuminates the trade-offs that must be made within this complex undertaking

    A High Performance Advanced Encryption Standard (AES) Encrypted On-Chip Bus Architecture for Internet-of-Things (IoT) System-on-Chips (SoC)

    Get PDF
    With industry expectations of billions of Internet-connected things, commonly referred to as the IoT, we see a growing demand for high-performance on-chip bus architectures with the following attributes: small scale, low energy, high security, and highly configurable structures for integration, verification, and performance estimation. Our research thus mainly focuses on addressing these key problems and finding the balance among all these requirements that often work against each other. First of all, we proposed a low-cost and low-power System-on-Chips (SoCs) architecture (IBUS) that can frame data transfers differently. The IBUS protocol provides two novel transfer modes – the block and state modes, and is also backward compatible with the conventional linear mode. In order to evaluate the bus performance automatically and accurately, we also proposed an evaluation methodology based on the standard circuit design flow. Experimental results show that the IBUS based design uses the least hardware resource and reduces energy consumption to a half of an AMBA Advanced High-Performance Bus (AHB) and Advanced eXensible Interface (AXI). Additionally, the valid bandwidth of the IBUS based design is 2.3 and 1.6 times, respectively, compared with the AHB and AXI based implementations. As IoT advances, privacy and security issues become top tier concerns in addition to the high performance requirement of embedded chips. To leverage limited resources for tiny size chips and overhead cost for complex security mechanisms, we further proposed an advanced IBUS architecture to provide a structural support for the block-based AES algorithm. Our results show that the IBUS based AES-encrypted design costs less in terms of hardware resource and dynamic energy (60.2%), and achieves higher throughput (x1.6) compared with AXI. Effectively dealing with the automation in design and verification for mixed-signal integrated circuits is a critical problem, particularly when the bus architecture is new. Therefore, we further proposed a configurable and synthesizable IBUS design methodology. The flexible structure, together with bus wrappers, direct memory access (DMA), AES engine, memory controller, several mixed-signal verification intellectual properties (VIPs), and bus performance models (BPMs), forms the basic for integrated circuit design, allowing engineers to integrate application-specific modules and other peripherals to create complex SoCs

    Advanced analog layout design automation in compliance with density uniformity

    Get PDF
    To fabricate a reliable integrated circuit chip, foundries follow specific design rules and layout processing techniques. One of the parameters, which affect circuit performance and final electronic product quality, is the variation of thickness for each semiconductor layer within the fabricated chips. The thickness is closely dependent on the density of geometric features on that layer. Therefore, to ensure consistent thickness, foundries normally have to seriously control distribution of the feature density on each layer by using post-processing operations. In this research, the methods of controlling feature density distribution on different layers of an analog layout during the process of layout migration from an old technology to a new one or updated design specifications in the same technology have been investigated. We aim to achieve density-uniformity-aware layout retargeting for facilitating manufacturing process in the advanced technologies. This can offer an advantage right to the design stage for the designers to evaluate the effects of applying density uniformity to their drafted layouts, which are otherwise usually done by the foundries at the final manufacturing stage without considering circuit performance. Layout modification for density uniformity includes component position change and size modification, which may induce crosstalk noise caused by extra parasitic capacitance. To effectively control this effect, we have also investigated and proposed a simple yet accurate analytic method to model the parasitic capacitance on multi-layer VLSI chips. Supported by this capacitance modeling research, a unique methodology to deal with density-uniformity-aware analog layout retargeting with the capability of parasitic capacitance control has been presented. The proposed operations include layout geometry position rearrangement, interconnect size modification, and extra dummy fill insertion for enhancing layout density uniformity. All of these operations are holistically coordinated by a linear programming optimization scheme. The experimental results demonstrate the efficacy of the proposed methodology compared to the popular digital solutions in terms of minimum density variation and acute parasitic capacitance control

    Low Power Memory/Memristor Devices and Systems

    Get PDF
    This reprint focusses on achieving low-power computation using memristive devices. The topic was designed as a convenient reference point: it contains a mix of techniques starting from the fundamental manufacturing of memristive devices all the way to applications such as physically unclonable functions, and also covers perspectives on, e.g., in-memory computing, which is inextricably linked with emerging memory devices such as memristors. Finally, the reprint contains a few articles representing how other communities (from typical CMOS design to photonics) are fighting on their own fronts in the quest towards low-power computation, as a comparison with the memristor literature. We hope that readers will enjoy discovering the articles within

    Hardware / Software Architectural and Technological Exploration for Energy-Efficient and Reliable Biomedical Devices

    Get PDF
    Nowadays, the ubiquity of smart appliances in our everyday lives is increasingly strengthening the links between humans and machines. Beyond making our lives easier and more convenient, smart devices are now playing an important role in personalized healthcare delivery. This technological breakthrough is particularly relevant in a world where population aging and unhealthy habits have made non-communicable diseases the first leading cause of death worldwide according to international public health organizations. In this context, smart health monitoring systems termed Wireless Body Sensor Nodes (WBSNs), represent a paradigm shift in the healthcare landscape by greatly lowering the cost of long-term monitoring of chronic diseases, as well as improving patients' lifestyles. WBSNs are able to autonomously acquire biological signals and embed on-node Digital Signal Processing (DSP) capabilities to deliver clinically-accurate health diagnoses in real-time, even outside of a hospital environment. Energy efficiency and reliability are fundamental requirements for WBSNs, since they must operate for extended periods of time, while relying on compact batteries. These constraints, in turn, impose carefully designed hardware and software architectures for hosting the execution of complex biomedical applications. In this thesis, I develop and explore novel solutions at the architectural and technological level of the integrated circuit design domain, to enhance the energy efficiency and reliability of current WBSNs. Firstly, following a top-down approach driven by the characteristics of biomedical algorithms, I perform an architectural exploration of a heterogeneous and reconfigurable computing platform devoted to bio-signal analysis. By interfacing a shared Coarse-Grained Reconfigurable Array (CGRA) accelerator, this domain-specific platform can achieve higher performance and energy savings, beyond the capabilities offered by a baseline multi-processor system. More precisely, I propose three CGRA architectures, each contributing differently to the maximization of the application parallelization. The proposed Single, Multi and Interleaved-Datapath CGRA designs allow the developed platform to achieve substantial energy savings of up to 37%, when executing complex biomedical applications, with respect to a multi-core-only platform. Secondly, I investigate how the modeling of technology reliability issues in logic and memory components can be exploited to adequately adjust the frequency and supply voltage of a circuit, with the aim of optimizing its computing performance and energy efficiency. To this end, I propose a novel framework for workload-dependent Bias Temperature Instability (BTI) impact analysis on biomedical application results quality. Remarkably, the framework is able to determine the range of safe circuit operating frequencies without introducing worst-case guard bands. Experiments highlight the possibility to safely raise the frequency up to 101% above the maximum obtained with the classical static timing analysis. Finally, through the study of several well-known biomedical algorithms, I propose an approach allowing energy savings by dynamically and unequally protecting an under-powered data memory in a new way compared to regular error protection schemes. This solution relies on the Dynamic eRror compEnsation And Masking (DREAM) technique that reduces by approximately 21% the energy consumed by traditional error correction codes

    Miniature high dynamic range time-resolved CMOS SPAD image sensors

    Get PDF
    Since their integration in complementary metal oxide (CMOS) semiconductor technology in 2003, single photon avalanche diodes (SPADs) have inspired a new era of low cost high integration quantum-level image sensors. Their unique feature of discerning single photon detections, their ability to retain temporal information on every collected photon and their amenability to high speed image sensor architectures makes them prime candidates for low light and time-resolved applications. From the biomedical field of fluorescence lifetime imaging microscopy (FLIM) to extreme physical phenomena such as quantum entanglement, all the way to time of flight (ToF) consumer applications such as gesture recognition and more recently automotive light detection and ranging (LIDAR), huge steps in detector and sensor architectures have been made to address the design challenges of pixel sensitivity and functionality trade-off, scalability and handling of large data rates. The goal of this research is to explore the hypothesis that given the state of the art CMOS nodes and fabrication technologies, it is possible to design miniature SPAD image sensors for time-resolved applications with a small pixel pitch while maintaining both sensitivity and built -in functionality. Three key approaches are pursued to that purpose: leveraging the innate area reduction of logic gates and finer design rules of advanced CMOS nodes to balance the pixel’s fill factor and processing capability, smarter pixel designs with configurable functionality and novel system architectures that lift the processing burden off the pixel array and mediate data flow. Two pathfinder SPAD image sensors were designed and fabricated: a 96 × 40 planar front side illuminated (FSI) sensor with 66% fill factor at 8.25μm pixel pitch in an industrialised 40nm process and a 128 × 120 3D-stacked backside illuminated (BSI) sensor with 45% fill factor at 7.83μm pixel pitch. Both designs rely on a digital, configurable, 12-bit ripple counter pixel allowing for time-gated shot noise limited photon counting. The FSI sensor was operated as a quanta image sensor (QIS) achieving an extended dynamic range in excess of 100dB, utilising triple exposure windows and in-pixel data compression which reduces data rates by a factor of 3.75×. The stacked sensor is the first demonstration of a wafer scale SPAD imaging array with a 1-to-1 hybrid bond connection. Characterisation results of the detector and sensor performance are presented. Two other time-resolved 3D-stacked BSI SPAD image sensor architectures are proposed. The first is a fully integrated 5-wire interface system on chip (SoC), with built-in power management and off-focal plane data processing and storage for high dynamic range as well as autonomous video rate operation. Preliminary images and bring-up results of the fabricated 2mm² sensor are shown. The second is a highly configurable design capable of simultaneous multi-bit oversampled imaging and programmable region of interest (ROI) time correlated single photon counting (TCSPC) with on-chip histogram generation. The 6.48μm pitch array has been submitted for fabrication. In-depth design details of both architectures are discussed
    corecore