13,362 research outputs found

    Nanowire Volatile RAM as an Alternative to SRAM

    Full text link
    Maintaining benefits of CMOS technology scaling is becoming challenging due to increased manufacturing complexities and unwanted passive power dissipations. This is particularly challenging in SRAM, where manufacturing precision and leakage power control are critical issues. To alleviate some of these challenges a novel non-volatile memory alternative to SRAM was proposed called nanowire volatile RAM (NWRAM). Due to NWRAMs regular grid based layout and innovative circuit style, manufacturing complexity is reduced and at the same time considerable benefits are attained in terms of performance and leakage power reduction. In this paper, we elaborate more on NWRAM circuit aspects and manufacturability, and quantify benefits at 16nm technology node through simulation against state-of-the-art 6T-SRAM and gridded 8T-SRAM designs. Our results show the 10T-NWRAM to be 2x faster and 35x better in terms of leakage when compared to high performance gridded 8T-SRAM design

    A methodology for analyzing commercial processor performance numbers

    Get PDF
    The wealth of performance numbers provided by benchmarking corporations makes it difficult to detect trends across commercial machines. A proposed methodology, based on statistical data analysis, simplifies exploration of these machines' large datasets

    Vector processing-aware advanced clock-gating techniques for low-power fused multiply-add

    Get PDF
    The need for power efficiency is driving a rethink of design decisions in processor architectures. While vector processors succeeded in the high-performance market in the past, they need a retailoring for the mobile market that they are entering now. Floating-point (FP) fused multiply-add (FMA), being a functional unit with high power consumption, deserves special attention. Although clock gating is a well-known method to reduce switching power in synchronous designs, there are unexplored opportunities for its application to vector processors, especially when considering active operating mode. In this research, we comprehensively identify, propose, and evaluate the most suitable clock-gating techniques for vector FMA units (VFUs). These techniques ensure power savings without jeopardizing the timing. We evaluate the proposed techniques using both synthetic and “real-world” application-based benchmarking. Using vector masking and vector multilane-aware clock gating, we report power reductions of up to 52%, assuming active VFU operating at the peak performance. Among other findings, we observe that vector instruction-based clock-gating techniques achieve power savings for all vector FP instructions. Finally, when evaluating all techniques together, using “real-world” benchmarking, the power reductions are up to 80%. Additionally, in accordance with processor design trends, we perform this research in a fully parameterizable and automated fashion.The research leading to these results has received funding from the RoMoL ERC Advanced Grant GA 321253 and is supported in part by the European Union (FEDER funds) under contract TTIN2015-65316-P. The work of I. Ratkovic was supported by a FPU research grant from the Spanish MECD.Peer ReviewedPostprint (author's final draft

    Towards more accurate real time testing

    Get PDF
    The languages Message Sequence Charts (MSC) [1], System Design Language1 (SDL) [2] and Testing and Test Control Notation Testing2 (TTCN-3) [3] have been developed for the design, modelling and testing of complex software systems. These languages have been developed to complement one another in the software development process. Each of these languages has features for describing, analysing or testing the real time properties of systems. Robust toolsets exist which provide integrated environments for the design, analysis and testing of systems, and it is claimed, for the complete development of real time systems. It was shown in [4] however, that there are fundamental problems with the SDL language and its associated tools for modelling and reasoning about real time systems. In this paper we present the limitations of TTCN-3 and propose recommendations which help minimise the timing inaccuracies that would otherwise occur in using the language directly

    Cloud Benchmarking for Performance

    Get PDF
    How can applications be deployed on the cloud to achieve maximum performance? This question has become significant and challenging with the availability of a wide variety of Virtual Machines (VMs) with different performance capabilities in the cloud. The above question is addressed by proposing a six step benchmarking methodology in which a user provides a set of four weights that indicate how important each of the following groups: memory, processor, computation and storage are to the application that needs to be executed on the cloud. The weights along with cloud benchmarking data are used to generate a ranking of VMs that can maximise performance of the application. The rankings are validated through an empirical analysis using two case study applications; the first is a financial risk application and the second is a molecular dynamics simulation, which are both representative of workloads that can benefit from execution on the cloud. Both case studies validate the feasibility of the methodology and highlight that maximum performance can be achieved on the cloud by selecting the top ranked VMs produced by the methodology.Comment: 6 pages, 6th IEEE International Conference on Cloud Computing Technology and Science (IEEE CloudCom) 2014, Singapor
    • …
    corecore