34,794 research outputs found

    Leveraging register windows to reduce physical registers to the bare minimum

    Get PDF
    Register window is an architectural technique that reduces memory operations required to save and restore registers across procedure calls. Its effectiveness depends on the size of the register file. Such register requirements are normally increased for out-of-order execution because it requires registers for the in-flight instructions, in addition to the architectural ones. However, a large register file has an important cost in terms of area and power and may even affect the cycle time. In this paper, we propose a software/hardware early register release technique that leverage register windows to drastically reduce the register requirements, and hence, reduce the register file cost. Contrary to the common belief that out-of-order processors with register windows would need a large physical register file, this paper shows that the physical register file size may be reduced to the bare minimum by using this novel microarchitecture. Moreover, our proposal has much lower hardware complexity than previous approaches, and requires minimal changes to a conventional register window scheme. Performance studies show that the proposed technique can reduce the number of physical registers to the number of logical registers plus one (minimum number to guarantee forward progress) and still achieve almost the same performance as an unbounded register file.Peer ReviewedPostprint (published version

    NanoMagnetic Logic Microprocessor Hierarchical Power Model

    Get PDF
    The interest on emerging nanotechnologies has been recently focused on NanoMagnetic Logic (NML), which has unique appealing features. NML circuits have a very low power consumption and, due to their magnetic nature, they maintain the information safely stored even without power supply. The nature of these circuits is highly different from the CMOS ones. As a consequence, to better understand NML logic, complex circuits and not only simple gates must be designed. This constraint calls for a new design and simulation methodology. It should efficiently encompass manifold properties: 1) being based on commonly used hardware description language (HDL) in order to easily manage complexity and hierarchy; 2) maintaining a clear link with physical characteristics 3) modeling performance aspects like speed and power, together with logic behavior. In this contribution we present a VHDL behavioral model for NML circuits, which allows to evaluate not only logic behavior but also power dissipation. It is based on a technological solution called ``snake-clock''. We demonstrate this model on a case study which offers the right variety of internal substructures to test the method: a four bit microprocessor designed using asynchronous logic. The model enables a hierarchical bottom-up evaluation of the processor logic behavior, area and power dissipation, which we evaluated using as benchmark a division algorithm. Results highlight the flexibility and the efficiency of this model, and the remarkable improvements that it brings to the analysis of NML circuit

    DR.SGX: Hardening SGX Enclaves against Cache Attacks with Data Location Randomization

    Full text link
    Recent research has demonstrated that Intel's SGX is vulnerable to various software-based side-channel attacks. In particular, attacks that monitor CPU caches shared between the victim enclave and untrusted software enable accurate leakage of secret enclave data. Known defenses assume developer assistance, require hardware changes, impose high overhead, or prevent only some of the known attacks. In this paper we propose data location randomization as a novel defensive approach to address the threat of side-channel attacks. Our main goal is to break the link between the cache observations by the privileged adversary and the actual data accesses by the victim. We design and implement a compiler-based tool called DR.SGX that instruments enclave code such that data locations are permuted at the granularity of cache lines. We realize the permutation with the CPU's cryptographic hardware-acceleration units providing secure randomization. To prevent correlation of repeated memory accesses we continuously re-randomize all enclave data during execution. Our solution effectively protects many (but not all) enclaves from cache attacks and provides a complementary enclave hardening technique that is especially useful against unpredictable information leakage

    Virtual Machine Support for Many-Core Architectures: Decoupling Abstract from Concrete Concurrency Models

    Get PDF
    The upcoming many-core architectures require software developers to exploit concurrency to utilize available computational power. Today's high-level language virtual machines (VMs), which are a cornerstone of software development, do not provide sufficient abstraction for concurrency concepts. We analyze concrete and abstract concurrency models and identify the challenges they impose for VMs. To provide sufficient concurrency support in VMs, we propose to integrate concurrency operations into VM instruction sets. Since there will always be VMs optimized for special purposes, our goal is to develop a methodology to design instruction sets with concurrency support. Therefore, we also propose a list of trade-offs that have to be investigated to advise the design of such instruction sets. As a first experiment, we implemented one instruction set extension for shared memory and one for non-shared memory concurrency. From our experimental results, we derived a list of requirements for a full-grown experimental environment for further research
    corecore