17,832 research outputs found

    Resource Sharing in Custom Instruction Set Extensions

    Get PDF
    Customised processor performance generally increases as additional custom instructions are added. However, performance is not the only metric that modern systems must take into account; die area and energy efficiency are equally important. Resource sharing during synthesis of instruction set extensions (ISEs) can reduce significantly the die area and energy consumption of a customized processor. This may increase the number of custom instructions that can be synthesized with a given area budget. Resource sharing involves combining the graph representations of two or more ISEs which contain a similar sub-graph. This coupling of multiple sub-graphs, if performed naively, can increase the latency of the extension instructions considerably. And yet, as we show in this paper, an appropriate level of resource sharing provides a significantly simpler design with only modest increases in average latency for extension instructions. Based on existing resource-sharing techniques, this study presents a new heuristic that controls the degree of resource sharing between a given set of custom instructions. Our main contributions are the introduction of a parametric method for exploring the trade-offs that can be achieved between instruction latency and implementation complexity, and the coupling of design-space exploration with fast area-delay models for the operators comprising each ISE. We present experimental evidence that our heuristic exposes a broad range of design points, allowing advantageous trade-offs between die area and latency to be found and exploited

    HERO: Heterogeneous Embedded Research Platform for Exploring RISC-V Manycore Accelerators on FPGA

    Full text link
    Heterogeneous embedded systems on chip (HESoCs) co-integrate a standard host processor with programmable manycore accelerators (PMCAs) to combine general-purpose computing with domain-specific, efficient processing capabilities. While leading companies successfully advance their HESoC products, research lags behind due to the challenges of building a prototyping platform that unites an industry-standard host processor with an open research PMCA architecture. In this work we introduce HERO, an FPGA-based research platform that combines a PMCA composed of clusters of RISC-V cores, implemented as soft cores on an FPGA fabric, with a hard ARM Cortex-A multicore host processor. The PMCA architecture mapped on the FPGA is silicon-proven, scalable, configurable, and fully modifiable. HERO includes a complete software stack that consists of a heterogeneous cross-compilation toolchain with support for OpenMP accelerator programming, a Linux driver, and runtime libraries for both host and PMCA. HERO is designed to facilitate rapid exploration on all software and hardware layers: run-time behavior can be accurately analyzed by tracing events, and modifications can be validated through fully automated hard ware and software builds and executed tests. We demonstrate the usefulness of HERO by means of case studies from our research

    S-FaaS: Trustworthy and Accountable Function-as-a-Service using Intel SGX

    Full text link
    Function-as-a-Service (FaaS) is a recent and already very popular paradigm in cloud computing. The function provider need only specify the function to be run, usually in a high-level language like JavaScript, and the service provider orchestrates all the necessary infrastructure and software stacks. The function provider is only billed for the actual computational resources used by the function invocation. Compared to previous cloud paradigms, FaaS requires significantly more fine-grained resource measurement mechanisms, e.g. to measure compute time and memory usage of a single function invocation with sub-second accuracy. Thanks to the short duration and stateless nature of functions, and the availability of multiple open-source frameworks, FaaS enables non-traditional service providers e.g. individuals or data centers with spare capacity. However, this exacerbates the challenge of ensuring that resource consumption is measured accurately and reported reliably. It also raises the issues of ensuring computation is done correctly and minimizing the amount of information leaked to service providers. To address these challenges, we introduce S-FaaS, the first architecture and implementation of FaaS to provide strong security and accountability guarantees backed by Intel SGX. To match the dynamic event-driven nature of FaaS, our design introduces a new key distribution enclave and a novel transitive attestation protocol. A core contribution of S-FaaS is our set of resource measurement mechanisms that securely measure compute time inside an enclave, and actual memory allocations. We have integrated S-FaaS into the popular OpenWhisk FaaS framework. We evaluate the security of our architecture, the accuracy of our resource measurement mechanisms, and the performance of our implementation, showing that our resource measurement mechanisms add less than 6.3% latency on standardized benchmarks

    ASAM : Automatic Architecture Synthesis and Application Mapping; dl. 3.2: Instruction set synthesis

    Get PDF
    No abstract

    An assessment of the connection machine

    Get PDF
    The CM-2 is an example of a connection machine. The strengths and problems of this implementation are considered as well as important issues in the architecture and programming environment of connection machines in general. These are contrasted to the same issues in Multiple Instruction/Multiple Data (MIMD) microprocessors and multicomputers
    • …
    corecore