2 research outputs found

    Adding tightly-integrated task scheduling acceleration to a RISC-V multi-core processor

    Get PDF
    Task Parallelism is a parallel programming model that provides code annotation constructs to outline tasks and describe how their pointer parameters are accessed so that they might be executed in parallel, and asynchronously, by a runtime capable of inferring and honoring their data dependence relationships. It is supported by several parallelization frameworks, as OpenMP and StarSs. Overhead related to automatic dependence inference and to the scheduling of ready-to-run tasks is a major performance limiting factor of Task Parallel systems. To amortize this overhead, programmers usually trade the higher parallelism that could be leveraged from finer-grained work partitions for the higher runtime-efficiency of coarser-grained work partitions. Such problems are even more severe for systems with many cores, as the task spawning frequency required for preserving cores from starvation grows linearly with their number. To mitigate these problems, researchers have designed hardware accelerators to improve runtime performance. Nevertheless, the high CPU-accelerator communication overheads of these solutions hampered their gains. We thus propose a RISC-V based architecture that minimizes communication overhead between the HW Task Scheduler and the CPU by allowing Task Scheduling software to directly interact with the former through custom instructions. Empirical evaluation of the architecture is made possible by an FPGA prototype featuring an eight-core Linux-capable Rocket Chip implementing such instructions. To evaluate the prototype performance, we both (1) adapted Nanos, a mature Task Scheduling runtime, to benefit from the new task-scheduling-accelerating instructions; and (2) developed Phentos, a new HW-accelerated light weight Task Scheduling runtime. Our experiments show that task parallel programs using Nanos-RV --- the Nanos version ported to our system --- are on average 2.13 times faster than those being serviced by baseline Nanos, while programs running on Phentos are 13.19 times faster, considering geometric means. Using eight cores, Nanos-RV is able to deliver speedups with respect to serial execution of up to 5.62 times, while Phentos produces speedups of up to 5.72 times.This work was supported by the Spanish Government (projects SEV-2015-0493 and TIN2015-65316-P), the Generalitat de Catalunya (2017-SGR-1414 and 2017-SGR1328), FAPESP (grants 2017/02682-2, 2018/00687-0, and 2014/25694-8), CNPq (grant 408782/2016-1), and CAPES.Peer ReviewedPostprint (author's final draft

    Analysis on the Possibility of RISC-V Adoption

    Get PDF
    As the interface between hardware and software, Instruction Set Architectures (ISAs) play a key role in the operation of computers. While both hardware and software have continued to evolve rapidly over time, ISAs have undergone minimal change. Since its release in 2010, RISC-V has begun to erode the industry aversion to ISA innovation. Established on the principals of the Reduced Instruction Set Computer (RISC), and as an open source ISA, RISC-V offers many benefits over popular ISAs like Intel’s x86 and Arm Holding’s Advanced RISC Machine (ARM). In this literature review I evaluate the literature discussing: What makes changing Instruction Set Architectures difficultWhy might the industry choose to implement RISC-V When researching this topic I visited the IEEE (Institute of Electrical and Electronics Engineers), INSPEC (Engineering Village), and ACM (Association for Computing Machinery) Digital Library databases. I used the search terms, “RISC-V”, “Instruction Set Architecture”, “RISC-V” AND “x86”, and “RISC-V” AND “Instruction Set Architecture”. This literature review evaluates 10 papers on implementation of RISC-V. As this paper was intended to cover recent developments in the field, publication dates were limited to from 2015 to present
    corecore