
    Multicore Performance Prediction with MPET: Using Scalability Characteristics for Statistical Cross-Architecture Prediction

    Multicore processors serve as target platforms in a broad variety of applications, ranging from high-performance computing to embedded mobile computing and automotive systems. However, the parallel programming they require opens up a huge design space of parallelization strategies, each with potential bottlenecks. An early estimate of an application’s performance is therefore a desirable development tool. Out-of-order execution, superscalar instruction pipelines, communication costs, and (shared-) cache effects all substantially influence the performance of parallel programs. Current approximate analytic models offer low modeling effort and good simulation speed, but so far deliver only moderate prediction accuracy. Virtual prototyping produces better accuracy, but requires time-consuming simulation. Even existing statistical methods often require detailed knowledge of the hardware for characterization. In this work, we present and evaluate the Multicore Performance Evaluation Tool (MPET), a statistical approach to performance prediction based on abstract runtime parameters that describe an application’s scalability behavior and can be extracted from profiles without user input. These scalability parameters not only capture the interplay of software demands and hardware capabilities, but also indicate bottlenecks. Depending on the database setup, we achieve a competitive accuracy of 20% mean prediction error (11% median), which we also demonstrate in a case study.
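    The abstract does not spell out MPET's statistical model, so the following is only a minimal sketch of a cross-architecture prediction built from scalability parameters: an Amdahl-style fit to profiled runtimes plus a nearest-neighbour lookup in a database of reference measurements. The function names, database fields (serial_fraction, t1, target_runtime), and the model itself are illustrative assumptions, not MPET's actual method.

```python
# Minimal sketch (not the MPET implementation): predict an application's
# runtime on a target architecture from scalability parameters fitted on a
# reference machine, using a small database of previously measured programs.
import math

def fit_scalability(runtimes_by_threads):
    """Fit a simple Amdahl-style model T(p) = T1 * (s + (1 - s) / p)
    to measured runtimes and return (T1, serial fraction s)."""
    t1 = runtimes_by_threads[1]
    # Closed-form least squares for s over the measured thread counts.
    num = sum((runtimes_by_threads[p] / t1 - 1.0 / p) * (1.0 - 1.0 / p)
              for p in runtimes_by_threads if p > 1)
    den = sum((1.0 - 1.0 / p) ** 2 for p in runtimes_by_threads if p > 1)
    s = min(max(num / den, 0.0), 1.0)
    return t1, s

def predict_target_runtime(app_profile, database, target, k=3):
    """k-nearest-neighbour lookup: find reference applications with similar
    scalability parameters and average their measured runtimes on `target`."""
    t1, s = fit_scalability(app_profile)
    scored = sorted(
        database,
        key=lambda ref: abs(ref["serial_fraction"] - s)
                        + abs(math.log(ref["t1"] / t1)),
    )
    neighbours = scored[:k]
    return sum(ref["target_runtime"][target] for ref in neighbours) / k

# Example: profile measured at 1, 2, and 4 threads on the reference machine.
t1, s = fit_scalability({1: 10.0, 2: 6.0, 4: 4.0})
print(f"T1 = {t1:.1f} s, serial fraction ≈ {s:.2f}")
```

    Reported accuracy would then be measured as the relative error between such predictions and runtimes actually observed on the target machine, averaged over the benchmark set (mean or median).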

    Simulation Environment with Customized RISC-V Instructions for Logic-in-Memory Architectures

    Nowadays, memory-hungry applications such as machine learning algorithms are hitting the "memory wall". Emerging memories with computational capability are seen as a promising solution: they process data inside the memory itself, so-called computation-in-memory, eliminating the need for costly data movement. Recent research shows that custom extensions of the RISC-V instruction set architecture are an effective way to support computation-in-memory operations. To further evaluate the applicability of such methods, this work enhances the standard GNU binary utilities to generate RISC-V executables containing Logic-in-Memory (LiM) operations and develops a new gem5 simulation environment that simulates the entire system (CPU, peripherals, etc.) in a cycle-accurate manner, together with a user-defined LiM module integrated into the system. This work provides a modular testbed for the research community to evaluate potential LiM solutions and hardware/software co-designs.
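    As an illustration of the kind of encoding such a toolchain extension has to emit, the sketch below packs a hypothetical R-type LiM instruction into RISC-V's reserved custom-0 opcode space. The mnemonic, funct3/funct7 values, and operand semantics are assumptions for demonstration only, not the encodings used by this work's modified binutils.

```python
# Illustrative sketch only: encode a hypothetical R-type Logic-in-Memory
# instruction in RISC-V's reserved custom-0 opcode space (0b0001011).
# Field values and the "lim.and" semantics below are assumptions for
# demonstration, not the encoding defined by the paper's toolchain.

CUSTOM_0 = 0b0001011  # major opcode reserved for custom/vendor extensions

def encode_r_type(funct7, rs2, rs1, funct3, rd, opcode=CUSTOM_0):
    """Pack the standard RISC-V R-type fields into a 32-bit instruction word:
    funct7[31:25] | rs2[24:20] | rs1[19:15] | funct3[14:12] | rd[11:7] | opcode[6:0]."""
    assert 0 <= rd < 32 and 0 <= rs1 < 32 and 0 <= rs2 < 32
    return ((funct7 & 0x7F) << 25) | ((rs2 & 0x1F) << 20) | \
           ((rs1 & 0x1F) << 15) | ((funct3 & 0x7) << 12) | \
           ((rd & 0x1F) << 7) | (opcode & 0x7F)

# Hypothetical "lim.and rd, rs1, rs2": a bitwise AND performed inside the
# memory array addressed via rs1/rs2, with a result/status written to rd.
word = encode_r_type(funct7=0b0000001, rs2=6, rs1=5, funct3=0b000, rd=7)
print(f".word 0x{word:08x}  # could also be emitted via gas's .insn directive")
```

    In a simulator such as gem5, the CPU model's decoder would map this otherwise-illegal opcode to the user-defined LiM module instead of raising a fault, which is the co-design point the testbed is meant to exercise.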