22 research outputs found

    Exploring iGPU Memory Interference Response to L2 Cache Locking


    Task and Memory Mapping Optimization for SDRAM Interference Minimization on Heterogeneous MPSoCs

    DDR SDRAM memories are commonly used resources on multicore platforms and are hence a main source of interference. To deal with this issue, we propose a methodology based on task/memory mapping optimization through multi-objective heuristic-based algorithms. By placing the tasks on the platform cores and the memory in the DDR SDRAM banks, we minimize DDR SDRAM interference while considering other aspects such as task execution parallelism and deadline margin. To evaluate the fitness of a task/memory map, the optimization algorithms use cost functions. To compute the DDR memory interference cost, we use a fast-executing, self-designed cost function. Execution parallelism is computed using the workload variance cost function. The deadline margin of a task is computed considering inter- and intra-core interference. The task/memory mapping outcomes are checked through tests on the heterogeneous MPSoCs Keystone II and Sitara AM5728. To support certification, the WCET constraints of the resulting near-optimal Pareto solutions are verified through formally validated bounding frameworks.
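
The abstract above names a workload variance cost function but does not give its form. As a rough illustration only, one plausible reading is the variance of the total load assigned to each core, so that a lower value means a better-balanced (more parallel) mapping. The function name, the load units, and the task names below are hypothetical and are not taken from the paper.

```python
from statistics import pvariance


def workload_variance(mapping, task_load, n_cores):
    """Population variance of per-core load for a task-to-core mapping.

    Assumption: this is one plausible form of a "workload variance" cost,
    not the formula used by the authors.
    """
    per_core = [0.0] * n_cores
    for task, core in mapping.items():
        per_core[core] += task_load[task]
    return pvariance(per_core)


# Hypothetical task loads (arbitrary units) and two candidate mappings.
task_load = {"t0": 4.0, "t1": 2.0, "t2": 2.0, "t3": 4.0}
balanced = {"t0": 0, "t1": 0, "t2": 1, "t3": 1}  # 6 units on each core
skewed = {"t0": 0, "t1": 0, "t2": 0, "t3": 0}    # 12 units on core 0

print(workload_variance(balanced, task_load, 2))
print(workload_variance(skewed, task_load, 2))
```

In a multi-objective search, a value like this would be only one of several objectives, alongside the interference cost and the deadline margin mentioned in the abstract.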

    Heterogeneous multicore SDRAM interference analysis

    The purpose of this paper is to describe a set of DDR3 SDRAM interference estimation cost functions. The arbitration system of the SDRAM controller heavily impacts the interference analysis. In this work, three arbitration cases are considered, corresponding to the situations where the accessed memory addresses belong to the same block address, to different memory banks, and to different rows. The aim of these functions is to estimate the interference overhead that instructions may suffer when concurrently accessing these three kinds of logical addresses in an SDRAM saturation context. To develop these interference expressions, specific measurement systems, micro-benchmarks and theory on SDRAM controllers have been used.

    Towards an efficient cost function equation for DDR SDRAM interference analysis on heterogeneous MPSoCs

    Real-time applications must finish their execution within an imposed deadline to function correctly. DDR memory interference on multicore platforms can make tasks miss their respective deadlines, leading to critical errors. Bandwidth regulators and SDRAM bank partitioning are examples of techniques used to mitigate or avoid this type of interference. Another possibility is to place tasks and memory optimally on the platform, i.e., task/memory mapping optimization. The algorithms used for finding optimal mapping solutions rely on a cost function that indicates the fitness of a candidate solution. In this work, we propose a DDR SDRAM cost function that estimates the worst-case execution time for a given map and can hence be used in an optimization algorithm. Our cost function considers the DDR memory device operation, the SoC manufacturer's memory controller, the heterogeneity of the platform and the characteristics of the tasks to map. The cost function is evaluated by directly measuring the interference on the heterogeneous MPSoCs Keystone II and Sitara AM5728 by Texas Instruments.

    Data Transfer Optimization for Signal Processing (Tiling, Fusion, and Array Reallocation)


    Buffered Tiling for Sequences of Loop Nests

    Tiling is usually applied to one loop nest at a time. In this paper we apply tiling and fusion simultaneously to a sequence of parallel nested loops in order to minimize data movement and energy consumption and/or to maximize execution speed. Each of these nests uses as input a stencil of data computed in a previous nest. After fusion and tiling, we guarantee that the data needed to execute an iteration have already been computed by previous iterations by delaying the computation of the consumer nest. We take into account the relation among the various stencils, the added delays and the tiling parameters, and we give a solution for a class of tilings. To store only the live data elements, we compute the surface of these data for every array, and during code generation we replace the array with a buffer whose size equals the surface of live data. We measured cache misses for the transformed versions of the example program.
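
The delay and buffer ideas in the abstract above can be sketched on a toy one-dimensional case. This is a simplified illustration, not the authors' transformation: it fuses two 3-point stencil loops, delays the consumer by one iteration, and replaces the intermediate array with a 3-element circular buffer holding only live values. Tiling is left out, and the function names are hypothetical.

```python
def unfused(a):
    """Two separate loop nests: b is a 3-point stencil of a, c of b."""
    n = len(a)
    b = [0] * n
    for i in range(1, n - 1):
        b[i] = a[i - 1] + a[i] + a[i + 1]
    c = [0] * n
    for i in range(2, n - 2):
        c[i] = b[i - 1] + b[i] + b[i + 1]
    return c


def fused_buffered(a):
    """Fused loop: the consumer runs one iteration behind the producer."""
    n = len(a)
    buf = [0] * 3  # circular buffer for b[i-2], b[i-1], b[i]
    c = [0] * n
    for i in range(1, n - 1):
        # Producer iteration i: overwrite the slot of b[i-3], which is dead.
        buf[i % 3] = a[i - 1] + a[i] + a[i + 1]
        # Consumer iteration j = i - 1 needs b[j-1], b[j], b[j+1] = b[i].
        j = i - 1
        if 2 <= j < n - 2:
            c[j] = buf[(j - 1) % 3] + buf[j % 3] + buf[(j + 1) % 3]
    return c


data = list(range(10))
print(fused_buffered(data) == unfused(data))
```

The buffer size of 3 follows from the stencil width and the one-iteration delay in this toy case; the paper derives the buffer size from the surface of live data for general stencils and tilings.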

    Multicore shared memory interference analysis through hardware performance counters

    The aim of this paper is to present a high-precision and event-versatile MBPTA framework that we have developed for the statistical timing analysis of multicore platforms. It allows complex multicore platforms to be studied from the CPU point of view without requiring hardware or software models. This gives an accurate view of the real platform behavior in any specific situation without using extra tools. In addition, the measurement framework is directly portable to other multicore platforms with the same CPU version and easily portable to other CPU versions from the same manufacturer. The MBPTA framework directly uses coprocessors and the Performance Monitor Unit (PMU), i.e., the Performance Monitor Hardware (PMH), instead of software profilers. Hardware performance counters provide low-overhead access to a considerable amount of performance information about numerous elements such as the CPU, caches or bus. The statistical timing analysis consists of proposing average and worst-case models by applying the tool diagXtrm to measurements of task execution times. Measurements obtained from the PMH are used to analyze and quantify the interference that can occur within a multicore platform. The potential of coprocessor and PMU measurements, and of their statistical analysis, is shown using a heterogeneous multicore Texas Instruments system on chip. The interference we focus on is due to the shared memory of this platform.