DReAM: An approach to estimate per-Task DRAM energy in multicore systems
Accurate per-task energy estimation in multicore systems would allow performing per-task energy-aware task scheduling and energy-aware billing in data centers, among other applications. Per-task energy estimation is challenged by the interaction between tasks in shared resources, which impacts tasks’ energy consumption in uncontrolled ways. Some accurate mechanisms have been devised recently to estimate per-task energy consumed on-chip in multicores, but there is a lack of such mechanisms for DRAM memories. This article makes the case for accurate per-task DRAM energy metering in multicores, which opens new paths to energy/performance optimizations. In particular, the contributions of this article are (i) an ideal per-task energy metering model for DRAM memories; (ii) DReAM, an accurate yet low-cost implementation of the ideal model (less than 5% accuracy error when 16 tasks share memory); and (iii) a comparison with standard methods (even distribution and access-count based) proving that DReAM is much more accurate than these other methods.
Peer reviewed. Postprint (author's final draft).
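The two standard methods named in the abstract, even distribution and access-count-based attribution, can be illustrated with a minimal Python sketch. This is not DReAM and not the paper's ideal model; the function names, task names, and figures below are hypothetical and chosen only to show how the two baselines divide a measured DRAM energy total.

```python
def even_split(total_energy_j, tasks):
    """Baseline 1: charge every task the same share of DRAM energy."""
    share = total_energy_j / len(tasks)
    return {task: share for task in tasks}


def access_count_split(total_energy_j, accesses):
    """Baseline 2: charge each task in proportion to its DRAM access count."""
    total_accesses = sum(accesses.values())
    if total_accesses == 0:
        # No accesses recorded: fall back to an even split.
        return even_split(total_energy_j, list(accesses))
    return {
        task: total_energy_j * count / total_accesses
        for task, count in accesses.items()
    }


# Hypothetical access counts for two tasks sharing memory.
accesses = {"task_a": 900, "task_b": 100}
print(even_split(10.0, list(accesses)))
print(access_count_split(10.0, accesses))
```

The sketch makes the limitation of both baselines visible: neither accounts for how tasks interact in the shared memory, which is the effect the abstract says makes per-task estimation hard.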
Speeding-up the execution of credit risk simulations using desktop grid computing: A case study
This paper describes a case study that was undertaken at a leading European investment bank in which desktop grid computing was used to speed up the execution of Monte Carlo credit risk simulations. The credit risk simulations were modelled using commercial-off-the-shelf simulation packages (CSPs). The CSPs did not incorporate built-in support for desktop grids, and therefore the authors implemented a middleware for desktop grid computing, called WinGrid, and interfaced it with the CSP. The performance results show that WinGrid can speed up the execution of CSP-based Monte Carlo simulations. However, since WinGrid was installed on non-dedicated PCs, the speed-up achieved varied according to users’ PC usage. Finally, the paper presents some lessons learnt from this case study. It is expected that this paper will encourage simulation practitioners and CSP vendors to experiment with desktop grid computing technologies with the objective of speeding up simulation experimentation.
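The general idea of farming out a Monte Carlo simulation as independent chunks can be sketched as follows. This is only an illustration under stated assumptions: it does not use WinGrid or any CSP, the loss model (independent loans with a fixed default probability) is invented for the example, and a local thread pool stands in for desktop grid nodes, so it shows the work-splitting pattern rather than a real speed-up.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def simulate_chunk(seed, n_trials, default_prob=0.02, exposure=1000.0, n_loans=50):
    """Run n_trials portfolio scenarios and return the loss of each one.

    Hypothetical model: every loan defaults independently with
    probability default_prob and loses its full exposure.
    """
    rng = random.Random(seed)  # per-chunk seed keeps chunks independent
    losses = []
    for _ in range(n_trials):
        defaults = sum(1 for _ in range(n_loans) if rng.random() < default_prob)
        losses.append(defaults * exposure)
    return losses


def run_distributed(n_chunks, trials_per_chunk, max_workers=4):
    """Split the trials into chunks and hand each chunk to a worker."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        chunks = pool.map(
            simulate_chunk, range(n_chunks), [trials_per_chunk] * n_chunks
        )
        # map() yields results in submission order, so the run is reproducible.
        return [loss for chunk in chunks for loss in chunk]


losses = run_distributed(n_chunks=8, trials_per_chunk=500)
print("scenarios:", len(losses))
print("mean loss:", sum(losses) / len(losses))
```

Because each chunk carries its own seed and needs no data from the others, a slow or busy worker only delays its own chunk, which is consistent with the abstract's observation that speed-up on non-dedicated PCs depends on how much those PCs are being used.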
Late allocation and early release of physical registers
The register file is one of the critical components of current processors in terms of access time and power consumption. Among other things, the potential to exploit instruction-level parallelism is closely related to the size and number of ports of the register file. In conventional register renaming schemes, both register allocation and releasing are conservatively done, the former at the rename stage, before registers are loaded with values, and the latter at the commit stage of the instruction redefining the same register, once registers are not used any more. We introduce VP-LAER, a renaming scheme that allocates registers later and releases them earlier than conventional schemes. Specifically, physical registers are allocated at the end of the execution stage and released as soon as the processor realizes that there will be no further use of them. VP-LAER enhances register utilization, that is, the fraction of allocated registers having a value to be read in the future. Detailed cycle-level simulations show either a significant speedup for a given register file size or a reduction in the register file size for a given performance level, especially for floating-point codes, where the register file pressure is usually high.
Peer reviewed. Postprint (published version).
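The difference in how long a physical register stays occupied under the two policies can be shown with a toy calculation. This is a simplified sketch, not the VP-LAER mechanism: the `RegLifetime` record and the cycle numbers are hypothetical, and "no further use" is idealised as the cycle of the last read.

```python
from dataclasses import dataclass


@dataclass
class RegLifetime:
    rename: int           # cycle the producing instruction is renamed
    exec_end: int         # cycle the value is produced (end of execution)
    last_read: int        # cycle of the last consumer read (idealised)
    redefine_commit: int  # cycle the redefining instruction commits


def conventional_cycles(r):
    """Conventional scheme: hold the register from rename to redefiner commit."""
    return r.redefine_commit - r.rename


def late_early_cycles(r):
    """Late allocation, early release: hold from end of execution to last read."""
    return max(0, r.last_read - r.exec_end)


# Hypothetical lifetimes for two values.
lifetimes = [RegLifetime(0, 5, 8, 20), RegLifetime(1, 4, 6, 15)]
print("conventional:", sum(conventional_cycles(r) for r in lifetimes))
print("late/early:  ", sum(late_early_cycles(r) for r in lifetimes))
```

Fewer occupied cycles per value means the same number of in-flight values fits in a smaller register file, which is the trade-off the abstract reports between speedup and register file size.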