
    Layout Transformations for Heap Objects Using Static Access Patterns

    As the amount of data used by programs grows with hardware storage capacity, efficient memory usage becomes a key factor in performance. Since modern applications make heavy use of heap-allocated structures, this paper concentrates on optimizing those structures using compile-time analyses. To make the optimization procedure entirely static, we propose a novel approach that represents memory access patterns with regular expressions. Repetitive accesses are usually important information for locality optimizations, and regular expressions are expressive enough to denote those repetitive accesses, along with the various access patterns that arise from a program's control flow. By interpreting the statically calculated access patterns, we choose appropriate structures for pool allocation and reorganize the field layouts of the chosen structures. To verify the effect of our static optimizations, we implement our analyses and optimizations on top of a field restructuring scheme. Our experiments with the Olden benchmarks demonstrate that the layout transformation scheme for heap objects improves cache locality by 36% and performance by 31%.
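
The idea of reading hot fields off a regular expression can be illustrated with a minimal sketch. It is not the paper's analysis: the one-letter field encoding (`k` for key, `n` for next, `v` for value), the example pattern, and the helpers `hot_fields` and `matches_trace` are all hypothetical, and the sketch assumes that fields inside a starred group are the repeatedly accessed ones.

```python
import re

# Hypothetical encoding: one letter per field of a list node.
# k = key, n = next, v = value
# Assumed pattern for a loop that reads key and next on every
# iteration and reads value once after the loop.
ACCESS_PATTERN = r"(kn)*v"

def hot_fields(pattern):
    """Return the fields that appear inside a starred (repeated) group."""
    hot = set()
    for group in re.findall(r"\(([^()]*)\)\*", pattern):
        hot.update(ch for ch in group if ch.isalpha())
    return hot

def matches_trace(pattern, trace):
    """Check whether a concrete access trace is described by the pattern."""
    return re.fullmatch(pattern, trace) is not None

hot = hot_fields(ACCESS_PATTERN)   # fields accessed repeatedly
cold = set("knv") - hot            # fields accessed outside any repetition
```

Under these assumptions, a layout pass would place `k` and `n` next to each other and move `v` away from them, since only the first two are touched on every loop iteration.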

    Restructuring field layouts for embedded memory systems

    In many computer systems that process large amounts of data, memory access delay is one of the major performance bottlenecks. In this paper, we propose an enhanced field remapping scheme for dynamically allocated structures that provides better locality than conventional field layouts. Our scheme reduces cache miss rates by aggregating and grouping fields from multiple instances of the same structure, which in turn improves performance and reduces power consumption. Our methodology will become more important in design space exploration, especially as embedded systems for data-oriented applications become prevalent. Experimental results show that L1 and L2 data cache misses are reduced by 23% and 17% on average, respectively. Owing to the improved locality, our remapping achieves 13% faster execution on average than the original programs. It also reduces data cache power consumption by 18%.
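
Why grouping the same field across instances can reduce cache misses is easiest to see in a small address calculation. The sketch below is a generic illustration, not the paper's remapping scheme: the field size, field count, cache line size, and the functions `aos_address`, `grouped_address`, and `lines_touched` are all assumptions chosen for the example.

```python
FIELD_SIZE = 8    # bytes per field (assumed uniform for simplicity)
NUM_FIELDS = 4    # fields per structure (assumed)
LINE_SIZE = 64    # cache line size in bytes (assumed)

def aos_address(i, f):
    """Conventional layout: all fields of instance i are contiguous."""
    return i * NUM_FIELDS * FIELD_SIZE + f * FIELD_SIZE

def grouped_address(i, f, n):
    """Grouped layout: field f of all n instances is contiguous."""
    return f * n * FIELD_SIZE + i * FIELD_SIZE

def lines_touched(addresses):
    """Count the distinct cache lines covered by a set of addresses."""
    return len({a // LINE_SIZE for a in addresses})

n = 64  # number of instances traversed
# Read field 0 of every instance under each layout.
aos = lines_touched(aos_address(i, 0) for i in range(n))
grouped = lines_touched(grouped_address(i, 0, n) for i in range(n))
print(aos, grouped)  # 32 8
```

With these assumed sizes, a traversal that reads one field of each of the 64 instances touches 32 cache lines in the conventional layout and 8 in the grouped one, because each 64-byte line then holds eight useful values instead of two.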