In previous work [TOMS, 2004, 30(3), 266--299], we used Markowitz-like heuristics aiming to find elimination sequences that minimise the number of floating-point operations (flops) for vertex elimination Jacobian code. We also used the depth-first traversal algorithm to reorder the statements of the Jacobian code with the aim of reducing the number of memory accesses. In this work, we study the effects of reducing flops or memory accesses within the vertex elimination algorithm for Jacobian calculation. On RISC processors, we observed that for data residing in registers, the number of flops gives a good estimate of the execution time, while for out-of-register data, the execution time is dominated by the time for memory access operations. We also present a statement reordering scheme based on a greedy-list scheduling algorithm using ranking functions. This statement reordering will enable us to trade-off the exploitation of the instruction level parallelism of such processors with the reduction in memory accesses
To submit an update or takedown request for this paper, please submit an Update/Correction/Removal Request.