Analysis and transformation of legacy code
Hardware evolves faster than software. While a hardware system might need replacement
every one to five years, the average lifespan of a software system is a decade,
with some instances living up to several decades. Inevitably, code outlives the platform
it was developed for and may become legacy: development of the software stops,
but maintenance has to continue to keep up with the evolving ecosystem. No new features
are added, but the software is still used to fulfil its original purpose. Even in the
cases where it is still functional (which discourages its replacement), legacy code is
inefficient, costly to maintain, and a risk to security.
This thesis proposes methods to leverage the expertise invested in the development of
legacy code and to extend its useful lifespan, rather than discarding it. A novel
methodology is proposed for automatically exploiting platform-specific optimisations
when retargeting a program to another platform. The key idea is to leverage the optimisation
information embedded in vector processing intrinsic functions. The performance
of the resulting code is shown to be close to that of manually
retargeted programs, but without the human labour.
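To illustrate why intrinsics carry reusable optimisation information, the following minimal sketch renames a few x86 SSE intrinsics to Arm NEON intrinsics with the same element-wise meaning on four single-precision floats. The platform pair, the mapping table `SSE_TO_NEON` and the function `retarget` are assumptions made for this example only; this is not the methodology of the thesis, and a real retargeting tool has to handle vector types, differing semantics and intrinsics without a one-to-one counterpart.

```python
import re

# Hypothetical, hand-written mapping for illustration only: each pair performs
# the same element-wise operation on a vector of four 32-bit floats.
SSE_TO_NEON = {
    "_mm_add_ps": "vaddq_f32",
    "_mm_sub_ps": "vsubq_f32",
    "_mm_mul_ps": "vmulq_f32",
}

# Match any known intrinsic name as a whole identifier.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, SSE_TO_NEON)) + r")\b")


def retarget(source: str) -> str:
    """Rename known SSE intrinsic calls to NEON names; leave everything else alone."""
    return _PATTERN.sub(lambda match: SSE_TO_NEON[match.group(1)], source)


if __name__ == "__main__":
    line = "acc = _mm_add_ps(acc, _mm_mul_ps(a, b));"
    print(retarget(line))
```

The point of the sketch is that the programmer's decision to vectorise survives the rewrite: the target code is still expressed in vector operations, so the optimisation does not have to be rediscovered.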
Building on top of that, the question of discovering optimisation information when
there are no hints in the form of intrinsics or annotations is investigated. This thesis
postulates that such information can potentially be extracted from profiling the data
flow during executions of the program. A context-aware data dependence profiling
system is described, detailing aspects overlooked in related research. The
system is shown to be essential for going beyond the information that can be inferred statically,
in particular about loop iterators.
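The basic idea of data dependence profiling can be sketched as follows: record which loop iteration last wrote each memory address, and report a read-after-write dependence whenever a later iteration reads that address. The trace format and the names `profile_loop_carried_raw`, `trace_prefix_sum` and `trace_scale` are invented for this example. The sketch tracks only the iteration number, so it is not context-aware in the sense described above and should not be read as the system of the thesis.

```python
def profile_loop_carried_raw(trace):
    """Return loop-carried read-after-write dependences found in a trace.

    `trace` is a list of (iteration, kind, address) events in execution order,
    where kind is "r" (read) or "w" (write). This format is an assumption of
    the sketch. Each result is (writer_iteration, reader_iteration, address).
    """
    last_writer = {}  # address -> iteration that most recently wrote it
    dependences = set()
    for iteration, kind, address in trace:
        if kind == "w":
            last_writer[address] = iteration
        elif address in last_writer and last_writer[address] != iteration:
            # The value read here was produced by a different iteration.
            dependences.add((last_writer[address], iteration, address))
    return dependences


def trace_prefix_sum(n):
    """Events for: for i in 1..n-1: a[i] = a[i] + a[i-1]."""
    events = []
    for i in range(1, n):
        events += [(i, "r", i), (i, "r", i - 1), (i, "w", i)]
    return events


def trace_scale(n):
    """Events for: for i in 0..n-1: a[i] = 2 * a[i]."""
    events = []
    for i in range(n):
        events += [(i, "r", i), (i, "w", i)]
    return events


if __name__ == "__main__":
    print(sorted(profile_loop_carried_raw(trace_prefix_sum(4))))
    print(sorted(profile_loop_carried_raw(trace_scale(4))))
```

In this toy setting the prefix-sum loop shows a dependence from each iteration to the next, while the scaling loop shows none, which is the kind of run-time evidence a profiler can supply when static analysis is inconclusive.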
Loop iterators are the controlling part of a loop. This thesis describes and evaluates
a system for extracting the loop iterators in a program. It is found to significantly
outperform previously known techniques and further increases the amount of information
about the structure of a program that is available to a compiler. Combining this
system with data dependence profiling further improves its results. Loop iterator
recognition enables other code modernisation techniques, such as source code rejuvenation
and commutativity analysis. The former increases the use of idiomatic code and as
a result increases the maintainability of the program. The latter can potentially drive
parallelisation and thus dramatically improve runtime performance.
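Once the iterator of a loop has been separated from the rest of its body, commutativity can be probed by running the body over the iterator values in different orders and comparing the final states. The sketch below does this exhaustively for tiny inputs. The function `iterations_commute` and the two example bodies are hypothetical, and this brute-force dynamic test is an illustration of the concept rather than the analysis used in the thesis.

```python
from itertools import permutations


def iterations_commute(body, initial_state, iterator_values):
    """Check whether every ordering of the iterations gives the same final state.

    `body(state, i)` returns the new state after one iteration with iterator
    value `i`. States must be hashable. The check is exhaustive, so it is only
    practical for a handful of iterator values.
    """
    final_states = set()
    for order in permutations(iterator_values):
        state = initial_state
        for i in order:
            state = body(state, i)
        final_states.add(state)
    return len(final_states) == 1


def accumulate(total, i):
    """Body of a sum reduction: iteration order does not matter."""
    return total + i


def horner(acc, i):
    """Body of a Horner-style evaluation: iteration order matters."""
    return acc * 2 + i


if __name__ == "__main__":
    print(iterations_commute(accumulate, 0, (1, 2, 3)))
    print(iterations_commute(horner, 0, (1, 2, 3)))
```

A loop whose iterations commute, like the sum reduction here, is a candidate for parallel execution, whereas the order-sensitive body is not without further transformation.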