One of the biggest challenges in multicore platforms is shared cache management, especially for data dominant
applications. Two commonly used approaches for increasing shared cache utilization are cache partitioning
and loop tiling. However, state-of-the-art compilers lack efficient cache partitioning and loop tiling
methods, for two reasons. First, cache partitioning and loop tiling are strongly coupled, so addressing
them separately is not effective. Second, cache partitioning and loop tiling must be tailored
to the target shared cache architecture details and the memory characteristics of the co-running workloads.
To the best of our knowledge, this is the first methodology that provides i) a theoretical foundation
for the aforementioned cache management mechanisms and ii) a unified framework to orchestrate these
two mechanisms in tandem (not separately). Our approach reduces the number of main memory
accesses by an order of magnitude while keeping the number of arithmetic/addressing instructions
at a minimal level. We motivate this work by showing that cache partitioning, loop tiling, data
array layouts, shared cache architecture details (i.e., cache size and associativity), and the memory reuse
patterns of the executing tasks must be addressed together as one problem when a (near-)optimal solution
is sought. To this end, we present a search space exploration analysis in which our proposal achieves
a vast reduction in the required search space.
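To make the coupling concrete, the following sketch shows one common way loop tiling is tailored to the cache size, using matrix multiplication as the example kernel. This is an illustration under stated assumptions, not the paper's actual method: the cache capacity, the element size, and the rule "three tiles must fit in cache" are all hypothetical choices made for the example.

```python
# Illustrative sketch (not the paper's method): loop tiling for matrix
# multiplication, with the tile size derived from an assumed cache capacity.
CACHE_BYTES = 32 * 1024      # hypothetical L1 data cache size (assumption)
ELEM_BYTES = 8               # double-precision element (assumption)

def pick_tile(cache_bytes=CACHE_BYTES, elem_bytes=ELEM_BYTES):
    """Largest T such that three T x T tiles (one each of A, B, C)
    fit in the cache at once: 3 * T * T * elem_bytes <= cache_bytes."""
    return int((cache_bytes / (3 * elem_bytes)) ** 0.5)

def matmul_tiled(A, B, T):
    """C = A @ B computed tile by tile, so each tile's working set
    stays cache-resident while it is reused."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, T):
        for kk in range(0, n, T):
            for jj in range(0, n, T):
                # innermost loops operate on a single T x T tile
                for i in range(ii, min(ii + T, n)):
                    for k in range(kk, min(kk + T, n)):
                        a = A[i][k]
                        for j in range(jj, min(jj + T, n)):
                            C[i][j] += a * B[k][j]
    return C
```

Note how the tile size depends directly on the cache parameters: change the cache size (or, in a partitioned cache, the partition assigned to this task) and the valid tile sizes change with it, which is exactly why tiling and partitioning cannot be decided in isolation.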