4 research outputs found
Modulo scheduling with reduced register pressure
Software pipelining is a scheduling technique that is used by some product compilers in order to expose more instruction level parallelism out of innermost loops. Modulo scheduling refers to a class of algorithms for software pipelining. Most previous research on modulo scheduling has focused on reducing the number of cycles between the initiation of consecutive iterations (which is termed II) but has not considered the effect of the register pressure of the produced schedules. The register pressure increases as the instruction level parallelism increases. When the register requirements of a schedule are higher than the available number of registers, the loop must be rescheduled, perhaps with a higher II. Therefore, the register pressure has an important impact on the performance of a schedule. This paper presents a novel heuristic modulo scheduling strategy that tries to generate schedules with the lowest II and, from all the possible schedules with such II, tries to select the one with the lowest register requirements. The proposed method has been implemented in an experimental compiler and has been tested on the Perfect Club benchmarks. The results show that the proposed method achieves an optimal II for at least 97.5 percent of the loops and that its compilation time is comparable to a conventional top-down approach, whereas the register requirements are lower. In addition, the proposed method is compared with some other existing methods. The results indicate that the proposed method performs better than other heuristic methods and almost as well as linear programming methods, which obtain optimal solutions but are impractical for product compilers because their computing cost grows exponentially with the number of operations in the loop body. Peer reviewed. Postprint (published version).
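The link between II and register pressure described in this abstract can be illustrated with a minimal sketch. It is not the paper's heuristic: it only counts how many values are live at once in the steady state of a modulo schedule, a measure often called MaxLive. The function name `max_live` and the example lifetimes are assumptions made for illustration.

```python
def max_live(lifetimes, ii):
    """Estimate the register requirement of a modulo schedule.

    lifetimes: list of (start, end) cycles for each loop-carried value,
               where the value is live in the half-open range [start, end).
               (Hypothetical input format, chosen for this sketch.)
    ii:        initiation interval, the number of cycles between the
               start of consecutive iterations.
    """
    # In steady state a new iteration starts every `ii` cycles, so a value
    # live at absolute cycle c also occupies kernel slot c % ii.
    pressure = [0] * ii
    for start, end in lifetimes:
        for cycle in range(start, end):
            pressure[cycle % ii] += 1
    return max(pressure)


# Made-up lifetimes for three values of one iteration.
example = [(0, 3), (1, 5), (2, 4)]
print(max_live(example, 2))  # tighter II, more overlapping iterations
print(max_live(example, 4))  # larger II, fewer simultaneously live values
```

For the same lifetimes, the smaller II yields the larger count, which is why a schedule that exceeds the available registers may have to be rescheduled with a higher II.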
Fine grain software pipelining of non-vectorizable nested loops
This paper presents a new technique to parallelize nested loops at the statement level. It transforms sequential nested loops, either vectorizable or not, into parallel ones. Previously, the wavefront method was used to parallelize non-vectorizable nested loops. However, in order to reduce the complexity of parallelization, the wavefront method regards an iteration as an unbreakable scheduling unit and draws parallelism through iteration overlapping. Our technique takes a statement rather than an iteration as the scheduling unit and exploits parallelism by overlapping the statements in all dimensions. In this paper, we show how this finer grain parallelization can be achieved with reasonable computational complexity, and we demonstrate the effectiveness of the resulting method in exploiting parallelism.
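For context, the wavefront method that this abstract contrasts itself with can be sketched as follows. This is not the paper's statement-level technique. The sketch assumes a hypothetical two-deep loop nest in which iteration (i, j) depends on (i-1, j) and (i, j-1), so that all iterations on the same anti-diagonal are independent and each whole iteration is treated as one scheduling unit.

```python
def wavefront_order(n, m):
    """Group the iterations (i, j) of an n x m loop nest by anti-diagonal i + j."""
    fronts = []
    for d in range(n + m - 1):
        lo = max(0, d - m + 1)
        hi = min(n, d + 1)
        fronts.append([(i, d - i) for i in range(lo, hi)])
    return fronts


def sequential(n, m):
    """Reference: the original sequential loop nest (made-up recurrence)."""
    a = [[1] * m for _ in range(n)]
    for i in range(1, n):
        for j in range(1, m):
            a[i][j] = a[i - 1][j] + a[i][j - 1]
    return a


def by_wavefront(n, m):
    """Same computation, visiting iterations one wavefront at a time."""
    a = [[1] * m for _ in range(n)]
    for front in wavefront_order(n, m):
        # Iterations within one front do not depend on each other,
        # so they could run in parallel; here they run in a plain loop.
        for i, j in front:
            if i >= 1 and j >= 1:
                a[i][j] = a[i - 1][j] + a[i][j - 1]
    return a


print(wavefront_order(2, 3))
print(by_wavefront(4, 5) == sequential(4, 5))
```

Because each front is scheduled as a set of whole iterations, any parallelism between individual statements of different iterations is left unused, which is the limitation the abstract points to.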
Software Pipelining in the LLVM Compiler
This thesis discusses the design and implementation of software pipelining, a loop optimization technique that tries to fully exploit instruction-level parallelism. It does so by scheduling instructions so that the iterations of a loop overlap and can be executed in a pipelined fashion, which speeds up the resulting program. The thesis gives a detailed description of the design and implementation of the Swing Modulo Scheduling algorithm, an effective and efficient method for finding near-optimal schedules for software-pipelined loops. The work was done as part of a larger project, the development of the Codasip Framework. Part of this framework is a retargetable C compiler built on the LLVM compiler architecture, in which this work is implemented.