HPC machines are introducing more and more heterogeneity in their
architecture on the road to exascale systems. The increasing complexity of
the machines due to the variety of hardware architectures and accelerators
makes efficient programming harder than ever. Heterogeneous parallel
programming models, such as OmpSs@FPGA, help the programmer
handle the most unfriendly parts of working with accelerators.
This master's thesis analyzes the OmpSs@FPGA communication system
and proposes a set of techniques to overcome the problems related to it
and potentially improve the performance of the applications.
The results show that the proposed techniques speed up the applications
under certain conditions and, most importantly, solve some of the
limitations of the previous communication system. In particular, the
new techniques improve the exploitation of fine-grained parallelism
and open the door to explore new possibilities with regard to data communication
and re-use.
Moreover, a tool (autoVivado) that automatically manages the process
of bitstream generation, from the synthesis of the HLS code to the generation
of the device-tree, has been developed as part of this master's thesis.
autoVivado has been fully integrated with the OmpSs@FPGA compiler infrastructure,
providing programmers with a way to transparently generate
parallel heterogeneous programs and bitstreams from OmpSs applications
that use FPGA accelerators.