Measuring Thread Timing to Assess the Feasibility of Early-bird Message Delivery
Early-bird communication is a communication/computation overlap technique
that combines fine-grained communication with partitioned communication to
improve application run-time. Communication is divided among the compute
threads such that each individual thread can initiate transmission of its
portion of the data as soon as it is complete rather than waiting for all of
the threads. However, the benefit of early-bird communication depends on the
completion timing of the individual threads. In this paper, we measure and
evaluate the potential overlap, i.e., the idle time each thread experiences between
finishing its computation and the final thread finishing. These measurements
help us understand whether a given application could benefit from early-bird
communication. We present our technique for gathering this data and evaluate
data collected from three proxy applications: MiniFE, MiniMD, and MiniQMC. To
characterize the behavior of these workloads, we study the thread timings at
both a macro level, i.e., across all threads and all runs of an application,
and a micro level, i.e., within a single process of a single run. We observe
that these applications exhibit significantly different behavior. While MiniFE
and MiniQMC appear to be well-suited for early-bird communication because of
their wider thread distribution and more frequent laggard threads, the behavior
of MiniMD may limit its ability to leverage early-bird communication.
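
The potential overlap defined in the abstract (the idle time between a thread finishing its computation and the final thread finishing) can be illustrated with a minimal sketch. The function name `potential_overlap` and the completion times below are hypothetical, chosen only for illustration. This is not the paper's measurement technique and the numbers are not its data.

```python
from statistics import mean


def potential_overlap(finish_times):
    """Return each thread's idle time: the gap between its own
    completion and the completion of the last (laggard) thread."""
    last = max(finish_times)
    return [last - t for t in finish_times]


# Hypothetical completion times (seconds) for four threads in one
# parallel region; illustrative values, not measured data.
finish_times = [1.0, 1.5, 1.25, 2.0]

idle = potential_overlap(finish_times)
print(idle)        # [1.0, 0.5, 0.75, 0.0]
print(mean(idle))  # 0.5625
```

Under this definition the laggard thread always has zero potential overlap, and a wider spread of completion times yields larger idle windows that early-bird communication could fill.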