Methods developed at the Texas Advanced Computing Center (TACC) are described
and demonstrated for automating the construction of an elastic, virtual cluster
emulating the Stampede2 high performance computing (HPC) system. The cluster
can be built and/or scaled in a matter of minutes on the Jetstream self-service
cloud system and shares many properties of the original Stampede2, including:
i) common identity management, ii) access to the same file systems, iii)
equivalent software application stack and module system, iv) similar job
scheduling interface via Slurm.
We measure time-to-solution for a number of common scientific applications on
our virtual cluster against equivalent runs on Stampede2 and develop an
application profile where performance is similar or otherwise acceptable. For
such applications, the virtual cluster provides an effective form of "cloud
bursting" with the potential to significantly improve overall turnaround time,
particularly when Stampede2 is experiencing long queue wait times. In addition,
the virtual cluster can be used for test and debug without directly impacting
Stampede2. We conclude with a discussion of how science gateways can leverage
the TACC Jobs API web service to incorporate this cloud bursting technique
transparently to the end user.Comment: 6 pages, 0 figures, PEARC '18: Practice and Experience in Advanced
Research Computing, July 22--26, 2018, Pittsburgh, PA, US