The latest generation of radio astronomy interferometers will conduct all-sky
surveys with data products consisting of petabytes of spectral line data.
Traditional approaches to identifying and parameterising the astrophysical
sources within this data will not scale to datasets of this magnitude, since
the performance of workstations will not keep up with the real-time generation
of data. For this reason, it is necessary to employ high-performance computing
systems consisting of a large number of processors connected by a
high-bandwidth network. In order to make use of such supercomputers, substantial
modifications must be made to serial source finding code. To ease the
transition, this work presents the Scalable Source Finder Framework (SSoFF), a
framework providing storage access, network communication and data
composition functionality, which can support a wide range of source finding
algorithms provided they can be applied to subsets of the entire image.
Additionally, the Parallel Gaussian Source Finder (PGSF) was implemented using
SSoFF, utilising Gaussian filters, thresholding, and local statistics. PGSF was able
to search a 256 GB simulated dataset in under 24 minutes, significantly less
than the 8 to 12 hour observation that would generate such a dataset.

Comment: 15 pages, 6 figures
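
The approach summarised above (Gaussian filtering, thresholding against local statistics, applied independently to subsets of the image) can be illustrated with a minimal sketch. This is not the SSoFF or PGSF code: the function names, the tile and halo sizes, and the use of the median and the median absolute deviation as the "local statistics" are all assumptions made for illustration. The serial loop over tiles stands in for the distribution of subsets across processors.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalised 1D Gaussian kernel truncated at 3 sigma (assumed cut-off)."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def smooth(image, sigma):
    """Separable Gaussian filter: convolve along rows, then along columns."""
    k = gaussian_kernel(sigma)
    rows = np.apply_along_axis(np.convolve, 1, image, k, mode="same")
    return np.apply_along_axis(np.convolve, 0, rows, k, mode="same")

def find_sources_in_tile(tile, sigma=1.5, nsigma=5.0):
    """Threshold one subset using statistics computed from that subset only."""
    smoothed = smooth(tile, sigma)
    med = np.median(smoothed)
    # 1.4826 * MAD estimates the standard deviation for Gaussian noise.
    noise = 1.4826 * np.median(np.abs(smoothed - med))
    return smoothed > med + nsigma * noise

def find_sources(image, tile=32, halo=8, **kwargs):
    """Process the image tile by tile; each tile is padded with a halo so
    that filtering near tile edges sees the neighbouring pixels."""
    ny, nx = image.shape
    mask = np.zeros(image.shape, dtype=bool)
    for y0 in range(0, ny, tile):
        for x0 in range(0, nx, tile):
            y1, x1 = min(y0 + tile, ny), min(x0 + tile, nx)
            ya, xa = max(y0 - halo, 0), max(x0 - halo, 0)
            yb, xb = min(y1 + halo, ny), min(x1 + halo, nx)
            sub = find_sources_in_tile(image[ya:yb, xa:xb], **kwargs)
            # Keep only the tile interior; the halo is discarded.
            mask[y0:y1, x0:x1] = sub[y0 - ya:y1 - ya, x0 - xa:x1 - xa]
    return mask
```

Because each tile depends only on its own pixels plus a fixed halo, the tiles could in principle be assigned to separate processes and the resulting masks composed afterwards, which is the property the framework requires of a source finding algorithm.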