On the State and Importance of Reproducible Experimental Research in Parallel Computing
Computer science is also an experimental science. This is particularly the
case for parallel computing, which is in a total state of flux, and where
experiments are necessary to substantiate, complement, and challenge
theoretical modeling and analysis. Here, experimental work is as important as
advances in theory, which are indeed often driven by the experimental
findings. In parallel computing, scientific contributions presented in research
articles are therefore often based on experimental data, with a substantial
part devoted to presenting and discussing the experimental findings. As in all
of experimental science, experiments must be presented in a way that makes
reproduction by other researchers possible, in principle. Despite appearances to
the contrary, we contend that reproducibility plays a small role and is
typically not achieved. Articles often lack a
sufficiently detailed description of their experiments and do not make
available the software used to obtain the claimed results. As a consequence,
parallel computational results are most often impossible to reproduce, often
questionable, and therefore of little or no scientific value. We believe that
the description of how to reproduce findings should play an important part in
every serious, experiment-based parallel computing research article. We aim to
initiate a discussion of the reproducibility issue in parallel computing, and
elaborate on the importance of reproducible research for (1) better and sounder
technical/scientific papers, (2) a sounder and more efficient review process
and (3) more effective collective work. This paper expresses our current view
on the subject and should be read as a position statement for discussion and
future work. We do not consider the related (but no less important) issue of
the quality of the experimental design.
A Survey on Reproducibility in Parallel Computing
We summarize the results of a survey on reproducibility in parallel
computing, which was conducted during the Euro-Par conference in August 2015.
The survey form was handed out to all participants of the conference and the
workshops. The questionnaire, which specifically targeted the parallel
computing community, contained questions in four different categories: general
questions on reproducibility, the current state of reproducibility, the
reproducibility of the participants' own papers, and questions about the
participants' familiarity with tools, software, or open-source software
licenses used for reproducible research.
Comment: 15 pages, 24 figures
Toward Enabling Reproducibility for Data-Intensive Research using the Whole Tale Platform
Whole Tale (http://wholetale.org) is a web-based, open-source platform for
reproducible research supporting the creation, sharing, execution, and
verification of "Tales" for the scientific research community. Tales are
executable research objects that capture the code, data, and environment along
with narrative and workflow information needed to re-create computational
results from scientific studies. Creating reproducible research objects that
enable reproducibility, transparency, and re-execution for computational
experiments requiring significant compute resources or utilizing massive data
is an especially challenging open problem. We describe opportunities,
challenges, and solutions for facilitating reproducibility in data- and
compute-intensive research, which we call "Tales at Scale," using the Whole Tale
computing platform. We highlight challenges and solutions in frontend
responsiveness needs, gaps in current middleware design and implementation,
network restrictions, containerization, and data access. Finally, we discuss
challenges in packaging computational experiment implementations for portable
data-intensive Tales and outline future work.