7 research outputs found
A Computational Science Agenda for Programming Language Research
Scientific models are often expressed as large and complicated programs. These programs embody numerous assumptions made by the developer (e.g., for differential equations, the discretization strategy and resolution). The complexity and pervasiveness of these assumptions mean that often the only true description of the model is the software itself. This has led various researchers to call for scientists to publish their source code along with their papers. We argue that this is unlikely to be beneficial since it is almost impossible to separate implementation assumptions from the original scientific intent. Instead we advocate higher-level abstractions in programming languages, coupled with lightweight verification techniques such as specification and type systems. In this position paper, we suggest several novel techniques and outline an evolutionary approach to applying these to existing and future models. One-dimensional heat flow is used as an example throughout.
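To illustrate the kind of implementation assumption this abstract refers to, here is a minimal sketch of one-dimensional heat flow solved with an explicit finite-difference scheme. It is not taken from the paper: the names `heat_step` and `simulate`, the fixed-temperature boundaries, and the choice of scheme are all assumptions made for illustration.

```python
def heat_step(u, alpha, dx, dt):
    """Advance the temperatures u by one explicit (forward-time, centred-space) step.

    Assumption: the two end points are held fixed (Dirichlet boundaries).
    That choice, like dx and dt, belongs to the implementation rather than
    to the physical model.
    """
    r = alpha * dt / dx**2
    if r > 0.5:
        # The explicit scheme is only stable for r <= 0.5.
        raise ValueError("unstable: need alpha*dt/dx**2 <= 0.5")
    new = list(u)
    for i in range(1, len(u) - 1):
        new[i] = u[i] + r * (u[i - 1] - 2 * u[i] + u[i + 1])
    return new


def simulate(u0, alpha, dx, dt, steps):
    """Apply heat_step repeatedly, starting from the initial profile u0."""
    u = list(u0)
    for _ in range(steps):
        u = heat_step(u, alpha, dx, dt)
    return u
```

The grid spacing, time step, boundary treatment, and stability check all live in the code and none of them appear in the heat equation itself, which is the separation problem the abstract describes.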
Open-source development experiences in scientific software: the HANDE quantum Monte Carlo project
The HANDE quantum Monte Carlo project offers accessible stochastic algorithms
for general use for scientists in the field of quantum chemistry. HANDE is an
ambitious and general high-performance code developed by a
geographically-dispersed team with a variety of backgrounds in computational
science. In the course of preparing a public, open-source release, we have
taken this opportunity to step back and look at what we have done and what we
hope to do in the future. We pay particular attention to development processes,
the approach taken to train students joining the project, and how a flat
hierarchical structure aids communication.
Comment: 6 pages. Submission to WSSSPE
Learning units-of-measure from scientific code
CamFort is our multi-purpose tool for lightweight analysis and verification of scientific Fortran code. One core feature provides units-of-measure verification (dimensional analysis) of programs, where users partially annotate programs with units-of-measure from which our tool checks consistency and infers any missing specifications. However, many users find it onerous to provide units-of-measure information for existing code, even in part. We have noted, however, that there are often many common patterns and clues about the intended units-of-measure contained within variable names, comments, and surrounding code context. In this work-in-progress paper, we describe how we are adapting our approach, leveraging machine-learning techniques to reconstruct units-of-measure information automatically, thus saving programmer effort and increasing the likelihood of adoption.
Computational science: shifting the focus from tools to models
Computational techniques have revolutionized many aspects of scientific research over the last few decades. Experimentalists use computation for data analysis, processing ever bigger data sets. Theoreticians compute predictions from ever more complex models. However, traditional articles do not permit the publication of big data sets or complex models. As a consequence, these crucial pieces of information no longer enter the scientific record. Moreover, they have become prisoners of scientific software: many models exist only as software implementations, and the data are often stored in proprietary formats defined by the software. In this article, I argue that this emphasis on software tools over models and data is detrimental to science in the long term, and I propose a means by which this can be reversed.
Verifying Spatial Properties of Array Computations
Array computations are at the core of numerical modelling and computational science applications. However, low-level manipulation of array indices is a source of program error. Many practitioners are aware of the need to ensure program correctness, yet very few of the techniques from the programming research community are applied by scientists. We aim to change that by providing targeted lightweight verification techniques for scientific code. We focus on the all too common mistake of array offset errors as a generalisation of off-by-one errors. Firstly, we report on a code analysis study on eleven real-world computational science code bases, identifying common idioms of array usage and their spatial properties. This provides much needed data on array programming idioms common in scientific code. From this data, we designed a lightweight declarative specification language capturing the majority of array access patterns via a small set of combinators. We detail a semantic model, and the design and implementation of a verification tool for our specification language, which both checks and infers specifications. We evaluate our tool on our corpus of scientific code. Using the inference mode, we found roughly 87,000 targets for specification across roughly 1.1 million lines of code, showing that the vast majority of array computations read from arrays in a pattern with a simple, regular, static shape. We also studied the commit logs of one of our corpus packages, finding past bug fixes for which our specification system distinguishes the change and thus could have been applied to detect such bugs. This work was supported by the EPSRC [grant number EP/M026124/1].
Verifying Spatial Properties of Array Computations
Array computations are at the core of numerical modelling and computational science applications. However, low-level manipulation of array indices is a source of program error. Many practitioners are aware of the need to ensure program correctness, yet very few of the techniques from the programming research community are applied by scientists. We aim to change that by providing targeted lightweight verification techniques for scientific code. We focus on the all too common mistake of array offset errors as a generalisation of off-by-one errors. Firstly, we report on a code analysis study on eleven real-world computational science code bases, identifying common idioms of array usage and their spatial properties. This provides much needed data on array programming idioms common in scientific code. From this data, we designed a lightweight declarative specification language capturing the majority of array access patterns via a small set of combinators. We detail a semantic model, and the design and implementation of a verification tool for our specification language, which both checks and infers specifications. We evaluate our tool on our corpus of scientific code and give verification case studies of bug fixes that are detected by our approach. We found roughly 80,000 targets for specification across roughly 1.4 million lines of code, showing that the vast majority of array computations read from arrays in a pattern with a simple, regular, static shape.
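The idea of checking array accesses against a declared access pattern can be sketched as follows. This is not the paper's specification language or its tool: the `centered` helper, the `check_offsets` function, and the representation of a loop body as a list of relative offsets are hypothetical, chosen only to show how an offset error might be flagged.

```python
def centered(depth):
    """Hypothetical spec: every offset from -depth to +depth around index i."""
    return set(range(-depth, depth + 1))


def check_offsets(accessed, spec):
    """Compare the relative offsets a loop body reads with a declared spec.

    Returns (missing, unexpected): offsets the spec requires but the code
    never reads, and offsets the code reads that the spec does not allow.
    """
    accessed = set(accessed)
    return sorted(spec - accessed), sorted(accessed - spec)


# Intended three-point stencil: a[i-1], a[i], a[i+1].
ok = check_offsets([-1, 0, 1], centered(1))

# Offset error: the code reads a[i+2] where a[i+1] was intended.
bad = check_offsets([-1, 0, 2], centered(1))
```

Here `ok` reports no discrepancies, while `bad` reports offset 1 as missing and offset 2 as unexpected, which is the kind of generalised off-by-one mistake the abstract targets.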
On Comparative Algorithmic Pathfinding in Complex Networks for Resource-Constrained Software Agents
Software engineering projects that utilize inappropriate pathfinding algorithms carry a
significant risk of poor runtime performance for customers. Using social network theory,
this experimental study examined the impact of algorithms, frameworks, and map
complexity on elapsed time and computer memory consumption. The 1,800 2D map
samples utilized were randomly generated by computer, and data were collected and processed
using Python language scripts. Memory consumption and elapsed time results for each of
the 12 experimental treatment groups were compared using factorial MANOVA to
determine the impact of the 3 independent variables on elapsed time and computer
memory consumption. The MANOVA indicated a significant factor interaction between
algorithms, frameworks, and map complexity upon elapsed time and memory
consumption, F(4, 3576) = 94.09, p < .001, η² = .095. The main effects of algorithms,
F(4, 3576) = 885.68, p < .001, η² = .498; and frameworks, F(2, 1787) = 720,360.01, p
< .001, η² = .999; and map complexity, F(2, 1787) = 112,736.40, p < .001, η² = .992, were
also all significant. This study may contribute to positive social change by providing
software engineers writing software for complex networks, such as analyzing terrorist
social networks, with empirical pathfinding algorithm results. This is crucial to enabling
selection of appropriately fast, memory-efficient algorithms that help analysts identify
and apprehend criminal and terrorist suspects in complex networks before the next attack.
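The two dependent variables in this study, elapsed time and memory consumption, can be measured in Python roughly as sketched below. The abstract does not name the algorithms or frameworks that were compared, so breadth-first search on a random grid is used here purely as a stand-in, and `random_grid`, `bfs_path_length`, and `measure` are hypothetical names.

```python
import random
import time
import tracemalloc
from collections import deque


def random_grid(size, wall_prob, seed):
    """Build a size x size 2D map; True marks a wall. Start and goal stay open."""
    rng = random.Random(seed)
    grid = [[rng.random() < wall_prob for _ in range(size)] for _ in range(size)]
    grid[0][0] = grid[size - 1][size - 1] = False
    return grid


def bfs_path_length(grid):
    """Shortest 4-connected path length from top-left to bottom-right, or None."""
    size = len(grid)
    goal = (size - 1, size - 1)
    dist = {(0, 0): 0}
    queue = deque([(0, 0)])
    while queue:
        r, c = queue.popleft()
        if (r, c) == goal:
            return dist[(r, c)]
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < size and 0 <= nc < size
            if inside and not grid[nr][nc] and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return None


def measure(grid):
    """Return (path length, elapsed seconds, peak traced bytes) for one run."""
    tracemalloc.start()
    start = time.perf_counter()
    length = bfs_path_length(grid)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return length, elapsed, peak
```

Calling `measure(random_grid(50, 0.2, seed=1))` once per map and per algorithm would give the kind of per-sample records that a factorial analysis could then compare. Note that `tracemalloc` reports only memory allocated by Python, which may differ from how the study measured memory.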