
    Simulation modelling and visualisation: toolkits for building artificial worlds

    Simulation users at all levels make heavy use of compute resources to drive computational simulations across widely varying areas of research, using different simulation paradigms. Simulations are implemented in many software forms, ranging from highly standardised and general models that run in proprietary software packages to ad hoc, hand-crafted simulation codes for very specific applications. Visualisation of the workings or results of a simulation is another highly valuable capability for simulation developers and practitioners. There are many different software libraries and methods available for creating a visualisation layer for simulations, and it is often a difficult and time-consuming process to assemble a toolkit of these libraries and other resources that best suits a particular simulation model. We present here a breakdown of the main simulation paradigms, and discuss the differing toolkits and approaches that researchers have taken to tackle coupled simulation and visualisation in each paradigm.
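
    As a loose illustration of this coupling, the sketch below pairs a toy particle simulation with a visualisation layer; the random-walk model, the choice of matplotlib and all names are assumptions made for illustration, not code from the paper.

        import numpy as np
        import matplotlib.pyplot as plt

        def step(positions, dt=0.1):
            """Advance a toy particle simulation by one step (random walk)."""
            return positions + dt * np.random.randn(*positions.shape)

        def run(n_particles=200, n_steps=100):
            positions = np.random.rand(n_particles, 2)
            fig, ax = plt.subplots()
            ax.set_xlim(-2, 3); ax.set_ylim(-2, 3)
            scatter = ax.scatter(positions[:, 0], positions[:, 1], s=5)
            for _ in range(n_steps):
                positions = step(positions)        # simulation layer
                scatter.set_offsets(positions)     # visualisation layer
                plt.pause(0.01)                    # let the GUI event loop redraw
            plt.show()

        if __name__ == "__main__":
            run()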

    Faster inference from state space models via GPU computing

    Funding: C.F.-J. is funded via a doctoral scholarship from the University of St Andrews, School of Mathematics and Statistics. Inexpensive Graphics Processing Units (GPUs) offer the potential to greatly speed up computation by employing their massively parallel architecture to perform arithmetic operations more efficiently. Population dynamics models are important tools in ecology and conservation. Modern Bayesian approaches allow biologically realistic models to be constructed and fitted to multiple data sources in an integrated modelling framework based on a class of statistical models called state space models. However, model fitting is often slow, requiring hours to weeks of computation. We demonstrate the benefits of GPU computing using a model for the population dynamics of British grey seals, fitted with a particle Markov chain Monte Carlo algorithm. Speed-ups of two orders of magnitude were obtained for estimation of the log-likelihood, compared to a traditional ‘CPU-only’ implementation, allowing an accurate method of inference to be used where it was previously too computationally expensive to be viable. GPU computing has enormous potential, but one barrier to further adoption is a steep learning curve, due to GPUs' unique hardware architecture. We provide a detailed description of the hardware and software setup, and our case study provides a template for other similar applications. We also provide a detailed tutorial-style description of GPU hardware architectures, and examples of important GPU-specific programming practices.
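
    The computation being accelerated here, the particle filter estimate of the log-likelihood, is naturally parallel across particles. The sketch below shows that structure for a hypothetical AR(1) state space model in vectorised NumPy; it is an illustrative stand-in, not the seal model or the GPU code from the paper.

        import numpy as np

        def particle_filter_loglik(y, n_particles=10_000, phi=0.95, sigma=1.0, tau=0.5, rng=None):
            """Bootstrap particle filter log-likelihood for a toy AR(1) model:
            x_t = phi*x_{t-1} + sigma*eps_t,  y_t ~ N(x_t, tau^2).
            Every particle is updated by the same vectorised operations -- the
            'one thread per particle' pattern a GPU implementation exploits."""
            rng = np.random.default_rng() if rng is None else rng
            x = rng.normal(0.0, sigma / np.sqrt(1.0 - phi**2), n_particles)  # stationary start
            loglik = 0.0
            for yt in y:
                x = phi * x + sigma * rng.normal(size=n_particles)           # propagate
                logw = -0.5 * ((yt - x) / tau) ** 2 - np.log(tau * np.sqrt(2.0 * np.pi))
                m = logw.max()
                w = np.exp(logw - m)
                loglik += m + np.log(w.mean())                               # log p(y_t | y_1:t-1)
                x = x[rng.choice(n_particles, n_particles, p=w / w.sum())]   # resample
            return loglik

        print(particle_filter_loglik(np.random.default_rng(0).normal(size=50)))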

    VERCE delivers a productive e-Science environment for seismology research

    The VERCE project has pioneered an e-Infrastructure to support researchers using established simulation codes on high-performance computers in conjunction with multiple sources of observational data. This is accessed and organised via the VERCE science gateway, which makes it convenient for seismologists to use these resources from any location via the Internet. Their data handling is made flexible and scalable by two Python libraries, ObsPy and dispel4py, and by data services delivered by ORFEUS and EUDAT. Provenance-driven tools enable rapid exploration of results and of the relationships between data, which accelerates understanding and method improvement. These powerful facilities are integrated and draw on many other e-Infrastructures. This paper presents the motivation for building such systems, reviews how solid-Earth scientists can make significant research progress using them, and explains the architecture and mechanisms that make their construction and operation achievable. We conclude with a summary of the achievements to date and identify the crucial steps needed to extend the capabilities for seismologists, for solid-Earth scientists and for similar disciplines. Comment: 14 pages, 3 figures. Pre-publication version of the paper accepted and published at the IEEE eScience 2015 conference in Munich, with substantial additions, particularly in the analysis of issues.
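
    As a small illustration of the data-handling layer, the following ObsPy sketch retrieves waveforms from the ORFEUS FDSN web service and applies a basic processing chain; the station codes, time window and filter settings are assumptions for illustration, not details from the paper.

        from obspy import UTCDateTime
        from obspy.clients.fdsn import Client

        # Fetch one hour of broadband data from ORFEUS and pre-process it.
        client = Client("ORFEUS")
        t0 = UTCDateTime("2015-01-01T00:00:00")
        st = client.get_waveforms(network="NL", station="HGN", location="*",
                                  channel="BHZ", starttime=t0, endtime=t0 + 3600)
        st.detrend("demean")                              # remove the mean offset
        st.filter("bandpass", freqmin=0.05, freqmax=1.0)  # illustrative band
        print(st)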

    Optimal use of computing equipment in an automated industrial inspection context

    This thesis deals with automatic defect detection. The objective was to develop the techniques required by a small manufacturing business to make cost-efficient use of inspection technology. In our work on inspection techniques we discuss image acquisition and the choice between custom and general-purpose processing hardware. We examine the classes of general-purpose computer available and study popular operating systems in detail. We highlight the advantages of a hybrid system interconnected via a local area network and develop a sophisticated suite of image-processing software based on it. We quantitatively study the performance of elements of the TCP/IP networking protocol suite and comment on appropriate protocol selection for parallel distributed applications. We implement our own distributed application based on these findings. In our work on inspection algorithms we investigate the potential uses of iterated function series and Fourier transform operators when preprocessing images of defects in aluminium plate acquired using a linescan camera. We employ a multi-layer perceptron neural network trained by backpropagation as a classifier. We examine the effect on the training process of the number of nodes in the hidden layer, and the ability of the network to identify faults in images of aluminium plate. We investigate techniques for introducing positional independence into the network's behaviour. We analyse the pattern of weights induced in the network after training in order to gain insight into the logic of its internal representation. We conclude that the backpropagation training process is so computationally intensive as to present a real barrier to further development of practical neural network techniques, and we seek ways to achieve a speed-up. We consider the training process as a search problem and arrive at a process involving multiple, parallel search "vectors" and aspects of genetic algorithms. We implement the system as the aforementioned distributed application and comment on its performance.
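
    The classifier at the heart of this pipeline, a multi-layer perceptron trained by backpropagation, can be sketched in a few lines; the toy problem, layer size and learning rate below are illustrative assumptions, not the thesis's defect-detection network.

        import numpy as np

        # One-hidden-layer perceptron trained by backpropagation on XOR.
        rng = np.random.default_rng(0)
        X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
        y = np.array([[0], [1], [1], [0]], dtype=float)

        n_hidden, lr = 4, 0.5              # hidden-layer size is the knob studied above
        W1 = rng.normal(0, 1, (2, n_hidden)); b1 = np.zeros(n_hidden)
        W2 = rng.normal(0, 1, (n_hidden, 1)); b2 = np.zeros(1)
        sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

        for epoch in range(10_000):
            h = sigmoid(X @ W1 + b1)                   # forward pass
            out = sigmoid(h @ W2 + b2)
            d_out = (out - y) * out * (1 - out)        # backward pass, squared-error loss
            d_h = (d_out @ W2.T) * h * (1 - h)
            W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
            W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)

        print(np.round(out, 2))                        # expected to approach [[0],[1],[1],[0]]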

    GPU Computing for Cognitive Robotics

    This thesis presents the first investigation of the impact of GPU computing on cognitive robotics by providing a series of novel experiments in the area of action and language acquisition in humanoid robots and computer vision. Cognitive robotics is concerned with endowing robots with high-level cognitive capabilities to enable the achievement of complex goals in complex environments. Reaching the ultimate goal of developing cognitive robots will require tremendous amounts of computational power, which until recently was provided mostly by standard CPU processors. CPU cores are optimised for serial code execution at the expense of parallel execution, which renders them relatively inefficient for high-performance computing applications. The ever-increasing market demand for high-performance, real-time 3D graphics has evolved the GPU into a highly parallel, multithreaded, many-core processor with extraordinary computational power and very high memory bandwidth. These vast computational resources of modern GPUs can now be exploited by most cognitive robotics models, as they tend to be inherently parallel. Various interesting and insightful cognitive models have been developed to address important scientific questions concerning action-language acquisition and computer vision. While they have provided us with important scientific insights, their complexity and application have not improved much over recent years. The experimental tasks, as well as the scale of these models, are often minimised to avoid excessive training times that grow exponentially with the number of neurons and the training data. This impedes further progress and the development of complex neurocontrollers that would take cognitive robotics research a step closer to reaching the ultimate goal of creating intelligent machines. This thesis presents several cases where the application of GPU computing to cognitive robotics algorithms resulted in the development of large-scale neurocontrollers of previously unseen complexity, enabling the novel experiments described herein. Funding: European Commission Seventh Framework Programme.

    Granularity in Large-Scale Parallel Functional Programming

    This thesis demonstrates how to reduce the runtime of large non-strict functional programs using parallel evaluation. The parallelisation of several programs shows the importance of granularity, i.e. the computation costs of program expressions. The aspect of granularity is studied both on a practical level, by presenting and measuring runtime granularity improvement mechanisms, and at a more formal level, by devising a static granularity analysis. By parallelising several large functional programs this thesis demonstrates for the first time the advantages of combining lazy and parallel evaluation on a large scale: laziness aids modularity, while parallelism reduces runtime. One of the parallel programs is the Lolita system which, with more than 47,000 lines of code, is the largest existing parallel non-strict functional program. A new mechanism for parallel programming, evaluation strategies, to which this thesis contributes, is shown to be useful in this parallelisation. Evaluation strategies simplify parallel programming by separating algorithmic code from code specifying dynamic behaviour. For large programs the abstraction provided by functions is maintained by using a data-oriented style of parallelism, which defines parallelism over intermediate data structures rather than inside the functions. A highly parameterised simulator, GRANSIM, has been constructed collaboratively and is discussed in detail in this thesis. GRANSIM is a tool for architecture-independent parallelisation and a testbed for implementing runtime-system features of the parallel graph reduction model. By providing an idealised as well as an accurate model of the underlying parallel machine, GRANSIM has proven to be an essential part of an integrated parallel software engineering environment. Several parallel runtime-system features, such as granularity improvement mechanisms, have been tested via GRANSIM. It is publicly available and in active use at several universities worldwide. In order to provide granularity information this thesis presents an inference-based static granularity analysis. This analysis combines two existing analyses, one for cost and one for size information. It determines an upper bound for the computation costs of evaluating an expression in a simple strict higher-order language. By exposing recurrences during cost reconstruction and using a library of recurrences and their closed forms, it is possible to infer the costs for some recursive functions. The possible performance improvements are assessed by measuring the parallel performance of a hand-analysed and annotated program.
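
    The same granularity trade-off appears in most parallel settings. As a loose analogy in Python (not the thesis's parallel graph-reduction or evaluation-strategies machinery), the chunksize argument below acts as a granularity control: tasks that are too fine are dominated by scheduling and communication overhead, while coarser chunks amortise it.

        from multiprocessing import Pool
        import math

        def work(x):
            """A deliberately tiny unit of work; dispatched one item at a time,
            the per-task overhead would swamp any parallel speed-up."""
            return math.sqrt(x) * math.sin(x)

        if __name__ == "__main__":
            data = range(1_000_000)
            with Pool() as pool:
                # chunksize sets the granularity: workers receive batches of
                # 10,000 items, amortising the per-task communication cost.
                results = pool.map(work, data, chunksize=10_000)
            print(sum(results))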

    Interactive simulation and rendering of fluids on graphics hardware

    Computational fluid dynamics can be used to reproduce the complex motion of fluids for use in computer graphics, but the simulation and rendering are both highly computationally intensive. In the past, performing these tasks on the CPU could take many minutes per frame, especially for large-scale scenes at high levels of detail, which limited their usage to offline applications such as film and media. However, using the massive parallelism of GPUs, it is nowadays possible to produce fluid visual effects in real time for interactive applications such as games. We present such an interactive simulation using the CUDA GPU computing environment and the OpenGL graphics API. Smoothed Particle Hydrodynamics (SPH) is a popular particle-based fluid simulation technique that has been shown to be well suited to acceleration on the GPU. Our work extends an existing GPU-based SPH implementation by incorporating rigid body interaction and rendering. Solid objects are represented using particles to accumulate hydrodynamic forces from the surrounding fluid, while motion and collision handling are performed by the Bullet Physics library on the CPU. Our system demonstrates two-way coupling, with multiple objects floating, displacing fluid and colliding with each other. For rendering we compare the performance and memory consumption of two approaches, splatting and raycasting, and we also describe the visual characteristics of each. In our evaluation we consider a target of between 24 and 30 fps to be sufficient for smooth interaction and aim to determine the performance impact of our new features. We begin by establishing a performance baseline and find that the original system runs smoothly up to 216,000 fluid particles, but after introducing rendering this drops to 27,000 particles, with rendering taking up the majority of the frame time in both techniques. We find the most significant limiting factor to splatting performance to be the on-screen area occupied by fluid, while raycasting performance is primarily determined by the resolution of the 3D texture used for sampling. Finally, we find that performing solid interaction on the CPU is a viable approach that does not introduce significant overhead unless solid particles vastly outnumber fluid ones.
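
    For a rough idea of the particle formulation that the GPU parallelises, the sketch below computes SPH densities with the poly6 smoothing kernel in plain NumPy; it is a CPU-side, all-pairs toy (real implementations use a spatial grid), not the CUDA code described above.

        import numpy as np

        def sph_density(positions, masses, h):
            """rho_i = sum_j m_j * W_poly6(|r_i - r_j|, h), the SPH density sum."""
            poly6 = 315.0 / (64.0 * np.pi * h**9)
            diff = positions[:, None, :] - positions[None, :, :]       # (n, n, 3)
            r2 = np.einsum("ijk,ijk->ij", diff, diff)                  # squared distances
            w = np.where(r2 < h * h, poly6 * (h * h - r2) ** 3, 0.0)   # kernel weights
            return w @ masses

        rng = np.random.default_rng(1)
        pos = rng.random((1000, 3))                # 1,000 particles in a unit cube
        rho = sph_density(pos, np.full(1000, 0.02), h=0.05)
        print(rho.mean())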

    Applications Development for the Computational Grid
