563 research outputs found

    Experiences in teaching grid computing to advanced level students

    Get PDF
    The development of teaching materials for future software engineers is critical to the long term success of the grid. At present however there is considerable turmoil in the grid community both within the standards and the technology base underpinning these standards. In this context, it is especially challenging to develop teaching materials that have some sort of lifetime beyond the next wave of grid middleware and standards. In addition, the current way in which grid security is supported and delivered has two key problems. Firstly in the case of the UK e-Science community, scalability issues arise from a central certificate authority. Secondly, the current security mechanisms used by the grid community are not line grained enough. In this paper we outline how these issues are being addressed through the development of a grid computing module supported by an advanced authorisation infrastructure at the University of Glasgow

    Towards an Adaptive Skeleton Framework for Performance Portability

    Get PDF
    The proliferation of widely available, but very different, parallel architectures makes the ability to deliver good parallel performance on a range of architectures, or performance portability, highly desirable. Irregularly-parallel problems, where the number and size of tasks is unpredictable, are particularly challenging and require dynamic coordination. The paper outlines a novel approach to delivering portable parallel performance for irregularly parallel programs. The approach combines declarative parallelism with JIT technology, dynamic scheduling, and dynamic transformation. We present the design of an adaptive skeleton library, with a task graph implementation, JIT trace costing, and adaptive transformations. We outline the architecture of the protoype adaptive skeleton execution framework in Pycket, describing tasks, serialisation, and the current scheduler.We report a preliminary evaluation of the prototype framework using 4 micro-benchmarks and a small case study on two NUMA servers (24 and 96 cores) and a small cluster (17 hosts, 272 cores). Key results include Pycket delivering good sequential performance e.g. almost as fast as C for some benchmarks; good absolute speedups on all architectures (up to 120 on 128 cores for sumEuler); and that the adaptive transformations do improve performance

    A parallel grid-based implementation for real time processing of event log data in collaborative applications

    Get PDF
    Collaborative applications usually register user interaction in the form of semi-structured plain text event log data. Extracting and structuring of data is a prerequisite for later key processes such as the analysis of interactions, assessment of group activity, or the provision of awareness and feedback. Yet, in real situations of online collaborative activity, the processing of log data is usually done offline since structuring event log data is, in general, a computationally costly process and the amount of log data tends to be very large. Techniques to speed and scale up the structuring and processing of log data with minimal impact on the performance of the collaborative application are thus desirable to be able to process log data in real time. In this paper, we present a parallel grid-based implementation for processing in real time the event log data generated in collaborative applications. Our results show the feasibility of using grid middleware to speed and scale up the process of structuring and processing semi-structured event log data. The Grid prototype follows the Master-Worker (MW) paradigm. It is implemented using the Globus Toolkit (GT) and is tested on the Planetlab platform

    A novel implementation of computational aerodynamic shape optimisation using Modified Cuckoo Search

    Get PDF
    This paper outlines a new computational aerodynamic design optimisation algorithm using a novel method of parameterising a computational mesh using `control nodes'. The shape boundary movement as well as the mesh movement is coupled to the movement of user--defined control nodes via a Delaunay Graph Mapping technique. A Modified Cuckoo Search algorithm is employed for optimisation within the prescribed design space defined by the allowed range of control node displacement. A finite volume compressible Navier--Stokes solver is used for aerodynamic modelling to predict aerodynamic design `fitness'. The resulting coupled algorithm is applied to a range of test cases in two dimensions including aerofoil lift--drag ratio optimisation intake duct optimisation under subsonic, transonic and supersonic flow conditions. The discrete (mesh--based) optimisation approach presented is demonstrated to be effective in terms of its generalised applicability and intuitiveness

    MSRL: Distributed Reinforcement Learning with Dataflow Fragments

    Get PDF
    Reinforcement learning (RL) trains many agents, which is resource-intensive and must scale to large GPU clusters. Different RL training algorithms offer different opportunities for distributing and parallelising the computation. Yet, current distributed RL systems tie the definition of RL algorithms to their distributed execution: they hard-code particular distribution strategies and only accelerate specific parts of the computation (e.g. policy network updates) on GPU workers. Fundamentally, current systems lack abstractions that decouple RL algorithms from their execution. We describe MindSpore Reinforcement Learning (MSRL), a distributed RL training system that supports distribution policies that govern how RL training computation is parallelised and distributed on cluster resources, without requiring changes to the algorithm implementation. MSRL introduces the new abstraction of a fragmented dataflow graph, which maps Python functions from an RL algorithm's training loop to parallel computational fragments. Fragments are executed on different devices by translating them to low-level dataflow representations, e.g. computational graphs as supported by deep learning engines, CUDA implementations or multi-threaded CPU processes. We show that MSRL subsumes the distribution strategies of existing systems, while scaling RL training to 64 GPUs

    Machine Learning-Based Elastic Cloud Resource Provisioning in the Solvency II Framework

    Get PDF
    The Solvency II Directive (Directive 2009/138/EC) is a European Directive issued in November 2009 and effective from January 2016, which has been enacted by the European Union to regulate the insurance and reinsurance sector through the discipline of risk management. Solvency II requires European insurance companies to conduct consistent evaluation and continuous monitoring of risks—a process which is computationally complex and extremely resource-intensive. To this end, companies are required to equip themselves with adequate IT infrastructures, facing a significant outlay. In this paper we present the design and the development of a Machine Learning-based approach to transparently deploy on a cloud environment the most resource-intensive portion of the Solvency II-related computation. Our proposal targets DISAR®, a Solvency II-oriented system initially designed to work on a grid of conventional computers. We show how our solution allows to reduce the overall expenses associated with the computation, without hampering the privacy of the companies’ data (making it suitable for conventional public cloud environments), and allowing to meet the strict temporal requirements required by the Directive. Additionally, the system is organized as a self-optimizing loop, which allows to use information gathered from actual (useful) computations, thus requiring a shorter training phase. We present an experimental study conducted on Amazon EC2 to assess the validity and the efficiency of our proposal

    SPRINT: more runners, fewer hurdles

    Get PDF
    corecore