6,872 research outputs found
RELEASE: A High-level Paradigm for Reliable Large-scale Server Software
Erlang is a functional language with a much-emulated model for building reliable distributed systems. This paper outlines the RELEASE project, and describes the progress in the first six months. The project aim is to scale the Erlang’s radical concurrency-oriented programming paradigm to build reliable general-purpose software, such as server-based systems, on massively parallel machines. Currently Erlang has inherently scalable computation and reliability models, but in practice scalability is constrained by aspects of the language and virtual machine. We are working at three levels to address these challenges: evolving the Erlang virtual machine so that it can work effectively on large scale multicore systems; evolving the language to Scalable Distributed (SD) Erlang; developing a scalable Erlang infrastructure to integrate multiple, heterogeneous clusters. We are also developing state of the art tools that allow programmers to understand the behaviour of massively parallel SD Erlang programs. We will demonstrate the effectiveness of the RELEASE approach using demonstrators and two large case studies on a Blue Gene
TensorFlow Enabled Genetic Programming
Genetic Programming, a kind of evolutionary computation and machine learning
algorithm, is shown to benefit significantly from the application of vectorized
data and the TensorFlow numerical computation library on both CPU and GPU
architectures. The open source, Python Karoo GP is employed for a series of 190
tests across 6 platforms, with real-world datasets ranging from 18 to 5.5M data
points. This body of tests demonstrates that datasets measured in tens and
hundreds of data points see 2-15x improvement when moving from the scalar/SymPy
configuration to the vector/TensorFlow configuration, with a single core
performing on par or better than multiple CPU cores and GPUs. A dataset
composed of 90,000 data points demonstrates a single vector/TensorFlow CPU core
performing 875x better than 40 scalar/Sympy CPU cores. And a dataset containing
5.5M data points sees GPU configurations out-performing CPU configurations on
average by 1.3x.Comment: 8 pages, 5 figures; presented at GECCO 2017, Berlin, German
RELEASE: A High-level Paradigm for Reliable Large-scale Server Software
Erlang is a functional language with a much-emulated model for building reliable distributed systems. This paper outlines the RELEASE project, and describes the progress in the rst six months. The project aim is to scale the Erlang's radical concurrency-oriented programming paradigm to build reliable general-purpose software, such as server-based systems, on massively parallel machines. Currently Erlang has inherently scalable computation and reliability models, but in practice scalability is constrained by aspects of the language and virtual machine. We are working at three levels to address these challenges: evolving the Erlang virtual machine so that it can work effectively on large scale multicore systems; evolving the language to Scalable Distributed (SD) Erlang; developing a scalable Erlang infrastructure to integrate multiple, heterogeneous clusters. We are also developing state of the art tools that allow programmers to understand the behaviour of massively parallel SD Erlang programs. We will demonstrate the e ectiveness of the RELEASE approach using demonstrators and two large case studies on a Blue Gene
Parallel Discrete Event Simulation with Erlang
Discrete Event Simulation (DES) is a widely used technique in which the state
of the simulator is updated by events happening at discrete points in time
(hence the name). DES is used to model and analyze many kinds of systems,
including computer architectures, communication networks, street traffic, and
others. Parallel and Distributed Simulation (PADS) aims at improving the
efficiency of DES by partitioning the simulation model across multiple
processing elements, in order to enabling larger and/or more detailed studies
to be carried out. The interest on PADS is increasing since the widespread
availability of multicore processors and affordable high performance computing
clusters. However, designing parallel simulation models requires considerable
expertise, the result being that PADS techniques are not as widespread as they
could be. In this paper we describe ErlangTW, a parallel simulation middleware
based on the Time Warp synchronization protocol. ErlangTW is entirely written
in Erlang, a concurrent, functional programming language specifically targeted
at building distributed systems. We argue that writing parallel simulation
models in Erlang is considerably easier than using conventional programming
languages. Moreover, ErlangTW allows simulation models to be executed either on
single-core, multicore and distributed computing architectures. We describe the
design and prototype implementation of ErlangTW, and report some preliminary
performance results on multicore and distributed architectures using the well
known PHOLD benchmark.Comment: Proceedings of ACM SIGPLAN Workshop on Functional High-Performance
Computing (FHPC 2012) in conjunction with ICFP 2012. ISBN: 978-1-4503-1577-
Synthesis of application specific processor architectures for ultra-low energy consumption
In this paper we suggest that further energy savings can be achieved by a new approach to synthesis of embedded processor cores, where the architecture is tailored to the algorithms that the core executes. In the context of embedded processor synthesis, both single-core and many-core, the types of algorithms and demands on the execution efficiency are usually known at the chip design time. This knowledge can be utilised at the design stage to synthesise architectures optimised for energy consumption. Firstly, we present an overview of both traditional energy saving techniques and new developments in architectural approaches to energy-efficient processing. Secondly, we propose a picoMIPS architecture that serves as an architectural template for energy-efficient synthesis. As a case study, we show how the picoMIPS architecture can be tailored to an energy efficient execution of the DCT algorithm
- …