171 research outputs found

RELEASE: A High-level Paradigm for Reliable Large-scale Server Software

Author: Chechina Natalia
Trinder Phil
Publication venue
Publication date: 01/01/2012
Field of study

Erlang is a functional language with a much-emulated model for building reliable distributed systems. This paper outlines the RELEASE project, and describes the progress in the first six months. The project aim is to scale Erlang's radical concurrency-oriented programming paradigm to build reliable general-purpose software, such as server-based systems, on massively parallel machines. Currently Erlang has inherently scalable computation and reliability models, but in practice scalability is constrained by aspects of the language and virtual machine. We are working at three levels to address these challenges: evolving the Erlang virtual machine so that it can work effectively on large scale multicore systems; evolving the language to Scalable Distributed (SD) Erlang; developing a scalable Erlang infrastructure to integrate multiple, heterogeneous clusters. We are also developing state of the art tools that allow programmers to understand the behaviour of massively parallel SD Erlang programs. We will demonstrate the effectiveness of the RELEASE approach using demonstrators and two large case studies on a Blue Gene

Enlighten

Performance Portability Through Semi-explicit Placement in Distributed Erlang

Author: Chechina Natalia
MacKenzie Kenneth
Trinder Phil
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 30/08/2015
Field of study

We consider the problem of adapting distributed Erlang applications to large or heterogeneous architectures to achieve good performance in a portable way. In many architectures, and especially large architectures, the communication latency between pairs of virtual machines (nodes) is no longer uniform. We propose two language-level methods that enable programs to automatically adapt to heterogeneity and non-uniform communication latencies, and both provide information enabling a program to identify an appropriate node when spawning a process. We provide a means of recording node attributes describing the hardware and software capabilities of nodes, and mechanisms that allow an application to examine the attributes of remote nodes. We provide an abstraction of communication distances that enables an application to select nodes to facilitate efficient communication. We have developed open source libraries that implement these ideas. We show that the use of attributes for node selection can lead to significant performance improvements if different components of the application have different processing requirements. We report a detailed empirical investigation of non-uniform communication times in several representative architectures, and show that our abstract model provides a good description of the hierarchy of communication times
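The node-selection idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' Erlang library: the `Node` record, the `distance` function, and this `choose_nodes` helper are hypothetical names chosen for illustration. The sketch assumes that each node's position in the machine hierarchy is known as a path (cluster, rack, host), and that communication distance is the number of hierarchy levels below the deepest shared ancestor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    path: tuple                      # assumed hierarchy position, e.g. ("cluster1", "rack1", "host1")
    attributes: frozenset = frozenset()


def distance(a: Node, b: Node) -> int:
    """Hierarchy levels below the deepest ancestor shared by a and b (0 = same position)."""
    shared = 0
    for x, y in zip(a.path, b.path):
        if x != y:
            break
        shared += 1
    return max(len(a.path), len(b.path)) - shared


def choose_nodes(origin, nodes, required=frozenset(), max_distance=None):
    """Return candidate nodes that carry all required attributes, nearest first."""
    candidates = [
        n for n in nodes
        if n != origin
        and set(required) <= n.attributes
        and (max_distance is None or distance(origin, n) <= max_distance)
    ]
    return sorted(candidates, key=lambda n: (distance(origin, n), n.name))
```

A caller would pass the attributes a process needs (for example `{"gpu"}`) and, optionally, a distance bound, then spawn on the first node returned. Sorting by distance first reflects the assumption that nearby nodes communicate faster.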

Crossref

Enlighten

Using Negotiation to Reduce Redundant Autonomous Mobile Program Movements

Author: Chechina N.
King P.
Trinder P.W.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2010
Field of study

Distributed load managers exhibit thrashing where tasks are repeatedly moved between locations due to incomplete global load information. This paper shows that systems of Autonomous Mobile Programs (AMPs) exhibit the same behaviour, identifying two types of redundant movement and terming them greedy effects. AMPs are unusual in that, in place of some external load management system, each AMP periodically recalculates network and program parameters and may independently move to a better execution environment. Load management emerges from the behaviour of collections of AMPs. The paper explores the extent of greedy effects by simulation, and then proposes negotiating AMPs (NAMPs) to ameliorate the problem. We present the design of AMPs with a competitive negotiation scheme (cNAMPs), and compare their performance with AMPs by simulation
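A toy Python model can make the redundant-movement problem concrete. It is an assumption-laden sketch, not the paper's cost model or the cNAMP protocol. Every location has a speed shared equally among the AMPs it hosts, every AMP plans with the same stale load snapshot, and the hypothetical `run_round` function either lets AMPs move on that stale view or has the target location check the move against its live load first.

```python
def run_round(speeds, loads, negotiate):
    """Simulate one decision round; return (new loads, number of moves).

    speeds[i] is the speed of location i, loads[i] the number of AMPs there.
    Requires at least two locations.
    """
    snapshot = list(loads)   # stale view that every AMP plans with
    live = list(loads)       # true loads, updated as AMPs move
    moves = 0
    for here in range(len(speeds)):
        for _ in range(snapshot[here]):
            # Each AMP picks the location that looks best in the stale snapshot.
            target = max(
                (j for j in range(len(speeds)) if j != here),
                key=lambda j: speeds[j] / (snapshot[j] + 1),
            )
            # Without negotiation the AMP trusts the snapshot; with negotiation
            # the move is evaluated against the live loads before it is granted.
            view = live if negotiate else snapshot
            offered = speeds[target] / (view[target] + 1)
            staying = speeds[here] / view[here]
            if offered > staying:
                live[here] -= 1
                live[target] += 1
                moves += 1
    return live, moves
```

With three equally fast locations and loads `[4, 4, 0]`, the uncoordinated round sends all eight AMPs to the empty location, while the negotiated round makes three moves and ends near balance. The point of the sketch is only that checking a move against current load at the target removes moves that a stale snapshot makes look attractive.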

CiteSeerX

Heriot Watt Pure

Crossref

Bournemouth University Research Online

Enlighten

Scalable SD Erlang Reliability Model

Author: Chechina Natalia
Huiqing Li
Thompson Simon
Trinder Phil
Publication venue: Glasgow University
Publication date: 23/12/2014
Field of study

This technical report presents the work we have conducted to support SD Erlang reliability and to formally specify the semantics of s_groups. We have considered the following aspects of SD Erlang reliability: node recovery after failures and s_group name uniqueness

Enlighten

Scalable SD Erlang Computation Model

Author: Chechina Natalia
Ghaffari Amir
Huiqing Li
Trinder Phil
Publication venue: Glasgow University
Publication date: 23/12/2014
Field of study

This technical report presents the implementation of s_groups and semi-explicit placement in Scalable Distributed (SD) Erlang. The implementation is based on Erlang/OTP 17.4. The source code can be found at https://github.com/release-project/otp/tree/17.4-rebased. We start with a discussion of the differences between distributed Erlang global groups and SD Erlang s_groups (Chapter 1). Then we discuss the implementation of s_groups and the features of sixteen functions that were modified and introduced in the global and s_group modules (Chapter 2). After that we discuss semi-explicit placement, node attributes and the choose_node/1 function (Chapter 3). These functions were unit tested (Chapter 4). Finally, we discuss future work (Chapter 5)

Enlighten

Redundant movements in autonomous mobility: experimental and theoretical analysis

Author: Chechina Natalia
King Peter
Trinder Phil
Publication venue: 'Elsevier BV'
Publication date: 01/01/2011
Field of study

Distributed load balancers exhibit thrashing where tasks are repeatedly moved between locations due to incomplete global load information. This paper shows that systems of autonomous mobile programs (AMPs) exhibit the same behaviour, and identifies two types of redundant movement (greedy effect). AMPs are unusual in that, in place of some external load management system, each AMP periodically recalculates network and program parameters and may independently move to a better execution environment. Load management emerges from the behaviour of collections of AMPs. The paper explores the extent of greedy effects by simulating collections of AMPs and proposes negotiating AMPs (NAMPs) to ameliorate the problem. We present the design of AMPs with a competitive negotiation scheme (cNAMPs), and compare their performance with AMPs by simulation. We establish new properties of balanced networks of AMPs, and use these to provide a theoretical analysis of greedy effects.

Crossref

Bournemouth University Research Online

Enlighten

A scalable reliable instant messenger using the SD Erlang libraries

Author: Chechina Natalia
Moro Hernandez Mario
Trinder Phil
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2016
Field of study

Erlang has world-leading reliability capabilities, but while it scales extremely well within a single node, distributed Erlang has some scalability issues. The Scalable Distributed (SD) Erlang libraries have been designed to address the scalability limitations while preserving the reliability model, and have been shown to deliver significant performance benefits above 40 hosts using some relatively simple benchmarks. This paper compares the reliability and scalability of SD Erlang and distributed Erlang using an Instant Messaging (IM) server benchmark that is a far more typical Erlang application; a relatively large and sophisticated benchmark; has throughput as the key performance metric; and uses non-trivial reliability mechanisms. We provide a careful reliability evaluation using Chaos Monkey. The key performance results consider scenarios with and without failures on up to 17 server hosts (272 cores). We show that SD Erlang adds no performance overhead when all nodes are grouped in a single s_group. However, either adding redundant router nodes in distributed Erlang applications, or dividing a set of nodes into small s_groups in SD Erlang applications, has a small negative impact. Both the distributed Erlang and SD Erlang IM tolerate failures and, up to the failure rates measured, the failures have no impact on throughput. The IM implementations show that SD Erlang preserves the distributed Erlang reliability properties and mechanisms
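A short Python calculation can indicate why grouping nodes matters for scalability. Distributed Erlang connects nodes transitively by default, so a cluster of n nodes tends toward a full mesh of n(n-1)/2 connections. The sketch below compares that with a grouped layout. The layout is an assumption made for illustration, not a description of the benchmark's configuration: each group is a full mesh internally, and one gateway node per group joins a full mesh with the other gateways. The function names are hypothetical.

```python
def full_mesh_connections(n):
    """Connections needed when every one of n nodes connects to every other."""
    return n * (n - 1) // 2


def grouped_connections(group_sizes):
    """Connections under the assumed layout: a full mesh inside each group,
    plus a full mesh among one gateway node per group."""
    within = sum(full_mesh_connections(size) for size in group_sizes)
    between = full_mesh_connections(len(group_sizes))
    return within + between
```

For 100 nodes, `full_mesh_connections(100)` gives 4950, while `grouped_connections([10] * 10)` gives 495 under these assumptions. The trade-off is that traffic between groups must pass through gateway nodes.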

Crossref

Enlighten

Scalable Persistent Storage for Erlang

Author: Chechina Natalia
Ghaffari Amir
Meredith Jon
Trinder Phil
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2013
Field of study

The many-core revolution makes scalability a key property. The RELEASE project aims to improve the scalability of Erlang on emergent commodity architectures with 100,000 cores. Such architectures require scalable and available persistent storage on up to 100 hosts. We enumerate the requirements for scalable and available persistent storage, and evaluate four popular Erlang DBMSs against these requirements. This analysis shows that Mnesia and CouchDB are not suitable persistent storage at our target scale, but Dynamo-like NoSQL DataBase Management Systems (DBMSs) such as Cassandra and Riak potentially are. We investigate the current scalability limits of the Riak 1.1.1 NoSQL DBMS in practice on a 100-node cluster. We establish scientifically, for the first time, the scalability limit of Riak as 60 nodes on the Kalkyl cluster, thereby confirming developer folklore. We show that resources like memory, disk, and network do not limit the scalability of Riak. By instrumenting Erlang/OTP and Riak libraries we identify a specific Riak functionality that limits scalability. We outline how later releases of Riak are refactored to eliminate the scalability bottlenecks. We conclude that Dynamo-style NoSQL DBMSs provide scalable and available persistent storage for Erlang in general, and for our RELEASE target architecture in particular

Crossref

Enlighten

A Reliable Instant Messenger in Erlang: Design and Evaluation

Author: Chechina Natalia
Hernandez Mario Moro
Trinder Phil
Publication venue: Glasgow University
Publication date: 17/12/2015
Field of study

This document describes the design and evaluation of two Erlang-based instant messenger systems using Distributed Erlang (D-Erlang) and Scalable Distributed Erlang (SD-Erlang). The purpose of these systems is to serve as real-world benchmarks to test the performance of the SD Erlang library

Enlighten

Simulating Autonomous Mobile Programs on Networks

Author: Chechina Natalia
King Peter
Pooley Rob
Trinder Phil
Publication venue: Liverpool John Moores University
Publication date
Field of study

Autonomous mobile programs (AMPs) have been proposed for load management in dynamic networks. An AMP is aware of its resource needs and periodically seeks a better location in the network to reduce execution time. AMPs have previously been measured using mobile Java Voyager on local area networks (LANs). We have constructed a simulation model of AMPs and reproduced 4 sets of experiments on homogeneous networks, i.e. networks where all locations have the same processor speed, and 2 sets of experiments on heterogeneous networks with collections of large and small AMPs. The results show that simulated collections of AMPs obtain similar balanced states to those reached in the real experiments, and have only minor differences from real experimental results. The simulation model gives an opportunity to explore the greedy effect that can be observed in the real experiments. This gives us confidence to apply the simulation model for further investigation of AMP behaviour, including behaviours on wide area networks

Enlighten