517 research outputs found

    RELEASE: A High-level Paradigm for Reliable Large-scale Server Software

    Erlang is a functional language with a much-emulated model for building reliable distributed systems. This paper outlines the RELEASE project, and describes the progress in the first six months. The project aim is to scale Erlang's radical concurrency-oriented programming paradigm to build reliable general-purpose software, such as server-based systems, on massively parallel machines. Currently Erlang has inherently scalable computation and reliability models, but in practice scalability is constrained by aspects of the language and virtual machine. We are working at three levels to address these challenges: evolving the Erlang virtual machine so that it can work effectively on large scale multicore systems; evolving the language to Scalable Distributed (SD) Erlang; developing a scalable Erlang infrastructure to integrate multiple, heterogeneous clusters. We are also developing state of the art tools that allow programmers to understand the behaviour of massively parallel SD Erlang programs. We will demonstrate the effectiveness of the RELEASE approach using demonstrators and two large case studies on a Blue Gene

    Multi-level Visualization of Concurrent and Distributed Computation in Erlang

    This paper describes a prototype visualization system for concurrent and distributed applications programmed using Erlang, providing two levels of granularity of view. Both visualizations are animated to show the dynamics of aspects of the computation. At the low level, we show the concurrent behaviour of the Erlang schedulers on a single instance of the Erlang virtual machine, which we call an Erlang node. Typically there will be one scheduler per core on a multicore system. Each scheduler maintains a run queue of processes to execute, and we visualize the migration of Erlang concurrent processes from one run queue to another as work is redistributed to fully exploit the hardware. The schedulers are shown as a graph with a circular layout. Next to each scheduler we draw a variable length bar indicating the current size of the run queue for the scheduler. At the high level, we visualize the distributed aspects of the system, showing interactions between Erlang nodes as a dynamic graph drawn with a force model. Specifically we show message passing between nodes as edges and lay out nodes according to their current connections. In addition, we also show the grouping of nodes into “s_groups” using an Euler diagram drawn with circles

    Scaling Reliably: Improving the Scalability of the Erlang Distributed Actor Platform

    Distributed actor languages are an effective means of constructing scalable reliable systems, and the Erlang programming language has a well-established and influential model. While the Erlang model conceptually provides reliable scalability, it has some inherent scalability limits and these force developers to depart from the model at scale. This article establishes the scalability limits of Erlang systems and reports the work of the EU RELEASE project to improve the scalability and understandability of the Erlang reliable distributed actor model. We systematically study the scalability limits of Erlang and then address the issues at the virtual machine, language, and tool levels. More specifically: (1) We have evolved the Erlang virtual machine so that it can work effectively in large-scale single-host multicore and NUMA architectures. We have made important changes and architectural improvements to the widely used Erlang/OTP release. (2) We have designed and implemented Scalable Distributed (SD) Erlang libraries to address language-level scalability issues and provided and validated a set of semantics for the new language constructs. (3) To make large Erlang systems easier to deploy, monitor, and debug, we have developed and made open source releases of five complementary tools, some specific to SD Erlang. Throughout the article we use two case studies to investigate the capabilities of our new technologies and tools: a distributed hash table based Orbit calculation and Ant Colony Optimisation (ACO). Chaos Monkey experiments show that two versions of ACO survive random process failure and hence that SD Erlang preserves the Erlang reliability model. While we report measurements on a range of NUMA and cluster architectures, the key scalability experiments are conducted on the Athos cluster with 256 hosts (6,144 cores).
Even for programs with no global recovery data to maintain, SD Erlang partitions the network to reduce network traffic and hence improves performance of the Orbit and ACO benchmarks above 80 hosts. ACO measurements show that maintaining global recovery data dramatically limits scalability; however, scalability is recovered by partitioning the recovery data. We exceed the established scalability limits of distributed Erlang, and do not reach the limits of SD Erlang for these benchmarks at this scale

    Scalable Reliable SD Erlang Design

    This technical report presents the design of Scalable Distributed (SD) Erlang: a set of language-level changes that aims to enable Distributed Erlang to scale for server applications on commodity hardware with at most 100,000 cores. We cover a number of aspects, specifically anticipated architecture, anticipated failures, scalable data structures, and scalable computation. Two other components that guided us in the design of SD Erlang are design principles and typical Erlang applications. The design principles summarise the types of modification we aim to make to allow Erlang to scale. Erlang exemplars help us to identify the main Erlang scalability issues and hypothetically validate the SD Erlang design

    Analysis of Distributed Systems Dynamics with Erlang Performance Lab

    Modern, highly concurrent and large-scale systems require new methods for design, testing and monitoring. Their dynamics and scale require real-time tools that provide a holistic view of the whole system and can show a more detailed view when needed. Such tools can help identify the causes of unwanted states, which is hardly possible with static analysis or a metrics-based approach. In this paper a new tool for analysis of distributed systems in Erlang is presented. It provides real-time monitoring of system dynamics on different levels of abstraction. The tool has been used for analyzing a large-scale urban traffic simulation system running on a cluster of 20 computing nodes

    Towards Type-Based Optimizations in Distributed Applications Using ABS and JAVA 8

    In this paper we present an API to support modeling applications with Actors based on the paradigm of the Abstract Behavioural Specification (ABS) language. With the introduction of JAVA 8, we expose this API through a JAVA library to allow for a high-level actor-based methodology for programming distributed systems which supports the programming to interfaces discipline. We validate this solution through a case study where we obtain significant performance improvements as well as illustrating the ease with which simple high and low-level optimizations can be obtained by examining topologies and communication within an application. Using this API we show it is much easier to observe drawbacks of shared data-structures and communications methods in the design phase of a distributed application and apply the necessary corrections in order to obtain better results

    High Performance Web Servers: A Study In Concurrent Programming Models

    With the advent of commodity large-scale multi-core computers, the performance of software running on these computers has become a challenge to researchers and enterprise developers. While academic research and industrial products have moved in the direction of writing scalable and highly available services using distributed computing, single machine performance remains an active domain, one which is far from saturated. This thesis selects an archetypal software example and workload in this domain, and describes software characteristics affecting performance. The example is highly-parallel web-servers processing a static workload. Particularly, this work examines concurrent programming models in the context of high-performance web-servers across different architectures: threaded (Apache, Go and μKnot), event-driven (Nginx, μServer) and staged (WatPipe), compared with two static workloads in two different domains. The two workloads are a Zipf distribution of file sizes representing a user session pulling an assortment of many small and a few large files, and a 50KB file representing chunked streaming of a large audio or video file. Significant effort is made to fairly compare eight web-servers by carefully tuning each via their adjustment parameters. Tuning plays a significant role in workload-specific performance. The two domains are no disk I/O (in-memory file set) and medium disk I/O. The domains are created by lowering the amount of RAM available to the web-server from 4GB to 2GB, forcing files to be evicted from the file-system cache. Both domains are also restricted to 4 CPUs. The primary goal of this thesis is to examine fundamental performance differences between threaded and event-driven concurrency models, with particular emphasis on user-level threading models. Additionally, a secondary goal of the work is to examine high-performance software under restricted hardware environments.
Over-provisioned hardware environments can mask architectural and implementation shortcomings in software; the hypothesis in this work is that restricting resources stresses the application, bringing out important performance characteristics and properties. Experimental results for the given workload show that memory pressure is one of the most significant factors for the degradation of web-server performance, because it forces both the onset and amount of disk I/O. With an ever increasing need to support more content at faster rates, a web-server relies heavily on in-memory caching of files and related content. In fact, personal and small business web-servers are even run on minimal hardware, like the Raspberry Pi, with only 1GB of RAM and a small SD card for the file system. Therefore, understanding behaviour and performance in restricted contexts should be a normal aspect of testing a web server (and other software systems)

    Minimizing average handling time in contact centers by introducing a new process: Rowan Support Desk case study

    The quality of a call center's performance is an important factor in ensuring customer satisfaction. Customers, the callers, want their requests solved quickly, permanently and to their satisfaction. Often, there are staff constraints, budget or cost limitations, and the Service Level Agreement (SLA), which is the resource availability to accomplish a task within a deadline. The purpose of this research is to analyze feasible approaches to minimize long-lasting open requests and enhance a call center's performance. Multiple challenges that a call center often faces in handling requests are studied to identify key bottlenecks in the process of handling requests. The Rowan University support desk is used as a case study. The focus of this study is on over-extended unsolved requests under a set of specific constraints. The following two alternative solutions were investigated and compared. One involves reorganizing the routing procedure, which would allow a ticket to be rerouted to the specialists. The other scenario investigates an increase in staff and the efficiencies that would come with it. The research will show that with minimal effort in rerouting the unsolved tickets, we can decrease average handling time, which simultaneously increases the total number of resolved tickets and minimizes total processing time