
    EbbRT: Elastic Building Block Runtime - case studies

    We present a new systems runtime, EbbRT, for cloud-hosted applications. EbbRT takes a different approach to the role operating systems play in cloud computing. It supports stitching application functionality across nodes running commodity OSs and nodes running specialized, application-specific software that executes only what is necessary to accelerate core functions of the application. In doing so, it allows tradeoffs between efficiency, developer productivity, and the exploitation of elasticity and scale. As a software model, EbbRT is a framework for constructing applications as collections of standard application software and Elastic Building Blocks (Ebbs). Elastic Building Blocks are components that encapsulate runtime software objects and are implemented to exploit the raw access, scale, and elasticity of IaaS resources to accelerate critical application functionality. This paper presents the EbbRT architecture, our prototype, and an experimental evaluation of the prototype under three different application scenarios.
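
    To make the Ebb idea concrete, the sketch below shows the general distributed-object pattern the abstract describes: callers invoke one logical component, and the call is served by a representative local to the calling node, created on demand. The class and function names (Counter, local_rep) are illustrative assumptions, not the EbbRT API, and cross-node coordination is omitted.

        // Minimal sketch of the pattern an Ebb encapsulates: one logical
        // object backed by per-node representatives created on first use.
        // Names are illustrative, not EbbRT's API.
        #include <cstddef>
        #include <memory>
        #include <mutex>
        #include <unordered_map>

        class Counter {                       // one logical Ebb: a sharded counter
        public:
            // Route a call to the representative for the calling node.
            static Counter& local_rep(std::size_t node_id) {
                static std::mutex m;
                static std::unordered_map<std::size_t,
                                          std::unique_ptr<Counter>> reps;
                std::lock_guard<std::mutex> g(m);
                auto& rep = reps[node_id];
                if (!rep) rep = std::make_unique<Counter>();  // created on demand
                return *rep;
            }

            void add(int v) { value_ += v; }  // fast path: purely local update
            int local_value() const { return value_; }
            // A global total would require gathering values from the other
            // representatives, which a runtime like EbbRT coordinates.

        private:
            int value_ = 0;
        };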

    EbbRT: Elastic Building Block Runtime - overview

    EbbRT provides a lightweight runtime that enables the construction of reusable, low-level system software that can integrate with existing, general-purpose systems. It achieves this by providing both a library that can be linked into a process on an existing OS and a small library OS that can be booted directly on an IaaS node.

    OpLog: a library for scaling update-heavy data structures

    Existing techniques (e.g., RCU) can achieve good multi-core scaling for read-mostly data, but for update-heavy data structures only special-purpose techniques exist. This paper presents OpLog, a general-purpose library supporting good scalability for update-heavy data structures. OpLog achieves scalability by logging each update in a low-contention per-core log; it combines the logs only when a read of the data structure requires it. OpLog achieves generality by logging operations without needing to understand them, which makes it easy to apply to existing data structures. OpLog can further increase performance if the programmer indicates which operations can be combined in the logs. An evaluation shows how to apply OpLog to three update-heavy Linux kernel data structures. Measurements on a 48-core AMD server show that the result significantly improves the performance of the Apache web server and the Exim mail server under certain workloads.
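
    The core idea (append updates to a per-core log, reconcile only when a read needs the data) can be sketched as follows. This is a simplified illustration around a linked list: the real library also timestamps operations so they can be replayed in a global order and supports absorbing combinable operations, both of which are omitted here, and the names (OpLoggedList, PerCoreLog) are invented for this sketch rather than taken from OpLog's API.

        // Sketch of per-core operation logging: updates append to a private
        // per-core log, and the shared list is only reconciled on reads.
        #include <functional>
        #include <list>
        #include <mutex>
        #include <vector>

        struct OpLoggedList {
            explicit OpLoggedList(unsigned ncores) : logs_(ncores) {}

            // Update path: append the operation to this core's private log.
            // Only the per-core lock is taken, so updaters on different
            // cores do not contend with each other.
            void push_front(unsigned core, int value) {
                std::lock_guard<std::mutex> g(logs_[core].lock);
                logs_[core].ops.push_back([value](std::list<int>& l) {
                    l.push_front(value);
                });
            }

            // Read path: drain every per-core log into the real list, then
            // read. This is where the reconciliation cost is paid.
            std::size_t size() {
                std::lock_guard<std::mutex> g(list_lock_);
                for (auto& log : logs_) {
                    std::lock_guard<std::mutex> lg(log.lock);
                    for (auto& op : log.ops) op(list_);
                    log.ops.clear();
                }
                return list_.size();
            }

        private:
            struct PerCoreLog {
                std::mutex lock;
                std::vector<std::function<void(std::list<int>&)>> ops;
            };
            std::vector<PerCoreLog> logs_;
            std::mutex list_lock_;
            std::list<int> list_;
        };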

    Hare: a file system for non-cache-coherent multicores

    Hare is a new file system that provides a POSIX-like interface on multicore processors without cache coherence. Hare allows applications on different cores to share files, directories, and file descriptors. The challenge in designing Hare is to support the shared abstractions faithfully enough to run, with few modifications, applications written for traditional shared-memory operating systems, and to do so while scaling with an increasing number of cores. To achieve this goal, Hare must support features (such as shared file descriptors) that traditional network file systems don't support, and implement them in a way that scales (e.g., sharding a directory across servers to allow concurrent operations in that directory). Hare achieves this goal through a combination of new protocols (including a three-phase commit protocol to implement directory operations correctly and scalably) and by leveraging properties of non-cache-coherent multiprocessors (e.g., atomic low-latency message delivery and shared DRAM). An evaluation on a 40-core machine demonstrates that Hare can run many challenging Linux applications (including a mail server and a Linux kernel build) with minimal or no modifications. The results also show that these applications achieve good scalability on Hare, and that Hare's techniques are important to achieving scalability.
    Quanta Computer (Firm)
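
    Directory sharding, the scaling example mentioned above, amounts to mapping each directory entry to a server by hashing the entry name, so that unrelated creates in the same directory can proceed on different servers. The sketch below illustrates only that mapping; the function name is invented here, and Hare's actual placement policy and its three-phase directory protocol are not shown.

        // Illustrative mapping of a directory entry to a file server by
        // hashing the (directory, name) pair; not Hare's implementation.
        #include <cstddef>
        #include <functional>
        #include <iostream>
        #include <string>

        std::size_t shard_for_entry(const std::string& dir,
                                    const std::string& name,
                                    std::size_t server_count) {
            // Hash directory path and entry name together so the same entry
            // always resolves to the same server.
            std::size_t h = std::hash<std::string>{}(dir) ^
                            (std::hash<std::string>{}(name) << 1);
            return h % server_count;
        }

        int main() {
            const std::size_t servers = 4;
            // Two creates in the same directory will usually land on
            // different servers and can therefore proceed concurrently.
            std::cout << shard_for_entry("/build", "main.o", servers) << "\n"
                      << shard_for_entry("/build", "util.o", servers) << "\n";
        }

    Operations that span a whole directory (such as a listing or a rename) then touch several servers, which is the kind of case the three-phase commit protocol mentioned in the abstract is there to handle correctly.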

    On the development of slime mould morphological, intracellular and heterotic computing devices

    The use of live biological substrates in the fabrication of unconventional computing (UC) devices is steadily transcending the barriers between science fiction and reality, but efforts in this direction are impeded by ethical considerations, the field’s restrictively broad multidisciplinarity and our incomplete knowledge of fundamental biological processes. As such, very few functional prototypes of biological UC devices have been produced to date. This thesis aims to demonstrate the computational polymorphism and polyfunctionality of a chosen biological substrate — slime mould Physarum polycephalum, an arguably ‘simple’ single-celled organism — and how these properties can be harnessed to create laboratory prototypes of functionally useful biological UC devices. Computing devices utilising live slime mould as their key constituent element can be developed into a) heterotic, or hybrid, devices, which are based on electrical recognition of slime mould behaviour via machine-organism interfaces, b) whole-organism-scale morphological processors, whose output is the organism’s morphological adaptation to environmental stimuli (input), and c) intracellular processors, wherein data are represented by energetic signalling events mediated by the cytoskeleton, a nano-scale protein network. It is demonstrated that each category of device is capable of implementing logic; furthermore, specific applications may be engineered for each class, such as image processing for morphological processors and biosensing in the case of heterotic devices. The results presented are supported by a range of computer modelling experiments using cellular automata and multi-agent modelling. We conclude that P. polycephalum is a polymorphic UC substrate insofar as it can process multimodal sensory input, and polyfunctional in its demonstrable ability to undertake a variety of computing problems. Furthermore, our results are highly applicable to the study of other living UC substrates and will inform future work in UC, biosensing, and biomedicine.

    Scalable elastic systems architecture

    Cloud computing has spurred the exploration and exploitation of elastic access to large scales of computing. To date, the predominant building blocks by which elasticity has been exploited are applications and operating systems built around traditional computing infrastructure and programming models that are inelastic or at best coarsely elastic. What would happen if applications themselves could express and exploit elasticity in a fine-grained fashion, and this elasticity could be efficiently mapped to the scale and elasticity offered by modern cloud hardware systems? Would economic and market models that exploit elasticity pervade even the lowest levels? And would this enable greater efficiency both globally and individually? Would novel approaches to traditional problems such as quality of service arise? Would new applications be enabled both technically and economically? How to construct scalable and elastic software is an open challenge. Our work explores a systematic method for constructing and deploying such software. Building on several years of prior research, we will develop and evaluate a new cloud computing systems software architecture that addresses both scalability and elasticity. We explore a combination of a novel programming model and an alternative operating system structure. The goal of the architecture is to enable applications that can inherently scale up or down to react to changes in demand. We hypothesize that enabling such fine-grained elastic applications will open up new avenues for exploring both supply and demand elasticity across a broad range of research areas such as economic models, optimization, mechanism design, software engineering, networking and others.
    Department of Energy Office of Science (DE-SC0005365), National Science Foundation (1012798)

    An Analysis of Linux Scalability to Many Cores

    This paper analyzes the scalability of seven system applications (Exim, memcached, Apache, PostgreSQL, gmake, Psearchy, and MapReduce) running on Linux on a 48-core computer. Except for gmake, all applications trigger scalability bottlenecks inside a recent Linux kernel. Using mostly standard parallel programming techniques (this paper introduces one new technique, sloppy counters), these bottlenecks can be removed from the kernel or avoided by changing the applications slightly. Modifying the kernel required a total of 3002 lines of code changes. A speculative conclusion from this analysis is that there is no scalability reason to give up on traditional operating system organizations just yet.
    Quanta Computer (Firm); National Science Foundation (U.S.) (0834415); National Science Foundation (U.S.) (0915164); Microsoft Research (Fellowship); Irwin Mark Jacobs and Joan Klein Jacobs Presidential Fellowship
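
    The one new technique named above, sloppy counters, splits a logical counter into a shared central count plus per-core caches of "spare" references, so that most increments and decrements touch only core-local state. The sketch below is a simplified rendering of that idea under the assumption that each per-core slot is updated only by its owning core; the class name and threshold value are illustrative, not the kernel implementation.

        // Simplified sloppy counter: logical value = central count minus the
        // spares cached per core. Real implementations also pad per-core
        // slots onto separate cache lines to avoid false sharing.
        #include <atomic>
        #include <cstdint>
        #include <vector>

        class SloppyCounter {
        public:
            explicit SloppyCounter(unsigned ncores, int64_t threshold = 64)
                : spare_(ncores), threshold_(threshold) {}

            // Increment: consume a spare reference cached on this core if one
            // is available; otherwise fall back to the shared counter.
            void inc(unsigned core) {
                int64_t s = spare_[core].v.load(std::memory_order_relaxed);
                if (s > 0)
                    spare_[core].v.store(s - 1, std::memory_order_relaxed);
                else
                    central_.fetch_add(1, std::memory_order_relaxed);
            }

            // Decrement: return the reference to this core's spare cache, and
            // hand spares back to the central counter once the cache is large.
            void dec(unsigned core) {
                int64_t s = spare_[core].v.load(std::memory_order_relaxed) + 1;
                if (s >= threshold_) {
                    central_.fetch_sub(s, std::memory_order_relaxed);
                    s = 0;
                }
                spare_[core].v.store(s, std::memory_order_relaxed);
            }

            // The sum is approximate while other cores update, hence "sloppy".
            int64_t read() const {
                int64_t v = central_.load(std::memory_order_relaxed);
                for (const auto& s : spare_)
                    v -= s.v.load(std::memory_order_relaxed);
                return v;
            }

        private:
            struct PerCore { std::atomic<int64_t> v{0}; };
            std::atomic<int64_t> central_{0};
            std::vector<PerCore> spare_;
            int64_t threshold_;
        };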

    Transistor scaled HPC application performance

    We propose a radically new, biologically inspired model of an extreme-scale computer on which application performance automatically scales with the transistor count even in the face of component failures. Today's high performance computers are massively parallel systems composed of potentially hundreds of thousands of traditional processor cores, formed from trillions of transistors, consuming megawatts of power. Unfortunately, increasing the number of cores in a system, unlike increasing clock frequencies, does not automatically translate to application-level improvements. No general auto-parallelization techniques or tools exist for HPC systems. To obtain application improvements, HPC application programmers must manually cope with the challenge of multicore programming and the significant drop in reliability associated with the sheer number of transistors. Drawing on biological inspiration, the basic premise behind this work is that computation can be dramatically accelerated by integrating a very large-scale, system-wide, predictive associative memory into the operation of the computer. The memory effectively turns computation into a form of pattern recognition and prediction whose result can be used to avoid significant fractions of computation. To be effective, the expectation is that the memory will require billions of concurrent devices akin to biological cortical systems, where each device implements a small amount of storage, computation and localized communication. As typified by the recent announcement of the Lyric GP5 Probability Processor, very efficient, scalable hardware for pattern recognition and prediction is on the horizon. One class of such devices, called neuromorphic, was pioneered by Carver Mead in the 1980s to provide a path for breaking the power, scaling, and reliability barriers associated with standard digital VLSI technology. Recent neuromorphic research examples include work at Stanford, MIT, and the DARPA-sponsored SyNAPSE project. These devices operate transistors as unclocked analog devices organized to implement pattern recognition and prediction several orders of magnitude more efficiently than functionally equivalent digital counterparts. Abstractly, the devices can be used to implement modern machine learning or statistical inference. When exposed to data as a time-varying signal, the devices learn and store patterns in the data at multiple time scales and constantly provide predictions about what the signal will do in the future. This kind of function can be seen as a form of predictive associative memory. In this paper we describe our model and initial plans for exploring it.
    Department of Energy Office of Science (DE-SC0005365), National Science Foundation (1012798)
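
    As a purely software-level analogy for the compute-avoidance idea (not the proposed analog hardware), the toy sketch below stands in for an associative memory: results are remembered per input pattern and, once a pattern has been reinforced often enough, the stored prediction is returned instead of recomputing. All names here are illustrative assumptions.

        // Toy stand-in for a predictive associative memory that replaces
        // computation with a remembered result for familiar input patterns.
        // Requires In to be hashable and Out to be comparable for equality.
        #include <functional>
        #include <unordered_map>

        template <typename In, typename Out>
        class PredictiveMemo {
        public:
            explicit PredictiveMemo(int confidence_needed = 3)
                : needed_(confidence_needed) {}

            Out run(const In& input,
                    const std::function<Out(const In&)>& compute) {
                auto it = table_.find(input);
                if (it != table_.end() && it->second.hits >= needed_)
                    return it->second.result;    // prediction replaces work
                Out r = compute(input);          // fall back to real compute
                auto& e = table_[input];
                if (e.hits == 0 || e.result == r) {
                    e.result = r;                // learn / reinforce pattern
                    ++e.hits;
                } else {
                    e = Entry{r, 1};             // pattern changed: relearn
                }
                return r;
            }

        private:
            struct Entry { Out result{}; int hits = 0; };
            std::unordered_map<In, Entry> table_;
            int needed_;
        };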