
    Prototyping Formal System Models with Active Objects

    We propose active object languages as a development tool for formal system models of distributed systems. In addition to a formalization based on a term rewriting system, we use established software engineering concepts, including software product lines and object orientation, which come with extensive tool support. We illustrate our modeling approach by prototyping a weak memory model. Thanks to object-oriented modeling, the resulting executable model is modular and has clear interfaces between communicating participants. Relaxations of the basic memory model are expressed as self-contained variants of a software product line. As a modeling language we use the formal active object language ABS, which comes with an extensive tool set. This permits rapid formalization of core ideas, early validity checks in terms of formal invariant proofs, and debugging support by executing test runs. Hence, our approach supports the prototyping of formal system models with early feedback. Comment: In Proceedings ICE 2018, arXiv:1810.0205
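
    The paper's models are written in ABS, which we do not reproduce here; the following is only a minimal Python sketch of the active-object idea it relies on (objects with a mailbox, asynchronous method calls returning futures, one message processed at a time). The class names and the toy memory participant are illustrative assumptions, not the authors' model.

    ```python
    # Minimal active-object sketch (illustrative; the paper uses the ABS
    # language, not Python). Each object owns a mailbox and processes one
    # asynchronous call at a time, handing the caller a future.
    import asyncio

    class ActiveObject:
        def __init__(self):
            self._mailbox = asyncio.Queue()
            self._worker = asyncio.create_task(self._run())

        async def _run(self):
            while True:
                fut, method, args = await self._mailbox.get()
                fut.set_result(await method(*args))  # run calls sequentially

        def send(self, method, *args):
            # Asynchronous call: enqueue the request, return a future at once.
            fut = asyncio.get_running_loop().create_future()
            self._mailbox.put_nowait((fut, method, args))
            return fut

    class Memory(ActiveObject):
        """Toy memory participant, loosely in the spirit of a memory model."""
        def __init__(self):
            super().__init__()
            self._cells = {}

        async def write(self, addr, val):
            self._cells[addr] = val

        async def read(self, addr):
            return self._cells.get(addr, 0)

    async def main():
        mem = Memory()
        await mem.send(mem.write, "x", 1)     # await the write's future
        print(await mem.send(mem.read, "x"))  # -> 1

    asyncio.run(main())
    ```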

    Evaluating Cache Coherent Shared Virtual Memory for Heterogeneous Multicore Chips

    The trend in industry is towards heterogeneous multicore processors (HMCs), including chips with CPUs and massively-threaded throughput-oriented processors (MTTOPs) such as GPUs. Although current homogeneous chips tightly couple their cores with cache-coherent shared virtual memory (CCSVM), this is not the communication paradigm used by any current HMC. In this paper, we present a CCSVM design for a CPU/MTTOP chip, as well as an extension of the pthreads programming model, called xthreads, for programming this HMC. Our goal is to evaluate the potential performance benefits of tightly coupling heterogeneous cores with CCSVM.
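
    The xthreads API itself is not given in the abstract, so the sketch below illustrates only the paradigm under evaluation: heterogeneous threads communicating through ordinary reads and writes to a shared address space plus synchronization, with no explicit data copies. It is plain Python threading standing in for the paper's pthreads-based C setting; the thread names are our own.

    ```python
    # Sketch of the CCSVM communication paradigm: a "CPU" producer and an
    # "MTTOP" consumer exchange data via plain loads/stores to shared memory
    # rather than explicit transfers. Only the programming model is shown.
    import threading

    shared = [0] * 8            # region visible to both threads
    ready = threading.Event()   # flag-plus-data handoff

    def cpu_thread():
        for i in range(len(shared)):
            shared[i] = i * i   # ordinary stores into the shared region
        ready.set()             # publish: consumer may read now

    def mttop_thread():
        ready.wait()            # wait for the published data
        print(sum(shared))      # ordinary loads; no copy step

    producer = threading.Thread(target=cpu_thread)
    consumer = threading.Thread(target=mttop_thread)
    producer.start(); consumer.start()
    producer.join(); consumer.join()
    ```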

    Pervasive Parallel And Distributed Computing In A Liberal Arts College Curriculum

    We present a model for incorporating parallel and distributed computing (PDC) throughout an undergraduate CS curriculum. Our curriculum is designed to introduce students early to parallel and distributed computing topics and to expose students to these topics repeatedly in the context of a wide variety of CS courses. The key to our approach is the development of a required intermediate-level course that serves as an introduction to computer systems and parallel computing. It is required for every CS major and minor and is a prerequisite for upper-level courses that expand on parallel and distributed computing topics in different contexts. With the addition of this new course, we can easily make room in upper-level courses to add and expand parallel and distributed computing topics. The goal of our curricular design is to ensure that every graduating CS major has exposure to parallel and distributed computing, with both breadth and depth of coverage. Our curriculum is designed in particular for the constraints of a small liberal arts college; however, many of its ideas and much of its design are applicable to any undergraduate CS curriculum.

    High-Performance Distributed ML at Scale through Parameter Server Consistency Models

    As Machine Learning (ML) applications increase in data size and model complexity, practitioners turn to distributed clusters to satisfy the increased computational and memory demands. Unfortunately, effective use of clusters for ML requires considerable expertise in writing distributed code, while highly-abstracted frameworks like Hadoop have not, in practice, approached the performance seen in specialized ML implementations. The recent Parameter Server (PS) paradigm is a middle ground between these extremes, allowing easy conversion of single-machine parallel ML applications into distributed ones, while maintaining high throughput through relaxed "consistency models" that allow inconsistent parameter reads. However, due to insufficient theoretical study, it is not clear which of these consistency models can really ensure correct ML algorithm output; at the same time, there remain many theoretically-motivated but undiscovered opportunities to maximize computational throughput. Motivated by this challenge, we study both the theoretical guarantees and empirical behavior of iterative-convergent ML algorithms in existing PS consistency models. We then use the gleaned insights to improve a consistency model using an "eager" PS communication mechanism, and implement it as a new PS system that enables ML algorithms to reach their solution more quickly. Comment: 19 pages, 2 figures
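
    The abstract does not spell out its consistency models or the "eager" mechanism, but the family of relaxed models it studies can be sketched with a bounded-staleness read rule: a worker may read parameters as long as no worker lags more than s clock ticks behind. The class below is our toy, single-shard illustration patterned on stale-synchronous designs, not the paper's system.

    ```python
    # Toy bounded-staleness parameter shard: reads are allowed only while
    # the slowest worker is within `s` iterations of the reader.
    class ParameterShard:
        def __init__(self, staleness_bound):
            self.s = staleness_bound
            self.params = {}   # key -> value
            self.clock = {}    # worker id -> last completed iteration

        def update(self, worker, key, delta, iteration):
            self.params[key] = self.params.get(key, 0.0) + delta
            self.clock[worker] = iteration

        def read(self, worker, key, iteration):
            slowest = min(self.clock.values(), default=0)
            if iteration - slowest > self.s:
                # A real system would block here until stragglers catch up.
                raise RuntimeError("staleness bound exceeded; synchronize")
            return self.params.get(key, 0.0)

    ps = ParameterShard(staleness_bound=2)
    ps.update(worker=0, key="w", delta=0.5, iteration=1)
    print(ps.read(worker=1, key="w", iteration=2))  # lag is 1 <= 2, so 0.5
    ```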

    Design and analysis of adaptive hierarchical low-power long-range networks

    A new phase in the evolution of Machine-to-Machine (M2M) communication has started, in which vertical Internet of Things (IoT) deployments dedicated to a single application domain gradually give way to multi-purpose IoT infrastructures that serve different applications across multiple industries. New networking technologies are being deployed that operate over sub-GHz frequency bands, enable multi-tenant connectivity over long distances, and increase network capacity by enforcing low transmission rates. Such networking technologies allow cloud-based platforms to be connected with large numbers of IoT devices deployed several kilometres from the edges of the network. Despite the rapid uptake of Low-Power Wide-Area Networks (LPWANs), it remains unclear how to organize the wireless sensor network in a scalable and adaptive way. This paper introduces a hierarchical communication scheme that utilizes the new capabilities of long-range wireless sensor networking technologies by combining them with widely used 802.15.4-based low-range, low-power technologies. The design of the hierarchical scheme is presented in detail, along with the technical details of the implementation on real-world hardware platforms. Platform-agnostic firmware is produced and evaluated in real-world large-scale testbeds. The performance of the networking scheme is evaluated through a series of experimental scenarios that create environments with varying channel quality, failing nodes, and mobile nodes. The performance is evaluated in terms of the overall time required to organize the network and set up a hierarchy, the energy consumption and overall lifetime of the network, and the ability to adapt to channel failures. The experimental analysis indicates that the combination of long-range and short-range networking technologies can lead to scalable solutions that concurrently serve multiple applications.
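
    The hierarchy-setup protocol itself is left to the paper body; as a rough illustration of the two-tier idea (short-range 802.15.4 nodes clustered under dual-radio long-range heads that backhaul to the cloud), here is a toy Python sketch. The coordinates and the nearest-head rule are invented for illustration and are not the paper's algorithm.

    ```python
    # Toy two-tier clustering: each short-range sensor attaches to the
    # closest long-range-capable cluster head, which relays its traffic.
    import math

    def nearest_head(node, heads):
        """Pick the cluster head with the shortest-distance link."""
        return min(heads, key=lambda h: math.dist(node, h))

    heads = [(0, 0), (100, 0)]             # dual-radio cluster heads (x, y)
    nodes = [(10, 5), (90, -3), (55, 40)]  # short-range sensor nodes

    clusters = {h: [] for h in heads}
    for n in nodes:
        clusters[nearest_head(n, heads)].append(n)

    for head, members in clusters.items():
        print(f"head {head} serves {members}")  # members reach the cloud via head
    ```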

    LEGOStore: A Linearizable Geo-Distributed Store Combining Replication and Erasure Coding

    We design and implement LEGOStore, an erasure coding (EC) based linearizable data store over geo-distributed public cloud data centers (DCs). For such a data store, the confluence of the following factors opens up opportunities for EC to be latency-competitive with replication: (a) the necessity of communicating with remote DCs to tolerate entire DC failures and implement linearizability; and (b) the emergence of DCs near most large population centers. LEGOStore employs an optimization framework that, for a given object, carefully chooses between replication and EC, as well as among various DC placements, to minimize overall costs. To handle workload dynamism, LEGOStore employs a novel agile reconfiguration protocol. Our evaluation using a LEGOStore prototype spanning 9 Google Cloud Platform DCs demonstrates the efficacy of our ideas. We observe cost savings ranging from moderate (5-20%) to significant (60%) over baselines representing the state of the art while meeting tail latency SLOs. Our reconfiguration protocol is able to transition key placements in 3 to 4 inter-DC RTTs (< 1s in our experiments), allowing for agile adaptation to dynamic conditions.
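
    The optimizer's actual formulation is not in the abstract; the sketch below shows only the storage-overhead intuition behind the replication-vs-EC choice (3x for triple replication versus (k+m)/k for erasure coding), with made-up prices and code parameters, and it ignores the latency and placement dimensions the real framework optimizes.

    ```python
    # Toy per-object cost comparison between full replication and erasure
    # coding (EC). Prices and the (k, m) code are illustrative assumptions.
    def storage_cost(object_gb, scheme, price_per_gb=0.02):
        if scheme == "replication":
            n = 3                          # n full copies
            return n * object_gb * price_per_gb
        k, m = 4, 2                        # EC: k data + m parity fragments
        return (k + m) / k * object_gb * price_per_gb

    for scheme in ("replication", "ec"):
        print(scheme, storage_cost(10.0, scheme))
    # Replication stores 3x the object; this EC config stores 1.5x, but a
    # read must contact k fragments across DCs, which is why latency and
    # placement enter LEGOStore's actual choice.
    ```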