4,055 research outputs found

    Mesmerizer: A Effective Tool for a Complete Peer-to-Peer Software Development Life-cycle

    Get PDF
    In this paper we present what are, in our experience, the best practices in Peer-To-Peer(P2P) application development and how we combined them in a middleware platform called Mesmerizer. We explain how simulation is an integral part of the development process and not just an assessment tool. We then present our component-based event-driven framework for P2P application development, which can be used to execute multiple instances of the same application in a strictly controlled manner over an emulated network layer for simulation/testing, or a single application in a concurrent environment for deployment purpose. We highlight modeling aspects that are of critical importance for designing and testing P2P applications, e.g. the emulation of Network Address Translation and bandwidth dynamics. We show how our simulator scales when emulating low-level bandwidth characteristics of thousands of concurrent peers while preserving a good degree of accuracy compared to a packet-level simulator

    MaSiF: Machine learning guided auto-tuning of parallel skeletons

    Get PDF

    Locality-Aware Concurrency Platforms

    Get PDF
    Modern computing systems from all domains are becoming increasingly more parallel. Manufacturers are taking advantage of the increasing number of available transistors by packaging more and more computing resources together on a single chip or within a single system. These platforms generally contain many levels of private and shared caches in addition to physically distributed main memory. Therefore, some memory is more expensive to access than other and high-performance software must consider memory locality as one of the first level considerations. Memory locality is often difficult for application developers to consider directly, however, since many of these NUMA affects are invisible to the application programmer and only show up in low performance. Moreover, on parallel platforms, the performance depends on both locality and load balance and these two metrics are often at odds with each other. Therefore, directly considering locality and load balance at the application level may make the application much more complex to program. In this work, we develop locality-conscious concurrency platforms for multiple different structured parallel programming models, including streaming applications, task-graphs and parallel for loops. In all of this work, the idea is to minimally disrupt the application programming model so that the application developer is either unimpacted or must only provide high-level hints to the runtime system. The runtime system then schedules the application to provide good locality of access while, at the same time also providing good load balance. In particular, we address cache locality for streaming applications through static partitioning and developed an extensible platform to execute partitioned streaming applications. For task-graphs, we extend a task-graph scheduling library to guide scheduling decisions towards better NUMA locality with the help of user-provided locality hints. CilkPlus parallel for loops utilize a randomized dynamic scheduler to distribute work which, in many loop based applications, results in poor locality at all levels of the memory hierarchy. We address this issue with a novel parallel for loop implementation that can get good cache and NUMA locality while providing support to maintain good load balance dynamically

    A Configurable Transport Layer for CAF

    Full text link
    The message-driven nature of actors lays a foundation for developing scalable and distributed software. While the actor itself has been thoroughly modeled, the message passing layer lacks a common definition. Properties and guarantees of message exchange often shift with implementations and contexts. This adds complexity to the development process, limits portability, and removes transparency from distributed actor systems. In this work, we examine actor communication, focusing on the implementation and runtime costs of reliable and ordered delivery. Both guarantees are often based on TCP for remote messaging, which mixes network transport with the semantics of messaging. However, the choice of transport may follow different constraints and is often governed by deployment. As a first step towards re-architecting actor-to-actor communication, we decouple the messaging guarantees from the transport protocol. We validate our approach by redesigning the network stack of the C++ Actor Framework (CAF) so that it allows to combine an arbitrary transport protocol with additional functions for remote messaging. An evaluation quantifies the cost of composability and the impact of individual layers on the entire stack

    Parallel processing of streaming media on heterogeneous hosts using work stealing

    Get PDF
    Master'sMASTER OF SCIENC
    corecore