110,308 research outputs found
Recommended from our members
On the conditions for efficient interoperability with threads: An experience with PGAS languages using Cray communication domains
Today's high performance systems are typically built from shared memory nodes connected by a high speed network. That architecture, combined with the trend towards less memory per core, encourages programmers to use a mixture of message passing and multithreaded programming. Unfortunately, the advantages of using threads for in-node programming are hindered by their inability to efficiently communicate between nodes. In this work, we identify some of the performance problems that arise in such hybrid programming environments and characterize conditions needed to achieve high communication performance for multiple threads: addressability of targets, separability of communication paths, and full direct reachability to targets. Using the GASNet communication layer on the Cray XC30 as our experimental platform, we show how to satisfy these conditions. We also discuss how satisfying these conditions is influenced by the communication abstraction, implementation constraints, and the interconnect messaging capabilities. To evaluate these ideas, we compare the communication performance of a thread-based node runtime to a process-based runtime. Without our GASNet extensions, thread communication is significantly slower than processes - up to 21x slower. Once the implementation is modified to address each of our conditions, the two runtimes have comparable communication performance. This allows programmers to more easily mix models like OpenMP, CILK, or pthreads with a GASNet-based model like UPC, with the associated performance, convenience and interoperability advantages that come from using threads within a node. © 2014 ACM
libcppa - Designing an Actor Semantic for C++11
Parallel hardware makes concurrency mandatory for efficient program
execution. However, writing concurrent software is both challenging and
error-prone. C++11 provides standard facilities for multiprogramming, such as
atomic operations with acquire/release semantics and RAII mutex locking, but
these primitives remain too low-level. Using them both correctly and
efficiently still requires expert knowledge and hand-crafting. The actor model
replaces implicit communication by sharing with an explicit message passing
mechanism. It applies to concurrency as well as distribution, and a lightweight
actor model implementation that schedules all actors in a properly
pre-dimensioned thread pool can outperform equivalent thread-based
applications. However, the actor model did not enter the domain of native
programming languages yet besides vendor-specific island solutions. With the
open source library libcppa, we want to combine the ability to build reliable
and distributed systems provided by the actor model with the performance and
resource-efficiency of C++11.Comment: 10 page
A comprehensive approach in performance evaluation for modernreal-time operating systems
In real-time computing the accurate characterization of the performance and determinism that a particular real-time operating system/hardware combination can provide for real-time applications is essential. This issue is not properly addressed by existing performance metrics mainly due to the lack of completeness and generalization. In this paper we present a set of comprehensive, easy-to-implement and useful metrics covering three basic real-time operating system features: response to external events, intertask synchronization and resource sharing, and intertask data transferring. The evaluation of real-time operating systems using a set of fine-grained metrics is fundamental to guarantee that we can reach the required determinism in real-world applications.Publicad
Revisiting Actor Programming in C++
The actor model of computation has gained significant popularity over the
last decade. Its high level of abstraction makes it appealing for concurrent
applications in parallel and distributed systems. However, designing a
real-world actor framework that subsumes full scalability, strong reliability,
and high resource efficiency requires many conceptual and algorithmic additives
to the original model.
In this paper, we report on designing and building CAF, the "C++ Actor
Framework". CAF targets at providing a concurrent and distributed native
environment for scaling up to very large, high-performance applications, and
equally well down to small constrained systems. We present the key
specifications and design concepts---in particular a message-transparent
architecture, type-safe message interfaces, and pattern matching
facilities---that make native actors a viable approach for many robust,
elastic, and highly distributed developments. We demonstrate the feasibility of
CAF in three scenarios: first for elastic, upscaling environments, second for
including heterogeneous hardware like GPGPUs, and third for distributed runtime
systems. Extensive performance evaluations indicate ideal runtime behaviour for
up to 64 cores at very low memory footprint, or in the presence of GPUs. In
these tests, CAF continuously outperforms the competing actor environments
Erlang, Charm++, SalsaLite, Scala, ActorFoundry, and even the OpenMPI.Comment: 33 page
A Configurable Transport Layer for CAF
The message-driven nature of actors lays a foundation for developing scalable
and distributed software. While the actor itself has been thoroughly modeled,
the message passing layer lacks a common definition. Properties and guarantees
of message exchange often shift with implementations and contexts. This adds
complexity to the development process, limits portability, and removes
transparency from distributed actor systems.
In this work, we examine actor communication, focusing on the implementation
and runtime costs of reliable and ordered delivery. Both guarantees are often
based on TCP for remote messaging, which mixes network transport with the
semantics of messaging. However, the choice of transport may follow different
constraints and is often governed by deployment. As a first step towards
re-architecting actor-to-actor communication, we decouple the messaging
guarantees from the transport protocol. We validate our approach by redesigning
the network stack of the C++ Actor Framework (CAF) so that it allows to combine
an arbitrary transport protocol with additional functions for remote messaging.
An evaluation quantifies the cost of composability and the impact of individual
layers on the entire stack
Recommended from our members
Computing infrastructure issues in distributed communications systems : a survey of operating system transport system architectures
The performance of distributed applications (such as file transfer, remote login, tele-conferencing, full-motion video, and scientific visualization) is influenced by several factors that interact in complex ways. In particular, application performance is significantly affected both by communication infrastructure factors and computing infrastructure factors. Several communication infrastructure factors include channel speed, bit-error rate, and congestion at intermediate switching nodes. Computing infrastructure factors include (among other things) both protocol processing activities (such as connection management, flow control, error detection, and retransmission) and general operating system factors (such as memory latency, CPU speed, interrupt and context switching overhead, process architecture, and message buffering). Due to a several orders of magnitude increase in network channel speed and an increase in application diversity, performance bottlenecks are shifting from the network factors to the transport system factors.This paper defines an abstraction called an "Operating System Transport System Architecture" (OSTSA) that is used to classify the major components and services in the computing infrastructure. End-to-end network protocols such as TCP, TP4, VMTP, XTP, and Delta-t typically run on general-purpose computers, where they utilize various operating system resources such as processors, virtual memory, and network controllers. The OSTSA provides services that integrate these resources to support distributed applications running on local and wide area networks.A taxonomy is presented to evaluate OSTSAs in terms of their support for protocol processing activities. We use this taxonomy to compare and contrast five general-purpose commercial and experimental operating systems including System V UNIX, BSD UNIX, the x-kernel, Choices, and Xinu
Assessing load-sharing within optimistic simulation platforms
The advent of multi-core machines has lead to the need for revising the architecture of modern simulation platforms. One recent proposal we made attempted to explore the viability of load-sharing for optimistic simulators run on top of these types of machines. In this article, we provide an extensive experimental study for an assessment of the effects on run-time dynamics by a load-sharing architecture that has been implemented within the ROOT-Sim package, namely an open source simulation platform adhering to the optimistic synchronization paradigm. This experimental study is essentially aimed at evaluating possible sources of overheads when supporting load-sharing. It has been based on differentiated workloads allowing us to generate different execution profiles in terms of, e.g., granularity/locality of the simulation events. © 2012 IEEE
- âŠ