387 research outputs found
Continuation-Passing C: compiling threads to events through continuations
In this paper, we introduce Continuation Passing C (CPC), a programming
language for concurrent systems in which native and cooperative threads are
unified and presented to the programmer as a single abstraction. The CPC
compiler uses a compilation technique, based on the CPS transform, that yields
efficient code and an extremely lightweight representation for contexts. We
provide a proof of the correctness of our compilation scheme. We show in
particular that lambda-lifting, a common compilation technique for functional
languages, is also correct in an imperative language like C, under some
conditions enforced by the CPC compiler. The current CPC compiler is mature
enough to write substantial programs such as Hekate, a highly concurrent
BitTorrent seeder. Our benchmark results show that CPC is as efficient, while
using significantly less space, as the most efficient thread libraries
available.Comment: Higher-Order and Symbolic Computation (2012). arXiv admin note:
substantial text overlap with arXiv:1202.324
CPC: programming with a massive number of lightweight threads
Threads are a convenient and modular abstraction for writing concurrent
programs, but often fairly expensive. The standard alternative to threads,
event-loop programming, allows much lighter units of concurrency, but leads to
code that is difficult to write and even harder to understand. Continuation
Passing C (CPC) is a translator that converts a program written in threaded
style into a program written with events and native system threads, at the
programmer's choice. Together with two undergraduate students, we taught
ourselves how to program in CPC by writing Hekate, a massively concurrent
network server designed to efficiently handle tens of thousands of
simultaneously connected peers. In this paper, we describe a number of
programming idioms that we learnt while writing Hekate; while some of these
idioms are specific to CPC, many should be applicable to other programming
systems with sufficiently cheap threads.Comment: To appear in PLACES'1
Stackless Multi-Threading for Embedded Systems
Programming support for multi-threaded applications on embedded microcontroller platforms has attracted a considerable amount of research attention in the recent years. This paper is focused on this problem, and presents UnStacked C, a source-to-source transformation that can translate multithreaded programs written in C into stackless continuations. The transformation can support legacy code by not requiring any changes to application code; only the underlying threading library needs modifications. We describe the details of UnStacked C in the context of the TinyOS operating system for wireless sensor network applications. We present a modified implementation of the TOSThreads library for TinyOS, and show how existing applications programmed using TOSThreads can be automatically transformed to use stackless threads with only modifications in the build process. By eliminating the need to allocate individual thread stacks and by supporting lazy thread preemption, UnStacked C enables a considerable saving of memory used and power consumed, respectively
Stackless Multi-Threading for Embedded Systems
Programming support for multi-threaded applications on embedded microcontroller platforms has attracted a considerable amount of research attention in the recent years. This paper is focused on this problem, and presents UnStacked C, a source-to-source transformation that can translate multithreaded programs written in C into stackless continuations. The transformation can support legacy code by not requiring any changes to application code; only the underlying threading library needs modifications. We describe the details of UnStacked C in the context of the TinyOS operating system for wireless sensor network applications. We present a modified implementation of the TOSThreads library for TinyOS, and show how existing applications programmed using TOSThreads can be automatically transformed to use stackless threads with only modifications in the build process. By eliminating the need to allocate individual thread stacks and by supporting lazy thread preemption, UnStacked C enables a considerable saving of memory used and power consumed, respectively
Towards Implicit Parallel Programming for Systems
Multi-core processors require a program to be decomposable into independent parts that can execute in parallel in order to scale performance with the number of cores. But parallel programming is hard especially when the program requires state, which many system programs use for optimization, such as for example a cache to reduce disk I/O. Most prevalent parallel programming models do not support a notion of state and require the programmer to synchronize state access manually, i.e., outside the realms of an associated optimizing compiler. This prevents the compiler to introduce parallelism automatically and requires the programmer to optimize the program manually.
In this dissertation, we propose a programming language/compiler co-design to provide a new programming model for implicit parallel programming with state and a compiler that can optimize the program for a parallel execution.
We define the notion of a stateful function along with their composition and control structures. An example implementation of a highly scalable server shows that stateful functions smoothly integrate into existing programming language concepts, such as object-oriented programming and programming with structs. Our programming model is also highly practical and allows to gradually adapt existing code bases. As a case study, we implemented a new data processing core for the Hadoop Map/Reduce system to overcome existing performance bottlenecks. Our lambda-calculus-based compiler automatically extracts parallelism without changing the program's semantics. We added further domain-specific semantic-preserving transformations that reduce I/O calls for microservice programs. The runtime format of a program is a dataflow graph that can be executed in parallel, performs concurrent I/O and allows for non-blocking live updates
Towards Implicit Parallel Programming for Systems
Multi-core processors require a program to be decomposable into independent parts that can execute in parallel in order to scale performance with the number of cores. But parallel programming is hard especially when the program requires state, which many system programs use for optimization, such as for example a cache to reduce disk I/O. Most prevalent parallel programming models do not support a notion of state and require the programmer to synchronize state access manually, i.e., outside the realms of an associated optimizing compiler. This prevents the compiler to introduce parallelism automatically and requires the programmer to optimize the program manually.
In this dissertation, we propose a programming language/compiler co-design to provide a new programming model for implicit parallel programming with state and a compiler that can optimize the program for a parallel execution.
We define the notion of a stateful function along with their composition and control structures. An example implementation of a highly scalable server shows that stateful functions smoothly integrate into existing programming language concepts, such as object-oriented programming and programming with structs. Our programming model is also highly practical and allows to gradually adapt existing code bases. As a case study, we implemented a new data processing core for the Hadoop Map/Reduce system to overcome existing performance bottlenecks. Our lambda-calculus-based compiler automatically extracts parallelism without changing the program's semantics. We added further domain-specific semantic-preserving transformations that reduce I/O calls for microservice programs. The runtime format of a program is a dataflow graph that can be executed in parallel, performs concurrent I/O and allows for non-blocking live updates
- …