387 research outputs found

    Continuation-Passing C: compiling threads to events through continuations

    In this paper, we introduce Continuation Passing C (CPC), a programming language for concurrent systems in which native and cooperative threads are unified and presented to the programmer as a single abstraction. The CPC compiler uses a compilation technique, based on the CPS transform, that yields efficient code and an extremely lightweight representation for contexts. We provide a proof of the correctness of our compilation scheme. We show in particular that lambda-lifting, a common compilation technique for functional languages, is also correct in an imperative language like C, under some conditions enforced by the CPC compiler. The current CPC compiler is mature enough to write substantial programs such as Hekate, a highly concurrent BitTorrent seeder. Our benchmark results show that CPC is as efficient as the most efficient thread libraries available while using significantly less space.
    Comment: Higher-Order and Symbolic Computation (2012). arXiv admin note: substantial text overlap with arXiv:1202.324
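
    The core idea, compiling blocking thread-style code into event-driven code by reifying "the rest of the function" as a continuation, can be illustrated in plain C. The sketch below is not CPC's actual output; the event-loop hook (register_read_callback) and the context struct are hypothetical, chosen only to show how a blocking loop is split at its blocking point into a callback plus a heap-allocated continuation.

```c
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical event-loop hook: invoke cb(ctx) once fd becomes readable. */
extern void register_read_callback(int fd, void (*cb)(void *), void *ctx);

/* Thread-style original (for reference):
 *
 *     void echo(int fd) {
 *         char buf[512];
 *         ssize_t n;
 *         while ((n = read(fd, buf, sizeof buf)) > 0)   // blocks here
 *             write(fd, buf, n);
 *     }
 *
 * Event-style version: the loop is split at the blocking read.  The live
 * variables (just fd here) become a heap-allocated continuation context,
 * and the code after the blocking point becomes a callback.
 */
struct echo_cont {
    int fd;
};

static void echo_step(void *arg) {
    struct echo_cont *k = arg;
    char buf[512];
    ssize_t n = read(k->fd, buf, sizeof buf);         /* fd is ready: no blocking */
    if (n > 0) {
        write(k->fd, buf, n);
        register_read_callback(k->fd, echo_step, k);  /* re-arm: the loop */
    } else {
        close(k->fd);
        free(k);                                      /* continuation is dead */
    }
}

void echo_start(int fd) {
    struct echo_cont *k = malloc(sizeof *k);
    k->fd = fd;
    register_read_callback(fd, echo_step, k);
}
```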

    CPC: programming with a massive number of lightweight threads

    Threads are a convenient and modular abstraction for writing concurrent programs, but they are often fairly expensive. The standard alternative to threads, event-loop programming, allows much lighter units of concurrency, but leads to code that is difficult to write and even harder to understand. Continuation Passing C (CPC) is a translator that converts a program written in threaded style into a program written with events and native system threads, at the programmer's choice. Together with two undergraduate students, we taught ourselves how to program in CPC by writing Hekate, a massively concurrent network server designed to efficiently handle tens of thousands of simultaneously connected peers. In this paper, we describe a number of programming idioms that we learnt while writing Hekate; while some of these idioms are specific to CPC, many should be applicable to other programming systems with sufficiently cheap threads.
    Comment: To appear in PLACES'1
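
    The kind of idiom the paper discusses, one cheap cooperative thread per connected peer, written in direct style, might look like the sketch below. This is CPC source, not plain C, so it needs the CPC translator; the cps qualifier and the cpc_spawn / cpc_io_wait primitives and the CPC_IO_IN constant are given here as recalled from the CPC papers, so treat the exact names and signatures as assumptions rather than a definitive API reference.

```c
#include <unistd.h>

/* One lightweight CPC thread per peer.  The cps qualifier marks functions
 * that the translator may convert to continuation-passing style. */
cps void handle_peer(int fd)
{
    char buf[512];
    ssize_t n;
    for (;;) {
        cpc_io_wait(fd, CPC_IO_IN);     /* cooperative: yields until fd is readable */
        n = read(fd, buf, sizeof buf);  /* fd is ready, so this does not block      */
        if (n <= 0)
            break;
        write(fd, buf, n);
    }
    close(fd);
}

void on_accept(int fd)
{
    /* Contexts are heap-allocated continuations rather than full stacks,
     * so tens of thousands of these threads are affordable. */
    cpc_spawn handle_peer(fd);
}
```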

    Stackless Multi-Threading for Embedded Systems

    Programming support for multi-threaded applications on embedded microcontroller platforms has attracted a considerable amount of research attention in recent years. This paper focuses on this problem and presents UnStacked C, a source-to-source transformation that can translate multithreaded programs written in C into stackless continuations. The transformation can support legacy code by not requiring any changes to application code; only the underlying threading library needs modifications. We describe the details of UnStacked C in the context of the TinyOS operating system for wireless sensor network applications. We present a modified implementation of the TOSThreads library for TinyOS, and show how existing applications programmed using TOSThreads can be automatically transformed to use stackless threads with modifications only to the build process. By eliminating the need to allocate individual thread stacks and by supporting lazy thread preemption, UnStacked C enables considerable savings in memory usage and power consumption, respectively.
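
    UnStacked C's actual translation is described in the paper; as a rough illustration of what a stackless continuation means, the plain-C sketch below hand-applies the same general idea (in the style of protothreads): locals are hoisted into a frame struct, the function returns at each blocking point, and it is later resumed by a switch on a stored resume label. The HAL functions (led_toggle, sleep_request_ms) and the frame layout are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical embedded HAL calls, used only for illustration. */
extern void led_toggle(void);
extern void sleep_request_ms(int ms);   /* asks the scheduler for a wake-up */

/* Thread-style original (for reference):
 *
 *     void blink(void) {
 *         for (int i = 0; i < 10; i++) { led_toggle(); sleep_ms(500); }
 *     }
 *
 * Stackless version: the local variable and the resume point live in a
 * small frame struct instead of a per-thread stack.  The function returns
 * at each blocking point and is re-entered later; the switch jumps back
 * to where it left off (the protothreads / Duff's-device trick), and the
 * skipped loop initializer is harmless because f->i persists in the frame.
 */
struct blink_frame {
    int resume;   /* 0 = start, 1 = resume after the sleep */
    int i;        /* hoisted local variable                */
};

/* Returns true while the "thread" still has work to do. */
bool blink_step(struct blink_frame *f)
{
    switch (f->resume) {
    case 0:
        for (f->i = 0; f->i < 10; f->i++) {
            led_toggle();
            sleep_request_ms(500);
            f->resume = 1;
            return true;      /* yield: no stack needs to be saved */
    case 1:;                  /* re-entry point after the sleep    */
        }
    }
    f->resume = 0;
    return false;             /* finished */
}
```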

    Towards Implicit Parallel Programming for Systems

    Multi-core processors require a program to be decomposable into independent parts that can execute in parallel in order to scale performance with the number of cores. But parallel programming is hard, especially when the program requires state, which many system programs use for optimization, such as a cache to reduce disk I/O. Most prevalent parallel programming models do not support a notion of state and require the programmer to synchronize state access manually, i.e., outside the scope of an associated optimizing compiler. This prevents the compiler from introducing parallelism automatically and requires the programmer to optimize the program manually. In this dissertation, we propose a programming language/compiler co-design to provide a new programming model for implicit parallel programming with state and a compiler that can optimize the program for parallel execution. We define the notion of a stateful function along with composition and control structures for such functions. An example implementation of a highly scalable server shows that stateful functions integrate smoothly with existing programming language concepts, such as object-oriented programming and programming with structs. Our programming model is also highly practical and allows existing code bases to be adapted gradually. As a case study, we implemented a new data processing core for the Hadoop Map/Reduce system to overcome existing performance bottlenecks. Our lambda-calculus-based compiler automatically extracts parallelism without changing the program's semantics. We added further domain-specific, semantic-preserving transformations that reduce I/O calls for microservice programs. The runtime format of a program is a dataflow graph that can be executed in parallel, performs concurrent I/O, and allows for non-blocking live updates.
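
    The notion of a stateful function, a function packaged with its own private state so that a compiler or runtime can reason about and parallelize compositions of such functions, can be sketched roughly in C, although the dissertation itself works at the level of a language/compiler co-design rather than a C API. Everything below (the struct layout, the names cache_lookup, fetch_from_disk, handle_request) is a hypothetical illustration of the concept, not the dissertation's implementation.

```c
#include <stdio.h>
#include <string.h>

/* A "stateful function" here is a piece of private state plus the only
 * function allowed to touch it.  Because the state is encapsulated, a
 * runtime that wires such functions into a dataflow graph can run
 * distinct instances in parallel without programmer-written locks. */
#define CACHE_SLOTS 4

struct cache_state {                 /* private state of one instance */
    char key[CACHE_SLOTS][32];
    int  value[CACHE_SLOTS];
    int  next;                       /* round-robin eviction cursor   */
};

extern int fetch_from_disk(const char *key);   /* hypothetical slow path */

/* The stateful function: a lookup memoized over its own state. */
int cache_lookup(struct cache_state *st, const char *key)
{
    for (int i = 0; i < CACHE_SLOTS; i++)
        if (strcmp(st->key[i], key) == 0)
            return st->value[i];               /* hit: no I/O */

    int v = fetch_from_disk(key);              /* miss: do the I/O once */
    snprintf(st->key[st->next], sizeof st->key[st->next], "%s", key);
    st->value[st->next] = v;
    st->next = (st->next + 1) % CACHE_SLOTS;
    return v;
}

/* A composition of two instances: each instance owns its state, so a
 * dataflow runtime may place the two lookups on different cores. */
int handle_request(struct cache_state *meta, struct cache_state *data,
                   const char *name)
{
    int id = cache_lookup(meta, name);
    char blob_key[64];
    snprintf(blob_key, sizeof blob_key, "blob:%d", id);
    return cache_lookup(data, blob_key);
}
```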
