    PlinyCompute: A Platform for High-Performance, Distributed, Data-Intensive Tool Development

    This paper describes PlinyCompute, a system for the development of high-performance, data-intensive, distributed computing tools and libraries. In the large, PlinyCompute presents the programmer with a very high-level, declarative interface, relying on automatic, relational-database-style optimization to figure out how to stage distributed computations. However, in the small, PlinyCompute presents the capable systems programmer with a persistent object data model and API (the "PC object model") and an associated memory management system that has been designed from the ground up for high-performance, distributed, data-intensive computing. This contrasts with most other Big Data systems, which are constructed on top of the Java Virtual Machine (JVM), and hence must at least partially cede performance-critical concerns such as memory management (including layout and de/allocation) and virtual method/function dispatch to the JVM. This hybrid approach (declarative in the large, trusting the programmer's ability to use the PC object model efficiently in the small) results in a system that is ideal for the development of reusable, data-intensive tools and libraries. Through extensive benchmarking, we show that implementing complex object manipulation and non-trivial, library-style computations on top of PlinyCompute can result in a speedup of 2x to more than 50x compared to equivalent implementations on Spark. Comment: 48 pages, including references and Appendix

    Extending and Implementing the Self-adaptive Virtual Processor for Distributed Memory Architectures

    Many-core architectures of the future are likely to have distributed memory organizations and need fine-grained concurrency management to be used effectively. The Self-adaptive Virtual Processor (SVP) is an abstract concurrent programming model which can provide this, but the model and its current implementations assume a single-address-space shared memory. We investigate and extend SVP to handle distributed environments, and discuss a prototype SVP implementation which transparently supports execution on heterogeneous distributed memory clusters over TCP/IP connections, while retaining the original SVP programming model.

    Experiences Implementing Efficient Java Thread Serialization, Mobility and Persistence

    Today, mobility and persistence are important aspects of distributed computing. They have many uses, such as load balancing, fault tolerance and dynamic reconfiguration of applications. In this context, Java provides many useful mechanisms for the mobility of code via dynamic class loading, and the mobility or persistence of data via object serialization. However, Java does not provide any mechanism for the mobility/persistence of computation (i.e., threads). We designed and implemented a new mechanism that is used to build thread mobility or thread persistence. With it, a running Java thread can, at an arbitrary state of its execution, migrate to a remote machine where it resumes its execution, or be checkpointed on disk for possible subsequent recovery. With our services, migrating a thread is performed simply by calling our migration primitive, and checkpointing/recovering a thread is performed by calling the corresponding checkpoint and recovery primitives. Several projects have recently addressed the issue of Java thread serialization, e.g., Sumatra, Wasp, JavaGo, Brakes, JavaGoX, Merpati. Some of them have attempted to minimize the overhead incurred by the thread serialization mechanism on thread performance, but none of them has been able to completely avoid this overhead. We propose a generic Java thread serialization mechanism that does not impose any performance overhead on serialized threads. This is achieved through the use of type inference and dynamic de-optimization techniques. In this paper, we describe the design and implementation details of our thread serialization prototype in Sun Microsystems' JDK. We report on experiments conducted with our prototype, present a comparative performance evaluation of the main thread serialization techniques, and confirm the elimination of the performance overhead with our thread serialization mechanism. Document type: Report