1,511 research outputs found

    A Compiler and Runtime Infrastructure for Automatic Program Distribution

    This paper presents the design and implementation of a compiler and runtime infrastructure for automatic program distribution. We are building a research infrastructure that enables experimentation with various program partitioning and mapping strategies and the study of automatic distribution's effect on resource consumption (e.g., CPU, memory, communication). Since many optimization techniques face conflicting optimization targets (e.g., memory and communication), we believe that it is important to be able to study their interaction. We present a set of techniques that enable flexible resource modeling and program distribution: dependence analysis, weighted graph partitioning, code and communication generation, and profiling. We have developed these ideas in the context of the Java language. We present in detail the design and implementation of each technique as part of our compiler and runtime infrastructure. We then evaluate our design and present preliminary experimental data for each component, as well as for the entire system.

    Garbage collection auto-tuning for Java MapReduce on Multi-Cores

    MapReduce has been widely accepted as a simple programming pattern that can form the basis for efficient, large-scale, distributed data processing. The success of the MapReduce pattern has led to a variety of implementations for different computational scenarios. In this paper we present MRJ, a MapReduce Java framework for multi-core architectures. We evaluate its scalability on a four-core, hyperthreaded Intel Core i7 processor, using a set of standard MapReduce benchmarks. We investigate the significant impact that Java runtime garbage collection has on the performance and scalability of MRJ. We propose the use of memory management auto-tuning techniques based on machine learning. With our auto-tuning approach, we achieve MRJ performance within 10% of optimal on 75% of our benchmark tests.
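
The MapReduce pattern mentioned in this abstract can be illustrated with a minimal word-count sketch. This is not MRJ and does not reflect its API: the function names (`map_chunk`, `reduce_counts`, `word_count`) and the `workers` parameter are hypothetical, and Python is used here only for brevity. In CPython the thread pool mainly shows the structure of the pattern, since the interpreter lock limits CPU-bound speedup.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def map_chunk(lines):
    """Map phase: count the words in one chunk of input lines."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def reduce_counts(partials):
    """Reduce phase: merge the per-chunk counts into a single result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def word_count(lines, workers=4):
    """Split the input into `workers` chunks, map them concurrently, then reduce."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    # Round-robin split; some chunks may be empty for short inputs.
    chunks = [lines[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce_counts(partials)
```

Each map task builds its own intermediate `Counter`, so the number and size of these short-lived objects grows with the input. In a Java framework, allocation of this kind is what puts pressure on the garbage collector.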

    Towards co-designed optimizations in parallel frameworks: A MapReduce case study

    The explosion of Big Data was followed by the proliferation of numerous complex parallel software stacks whose aim is to tackle the challenges of the data deluge. A drawback of such a multi-layered hierarchical deployment is the inability to maintain and delegate vital semantic information between layers in the stack. Software abstractions increase the semantic distance between an application and its generated code. However, parallel software frameworks contain inherent semantic information that general-purpose compilers are not designed to exploit. This paper presents a case study demonstrating how the specific semantic information of the MapReduce paradigm can be exploited on multicore architectures. MR4J has been implemented in Java and evaluated against hand-optimized C and C++ equivalents. The initial observed results led to the design of a semantically aware optimizer that runs automatically without requiring modification to application code. The optimizer speeds up the execution of MR4J by up to 2.0x. The introduced optimization not only improves the performance of the generated code during the map phase but also reduces the pressure on the garbage collector. This demonstrates how semantic information can be harnessed without sacrificing sound software engineering practices when using parallel software frameworks.

    Relating Static and Dynamic Measurements for the Java Virtual Machine Instruction Set

    It has previously been noted that, for conventional machine code, there is a strong relationship between static and dynamic code measurements. One of the goals of this paper is to examine whether this same relationship holds for Java programs at the bytecode level. To this end, the hypothesis of a linear correlation between static and dynamic frequencies was investigated using Pearson’s correlation coefficient. Programs from the Java Grande and SPEC benchmark suites were used in the analysis.
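
As a rough illustration of the test described in this abstract, Pearson's correlation coefficient between paired static and dynamic instruction frequencies can be computed as below. This is a minimal sketch, not the paper's tooling: the function name `pearson` and the frequency values are hypothetical and invented purely for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson's r for two equal-length sequences of paired measurements."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical relative frequencies for four bytecode instructions
# (made-up numbers, not taken from the paper).
static_freq = [0.12, 0.08, 0.05, 0.02]   # share of instructions in class files
dynamic_freq = [0.15, 0.06, 0.07, 0.01]  # share of instructions executed
r = pearson(static_freq, dynamic_freq)
```

A value of `r` close to 1 would support the hypothesis of a linear relationship between static and dynamic frequencies, while a value near 0 would not.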

    Java instrumentation suite: accurate analysis of Java threaded applications

    The rapid maturation of Java technology is encouraging the development of portable applications in the Java language. Because threads are an important part of the definition of the Java language, their use is becoming commonplace when programming this kind of application. Understanding and tuning threaded applications requires effective tools for detecting possible performance bottlenecks. Most of the available tools summarize the behavior of the application globally, offering metrics that are sufficient to optimize the performance of the application in some cases. However, they do not enable a detailed analysis of the behavior of the application; this requires tools that perform exhaustive, time-aware tracing at a fine-grained level. This paper presents the Java Instrumentation Suite (JIS), a set of tools designed to instrument Java threaded applications using dynamic code interposition (avoiding the instrumentation and recompilation of the source code and/or the Java Virtual Machine (JVM)). Our initial implementation targets the JVM version 3.1.1 on top of the SGI Origin2000 parallel platform. The paper describes the design of JIS and highlights some of its main functionalities, which are specifically designed to understand the behavior of Java threaded applications and the JVM itself, and to speed them up.

    Introducing concurrency in sequential Java via laws

    Nowadays multi-core processors can be found everywhere. It is well known that one way of improving performance is parallelization. In this paper we propose a parallelization strategy for Java using algebraic laws. We perform an experiment with two benchmarks and show that our strategy produces a gain similar to that of a specialized parallel version provided by the Java Grande Benchmark (JGB).