2,867 research outputs found

    Garbage collection auto-tuning for Java MapReduce on Multi-Cores

    MapReduce has been widely accepted as a simple programming pattern that can form the basis for efficient, large-scale, distributed data processing. The success of the MapReduce pattern has led to a variety of implementations for different computational scenarios. In this paper we present MRJ, a MapReduce Java framework for multi-core architectures. We evaluate its scalability on a four-core, hyperthreaded Intel Core i7 processor, using a set of standard MapReduce benchmarks. We investigate the significant impact that Java runtime garbage collection has on the performance and scalability of MRJ. We propose the use of memory management auto-tuning techniques based on machine learning. With our auto-tuning approach, we are able to achieve MRJ performance within 10% of optimal on 75% of our benchmark tests

    Formal Derivation of Concurrent Garbage Collectors

    Concurrent garbage collectors are notoriously difficult to implement correctly. Previous approaches to the issue of producing correct collectors have mainly been based on posit-and-prove verification or on the application of domain-specific templates and transformations. We show how to derive the upper reaches of a family of concurrent garbage collectors by refinement from a formal specification, emphasizing the application of domain-independent design theories and transformations. A key contribution is an extension to the classical lattice-theoretic fixpoint theorems to account for the dynamics of concurrent mutation and collection. Comment: 38 pages, 21 figures. The short version of this paper appeared in the Proceedings of MPC 201

    Subheap-Augmented Garbage Collection

    Automated memory management avoids the tedium and danger of manual techniques. However, as no programmer input is required, no widely available interface exists to permit principled control over sometimes unacceptable performance costs. This dissertation explores the idea that performance-oriented languages should give programmers greater control over where and when the garbage collector (GC) expends effort. We describe an interface and implementation to expose heap partitioning and collection decisions without compromising type safety. We show that our interface allows the programmer to encode a form of reference counting using Hayes' notion of key objects. Preliminary experimental data suggest that our proposed mechanism can avoid the high overheads suffered by tracing collectors in some scenarios, especially with tight heaps. However, for other applications, the costs of applying subheaps (in human effort and runtime overheads) remain daunting

    Da'wah bi al-Hal in Empowering Campus-Assisted Community through Waste Bank Management

    As an institution of education and an agent of social change, the Faculty of Da'wah, IAIN Salatiga, has Islamic religious characteristics. These characteristics affirm its missionary effort as part of its social responsibility. The Da'wah Faculty pursues empowerment by rolling out a waste bank management program. This program applies da'wah bi al-hal (preaching through action) to build assisted communities that are empowered and independent in processing garbage. The program not only provides clean and healthy communities but also increases the community's ability to gain environmental benefits and improve its economy. This study analyzes the implementation of the da'wah movement through waste bank management, from assessment, recycling management, and construction to evaluation. The study uses a qualitative method, Participatory Action Research (PAR), to bring about change and benefits for the community, with empowerment theory as the main framework of thought in carrying out da'wah bi al-hal. The results of the participatory action research show that the Da'wah Faculty's preaching through action in the assisted communities has increased participation, trust, and cooperative relationships among them in managing the waste bank

    Three pitfalls in Java performance evaluation

    The Java programming language has seen remarkable growth over the last decade. This is partially due to the infrastructure required to run Java applications on general purpose microprocessors: a Java virtual machine (VM). The VM ensures that Java applications are portable across different hardware platforms, because it shelters the applications from the underlying system. Hence the motto "write once, run (almost) anywhere". Java applications are compiled to an intermediate form, called bytecode, and consist of a number of so-called class files. The virtual machine takes care of class loading, interpreting or compiling the bytecode to the native code of the underlying hardware platform, thread scheduling, garbage collection, etc. As such, during the execution of a Java application, the VM regularly intervenes to take care of housekeeping tasks and to optimise the application as it is executing. Furthermore, the specific implementation details of most virtual machines introduce non-deterministic behaviour, not into the semantic part of the execution, but rather into the lower-level execution. For example, to bring a Java application up to competitive speed with classical compiled programs written in languages such as C, the virtual machine needs to optimise Java bytecode. To limit the execution overhead, most virtual machines use a time sampling mechanism to determine the hot methods in the application. This introduces non-determinism, as over several runs, the methods are not always optimised at the same moment, nor is the set of optimised methods always the same. Other factors that introduce non-determinism are thread scheduling, garbage collection, etc. It is readily seen that performance analysis of Java applications is not as simple as it seems at first, and warrants closer inspection. In this dissertation we are mainly interested in the behaviour of Java applications and their performance. 
In the course of this work, we uncovered three major pitfalls that were not taken into account by researchers when analysing Java performance prior to this work. We will briefly summarise the main achievements presented in this dissertation. The first pitfall we present involves the interaction between the virtual machine, the application and the input to the application. The performance of short-running applications is shown to be mainly determined by the virtual machine. For longer-running applications, this influence decreases, but remains tangible. We use statistical analysis, such as principal components analysis and cluster analysis (K-means and hierarchical clustering), to demonstrate and clarify the pitfall. By means of a large number of performance characteristics measured using hardware performance counters, five virtual machines and fourteen benchmarks with both a small and a large input size, we demonstrate that short-running workloads are primarily clustered by virtual machine. Even for long-running applications from the SPECjvm98 benchmark suite, the virtual machine still exerts a large influence on the observed behaviour at the microarchitectural level. This work has shown the need for both larger and longer-running benchmarks than were available prior to it (a need partially met by the introduction of the DaCapo benchmark suite), as well as for careful consideration when setting up an experiment to avoid measuring the virtual machine rather than the benchmark. Prior to this work, people were quite often using simulation with short-running applications (to save time) for exploring Java performance. The second pitfall we uncover involves the analysis of performance numbers. 
During a survey of 50 papers published at premier conferences, such as OOPSLA, PLDI, CGO, ISMM and VEE, over the past seven years, we found that a variety of approaches are used, both for experimental design (for example, the input size, virtual machines, heap sizes, etc.) and, even more importantly, for data analysis (for example, using a best-of-3 performance number). New techniques are pitted against existing work using these prevalent approaches, and conclusions regarding their success in beating the prior state of the art are based upon them. Given that the execution of Java applications usually involves non-determinism in the virtual machine (for example, when determining which methods to optimise), it should come as no surprise that the lack of statistical rigour in these prevalent approaches leads to misleading or even incorrect conclusions. By this we mean that the conclusions are either not representative of what actually happens, or even contradict reality, as modelled in a statistical manner. To circumvent this pitfall, we propose a rigorous statistical approach that uses confidence intervals to both report and compare performance numbers. We also claim that sufficient experiments should be conducted to get a reliable performance measure. The non-determinism caused by the timer-based optimisation component in a virtual machine can be eliminated using so-called replay compilation. This technique records a compilation plan during a first execution or profiling run of the application. During a second execution, the application is iterated twice: once to compile and optimise all methods found in the compilation plan, and a second time to perform the actual measurement. 
It turns out, however, that the current practice of using either a single plan (corresponding to the best-performing profiling run) or a combined plan choosing the methods that were optimised in, say, more than half the profiling runs, is no match for using multiple plans. The variability observed in the plans themselves is too large to be captured by either of the current practices. Consequently, using multiple plans is definitely the better option. Moreover, this allows using a matched-pair approach in the data analysis, which results in tighter confidence intervals for the mean performance number. The third pitfall we examine is the use of global performance numbers when tuning either an application or a virtual machine. We show that Java applications exhibit phase behaviour at the method level. This means that instances of the same method behave more similarly to each other than to instances of other methods. A phase can then be identified as a set of sub-trees of the dynamic call-tree, with each sub-tree headed by the same method. We present a two-step algorithm that allows correlating hardware performance counter data in step 2 with the phases determined in step 1. The information obtained can be applied to show the programmer which methods perform worse than average, for example with respect to the number of cache misses they incur. In the dissertation, we pay particular attention to statistical rigour. For each pitfall, we use statistics to demonstrate its presence. Hopefully this work will encourage other researchers to use more rigour in their work as well
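
The statistical approach this abstract describes (confidence intervals for reporting, a matched-pair comparison across multiple compilation plans) can be illustrated with a minimal Python sketch. It is not the dissertation's code: the function names, the hard-coded 95% Student's t table, and the sample timings are all assumptions made for illustration.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t, indexed by degrees of freedom.
# Standard table values; the sketch only supports 2 to 11 measurements.
T_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
        6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}


def confidence_interval(samples):
    """Return (mean, half_width) of a 95% confidence interval for the mean."""
    dof = len(samples) - 1
    if dof not in T_95:
        raise ValueError("this sketch supports 2 to 11 measurements")
    mean = statistics.mean(samples)
    std_err = statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, T_95[dof] * std_err


def matched_pair_interval(baseline, candidate):
    """95% CI for the mean per-pair difference (candidate minus baseline).

    Pair i holds both configurations measured under the same condition,
    for example the same compilation plan.
    """
    if len(baseline) != len(candidate):
        raise ValueError("measurements must pair up one-to-one")
    diffs = [c - b for b, c in zip(baseline, candidate)]
    return confidence_interval(diffs)


if __name__ == "__main__":
    # Hypothetical run times in seconds, one pair per compilation plan.
    baseline = [10.2, 11.9, 9.8, 12.4, 10.9]
    candidate = [9.9, 11.5, 9.6, 12.0, 10.5]

    for name, runs in (("baseline", baseline), ("candidate", candidate)):
        mean, half = confidence_interval(runs)
        print(f"{name}: {mean:.2f} +/- {half:.2f} s")

    mean_diff, half_diff = matched_pair_interval(baseline, candidate)
    print(f"paired difference: {mean_diff:.2f} +/- {half_diff:.2f} s")
    # If the paired interval excludes zero, the difference is significant
    # at the 95% level under the usual normality assumption.
```

Pairing removes the plan-to-plan variation that both configurations share, so the interval on the differences is typically much narrower than either per-configuration interval.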

    Exploration of Dynamic Memory

    Since the advent of the Java programming language and the development of real-time garbage collection, Java has become an option for implementing real-time applications. The memory management choices provided by real-time garbage collection allow real-time Java developers to spend more of their time implementing real-time solutions. Unfortunately, the real-time community is not convinced that real-time garbage collection works in managing memory for Java applications deployed in a real-time context. Consequently, the Real-Time for Java Expert Group formulated the Real-Time Specification for Java (RTSJ) standards to make Java a real-time programming language. In lieu of garbage collection, the RTSJ proposed a new memory model called scopes, and a new type of thread called NoHeapRealTimeThread (NHRT), which takes advantage of scopes. While scopes and NHRTs promise predictable allocation and deallocation behaviors, no asymptotic studies have been conducted to investigate the costs associated with these technologies. To understand the costs associated with using these technologies to manage memory, computations and analyses of time and space overheads associated with scopes and NHRTs are presented. These results provide a framework for comparing the RTSJ's memory management model with real-time garbage collection. Another facet of this research concerns the optimization of novel approaches to garbage collection on multiprocessor systems. Such approaches yield features that are suitable for real-time systems. Although multiprocessor, concurrent garbage collection is not the same as real-time garbage collection, advancements in multiprocessor concurrent garbage collection have demonstrated the feasibility of building low latency multiprocessor real-time garbage collectors. In the nineteen-sixties, only three garbage collection schemes were available, namely reference counting garbage collection, mark-sweep garbage collection, and copying garbage collection. 
These classical approaches gave new insight into the discipline of memory management and inspired researchers to develop new, more elaborate memory-management techniques. Those insights resulted in a plethora of automatic memory management algorithms and techniques, and a lack of uniformity in the language used to reason about garbage collection. To bring a sense of uniformity to the language used to reason about garbage collection technologies, a taxonomy for comparing garbage collection technologies is presented

    Twelve factors influencing sustainable recycling of municipal solid waste in developing countries

    Sustainable management of solid waste is a global concern, as exemplified by the United Nations Millennium Development Goals (MDG) that 191 member states support. The seventh MDG indirectly advocates for municipal solid waste management (MSWM) by aiming to integrate environmental sustainability into countries' policies and programs and reverse negative environmental impact. Proper MSWM will likely result in relieving poverty, reducing child mortality, improving maternal health, and preventing disease, which are MDG goals one, four, five, and six, respectively (UNMDG, 2005). Solid waste production is increasing worldwide as the global society strives to obtain a decent quality of life. Several means exist by which the amount of solid waste going to a landfill can be reduced, such as incineration with energy production, composting of organic wastes, and material recovery through recycling, which are all considered sustainable methods by which to manage MSW. In the developing world, composting is already a widely accepted method to reduce waste fated for the landfill, and incineration for energy recovery can be a costly capital investment for most communities. Therefore, this research focuses on recycling as a solution to the municipal solid waste production problem while considering the three dimensions of sustainability: environment, society, and economy. First, twenty-three developing country case studies were quantitatively and qualitatively examined for aspects of municipal solid waste management. The municipal solid waste (MSW) generation and recovery rates, as well as the composition, were compiled and assessed. The average MSW generation rate was 0.77 kg/person/day, with recovery rates varying from 5–40%. The waste streams of nineteen of these case studies consisted of 0–70% recyclable material and 17–80% organic material. 
All twenty-three case studies were analyzed qualitatively by identifying any barriers or incentives to recycling, which justified the creation of twelve factors influencing sustainable municipal solid waste management (MSWM) in developing countries. The presence of regulations, enforcement of laws, and use of incentive schemes constitutes the first factor, Government Policy. Cost of MSWM operations, the budget allocated to MSWM by local to national governments, as well as the stability and reliability of funds comprise the Government Finances factor influencing recycling in the third world. Many case studies indicated that understanding features of a waste stream such as the generation and recovery rates and composition is the first measure in determining proper management solutions, which forms the third factor, Waste Characterization. The presence and efficiency of waste collection and segregation by scavengers, municipalities, or private contractors was commonly addressed by the case studies, which justified Waste Collection and Segregation as the fourth factor. Having knowledge of MSWM and an understanding of the linkages between human behavior, waste handling, and health/sanitation/environment comprise the Household Education factor. Individuals' income influencing waste handling behavior (e.g., reuse, recycling, and illegal dumping), presence of waste collection/disposal fees, and willingness to pay by residents were seen as one of the biggest incentives to recycling, which justified them being combined into the Household Economics factor. The MSWM Administration factor was formed following several references to the presence and effectiveness of private and/or public management of waste through collection, recovery, and disposal influencing recycling activity. 
Although the MSWM Personnel Education factor was only recognized by six of the twenty-two case studies, the lack of trained laborers and skilled professionals in MSWM positions was a barrier to sustainable MSWM in every case but one. The presence and effectiveness of a comprehensive, integrative, long-term MSWM strategy was highly encouraged by every case study that addressed the tenth factor, MSWM Plan. Although seemingly a subset of private MSWM administration, the existence and profitability of market systems relying on recycled-material throughput, with the involvement of small businesses, middlemen, and large industries/exporters, is deserving of its own factor, Local Recycled-Material Market. Availability and effective use of technology and/or human workforce, and the safety considerations of each, were recurrent enough as barriers and incentives to recycling to warrant the Technological and Human Resources factor. The Land Availability factor takes into consideration land attributes such as terrain, ownership, and development, which can oftentimes dictate MSWM. Understanding the relationships among the twelve factors influencing recycling in developing countries made apparent the collaborative nature required of sustainable MSWM. Factors requiring the greatest collaborative inputs include waste collection and segregation, MSWM plan, and local recycled-material market. Aligning each factor to the societal, environmental, and economic dimensions of sustainability revealed the motives behind the institutions contributing to each factor. A correlation between stakeholder involvement and sustainability existed, as supported by the fact that the only three factors driven by all three dimensions of sustainability were the same three that required the greatest collaboration with other factors. 
With increasing urbanization, advocating for improved health for all through the MDG, and changing consumption patterns resulting in increasing and more complex waste streams, the utilization of the collaboration web offered by this research is ever needed in the developing world. Through its use, the institutions associated with each of the twelve factors can achieve a better understanding of the collaboration necessary and beneficial for more sustainable MSWM