3,522 research outputs found

    Performance Evaluation of Barrier Techniques for Distributed Tracing Garbage Collector

    Software engineering is becoming ever more complex because of distributed computing. In this context, portability is a key issue, and a cluster-aware Java Virtual Machine (JVM) that can transparently execute Java applications in a distributed fashion on the nodes of a cluster, while providing the programmer with the single system image of a classical JVM, is highly desirable. This way, multi-threaded server applications can take advantage of cluster resources without increasing their programming complexity. However, such a JVM is not easy to design. One of the most challenging tasks in its design is the development of an efficient, scalable and automatic dynamic memory manager. Inside this manager, one important module is the automatic recycling mechanism, or garbage collector. The collector has very intensive processing demands and must run concurrently with the user's application. It can consume a very significant portion of the total execution time spent inside the JVM in uniprocessor systems, and its overhead increases in distributed garbage collection because changing references must be updated on different nodes. Hence, the garbage collector is a critical part of distributed JVM designs, both for performance and for energy. In this paper our contribution to automatic distributed garbage collection is two-fold. First, we have analyzed the barrier mechanism design space for the study of tracing-based distributed garbage collectors. Second, we have evaluated the impact of the most significant barrier strategies as main bottlenecks in global performance. Our preliminary results show that the choice of the specific technique used in barrier mechanisms produces significant differences both in performance and in inter-node messaging overhead.

    A scalable application server on Beowulf clusters : a thesis presented in partial fulfilment of the requirement for the degree of Master of Information Science at Albany, Auckland, Massey University, New Zealand

    Application performance and scalability of a large distributed multi-tiered application are core requirements for most of today's critical business applications. I have investigated the scalability of a J2EE application server using the standard ECperf benchmark application on the Massey Beowulf clusters, namely the Sisters and the Helix. My testing environment consists of open-source software: the integrated JBoss-Tomcat as the application server and the web server, along with PostgreSQL as the database. My testing programs were run on the clustered application server, which provides replication of the Enterprise Java Bean (EJB) objects. I have completed various centralized and distributed tests using the JBoss cluster. I concluded that clustering the application server and web server effectively increases the performance of the application running on them, given sufficient system resources. The application performance scales up to the point where a bottleneck occurs in the testing system; the bottleneck could be any resource in the testing environment: the hardware, the software, the network or the application that is running. Performance tuning for a large-scale J2EE application is a complicated issue that depends on the resources available. However, by carefully identifying the performance bottleneck in the system across hardware, software, network, operating system and application configuration, I can improve the performance of J2EE applications running on a Beowulf cluster. Software bottlenecks can be solved by changing the default settings; hardware bottlenecks, on the other hand, are harder to solve unless more investment is made in higher-speed and higher-capacity hardware.

    Parallel and distributed Gröbner bases computation in JAS

    This paper considers parallel Gröbner bases algorithms on distributed memory parallel computers with multi-core compute nodes. We summarize three different Gröbner bases implementations: shared memory parallel, pure distributed memory parallel, and distributed memory combined with shared memory parallelism. The last algorithm, called distributed hybrid, uses only one control communication channel between the master node and the worker nodes and keeps polynomials in shared memory on a node. The polynomials are transported asynchronously to the control flow of the algorithm in a separate distributed data structure. The implementation is generic and works for all implemented (exact) fields. We present new performance measurements and discuss the performance of the algorithms. Comment: 14 pages, 8 tables, 13 figures