    Scalable desktop grid system

    Desktop grids are easy to install on a large number of personal computers, which is a prerequisite for the spread of grid technology. Current desktop grids connect all PCs into a flat hierarchy, that is, all computers connect to a central server. SZTAKI Desktop Grid starts from a standalone desktop grid as a building block. It is extended to include clusters that appear as single powerful PCs while still using their local resource management systems. Such building blocks support taking over additional tasks from other desktop grids, enabling the set-up of a hierarchy. Desktop grids with different owners can thus share resources, although only in a hierarchical structure. This brings desktop grids closer to other grid technologies, where sharing resources among several users is the most important feature.

    On the Scalability of Data Reduction Techniques in Current and Upcoming HPC Systems from an Application Perspective

    We implement and benchmark parallel I/O methods for the fully manycore-driven particle-in-cell code PIConGPU. Identifying throughput and overall I/O size as a major challenge for applications on today's and future HPC systems, we present a scaling law characterizing performance bottlenecks in state-of-the-art approaches for data reduction. Consequently, we propose, implement and verify multi-threaded data transformations for the I/O library ADIOS as a feasible way to trade underutilized host-side compute potential on heterogeneous systems for reduced I/O latency. Comment: 15 pages, 5 figures, accepted for DRBSD-1 in conjunction with ISC'1
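
The trade described above, spending idle host-side cores on a data transformation so that fewer bytes reach the file system, can be sketched as follows. This is a minimal illustration and not the ADIOS transform interface: `zlib` stands in for whichever compressor is used, and the names `compress_chunks`, `decompress_chunks`, `workers` and `level` are hypothetical.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def compress_chunks(chunks, workers=4, level=1):
    """Compress each byte chunk on a host-side thread pool.

    Assumption: zlib is only a stand-in for the real data transformation.
    pool.map returns results in the same order as the input chunks.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda chunk: zlib.compress(chunk, level), chunks))


def decompress_chunks(blobs):
    """Invert compress_chunks, chunk by chunk."""
    return [zlib.decompress(blob) for blob in blobs]


if __name__ == "__main__":
    # Eight highly compressible 1 MB chunks as hypothetical output buffers.
    chunks = [bytes([i]) * 1_000_000 for i in range(8)]
    blobs = compress_chunks(chunks)
    raw = sum(len(c) for c in chunks)
    packed = sum(len(b) for b in blobs)
    print(f"bytes to write: {packed} instead of {raw}")
```

Whether this pays off depends on the compressibility of the data and on how the transformation time compares with the write time it saves, which is the kind of question the scaling law in the abstract addresses.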

    Sleeping Giant Congo River: Water Use between Regional Integration and Sectoral Conflicts of Objectives

    Expanding water use on the Congo could give the region a development boost, but it threatens to go hand in hand with favouring particular user interests. Across its vast catchment area, the river is the most important transport network and the lifeline of the African rainforest, which in turn secures the livelihoods of millions of people. The region's water and food supply could be improved considerably with the river's resources, and its hydropower potential could cover the electricity demand of the entire continent. The planned construction of further large dams at the Inga Falls shows that the ten riparian states pursue common goals, but also that conflicts between individual sectors are intensifying. Germany's inconsistent stance on sensitive fundamental questions of development cooperation and foreign water policy makes it difficult to support these processes constructively. (Author's abstract)

    SZTAKI desktop grid: a modular and scalable way of building large computing grids

    So far, BOINC-based desktop grid systems have been applied at the global computing level. This paper describes an extended version of BOINC called SZTAKI desktop grid (SZDG) that aims at using desktop grids (DGs) at the local (enterprise/institution) level. The novelty of SZDG is that it enables the hierarchical organisation of local DGs, i.e., clients of a DG can be DGs at a lower level that take work units from their higher-level DG server. Moreover, even clusters can be connected at the client level, and hence work units can contain complete MPI programs to be run on the client clusters. To make it easy to create master/worker type DG applications, a new API called the DC-API has been developed. SZDG and DC-API have been successfully applied at both the global and the local level, in academic institutions as well as in companies, to solve problems requiring large computing power.
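
The hierarchical organisation described above, in which a lower-level DG acts as a client of its higher-level DG server, can be sketched as a queue that falls back to its parent. This is an illustrative model only: it uses neither BOINC nor the DC-API, and the class `DesktopGrid` and its methods are hypothetical names.

```python
from collections import deque


class DesktopGrid:
    """Toy model of one desktop grid server with an optional parent grid."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # higher-level DG, or None at the top
        self.queue = deque()      # locally submitted work units

    def submit(self, work_unit):
        """Queue a work unit on this grid."""
        self.queue.append(work_unit)

    def fetch(self):
        """Return the next work unit for a client.

        Local work is served first. When the local queue is empty, this
        grid behaves like a client of its parent and asks it for work.
        """
        if self.queue:
            return self.queue.popleft()
        if self.parent is not None:
            return self.parent.fetch()
        return None


if __name__ == "__main__":
    institute = DesktopGrid("institute")
    department = DesktopGrid("department", parent=institute)
    institute.submit("wu-institute-1")
    department.submit("wu-department-1")
    # A department PC drains local work, then work from the institute level.
    print(department.fetch(), department.fetch(), department.fetch())
```

Serving local work before parent work is a design assumption of this sketch; the abstract does not state how SZDG prioritises the two.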

    Performance analysis and optimisation of grid applications

    We investigated novel approaches to performance analysis and optimization for the efficient execution of grid applications, especially workflows. We took the special requirements of grid performance analysis into consideration when developing Mercury, a grid monitoring infrastructure. GRM, a performance monitor for parallel applications, has been integrated with R-GMA, a relational grid information system, and with Mercury as well. We developed versions of the Pulse and Prove visualisation tools that support grid performance analysis. We wrote a comprehensive state-of-the-art survey of grid performance tools. We designed a novel workflow abstraction layer for P-GRADE, together with the P-GRADE grid portal. Using the portal, users can draft and execute workflow applications in the grid via a web browser. The portal supports multiple grid implementations and provides monitoring capabilities for performance analysis. We tested the integration of the portal with grid resource brokers and also augmented it with some degree of fault tolerance. Optimization may require migrating parts of the application to different resources, and thus requires support for checkpointing. We enhanced the checkpointing facilities of P-GRADE and coupled them to the Condor job scheduler. We also extended the system with a load balancer module that is able to migrate processes as part of the optimization.

    Runtime I/O Re-Routing + Throttling on HPC Storage

    Massively parallel storage systems are becoming more and more prevalent on HPC systems due to the emergence of a new generation of data-intensive applications. To achieve the level of I/O throughput and capacity demanded by data-intensive applications, storage systems typically deploy a large number of storage devices (also known as LUNs or data stores). This allows parallel applications to access storage concurrently, so the aggregate I/O throughput can increase linearly with the number of storage devices, reducing the application's end-to-end time. On a production system where storage devices are shared between multiple applications, contention is often a major problem that leads to a significant reduction in I/O throughput. In this paper, we describe our efforts to resolve this issue in the context of HPC using a balanced re-routing + throttling approach. The proposed scheme re-routes I/O requests to a less congested storage location in a controlled manner so that write performance is improved while limiting the impact on read.
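
The idea of re-routing writes to a less congested storage location "in a controlled manner" can be sketched as below. This is not the scheme from the paper: the congestion measure (outstanding requests per target), the threshold, and the per-window re-route budget used as the throttle are all assumptions, and `Rerouter` and its methods are hypothetical names.

```python
class Rerouter:
    """Toy write router with a congestion threshold and a re-route budget."""

    def __init__(self, num_targets, congestion_threshold, max_reroutes_per_window):
        self.load = [0] * num_targets          # outstanding requests per target
        self.threshold = congestion_threshold  # load at which a target counts as congested
        self.budget = max_reroutes_per_window  # throttle: re-routes allowed per window
        self.rerouted = 0

    def route_write(self, default_target):
        """Pick a target for one write and record the added load."""
        target = default_target
        congested = self.load[default_target] >= self.threshold
        if congested and self.rerouted < self.budget:
            # Least-loaded target; ties resolve to the lowest index.
            candidate = min(range(len(self.load)), key=self.load.__getitem__)
            if self.load[candidate] < self.load[default_target]:
                target = candidate
                self.rerouted += 1
        self.load[target] += 1
        return target

    def complete(self, target):
        """Record that one request on the given target has finished."""
        self.load[target] -= 1

    def new_window(self):
        """Start a new throttling window and restore the re-route budget."""
        self.rerouted = 0


if __name__ == "__main__":
    router = Rerouter(num_targets=3, congestion_threshold=2, max_reroutes_per_window=1)
    print([router.route_write(0) for _ in range(4)])
```

Capping the number of re-routes per window is one simple way to keep redirected writes from flooding targets that other applications are reading from; a real system would more likely derive the cap from measured read latency.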

    Unraveling Diffusion in Fusion Plasma: A Case Study of In Situ Processing and Particle Sorting

    This work begins to build an in situ processing capability for studying a particular diffusion process in magnetic confinement fusion. This diffusion process involves plasma particles that are likely to escape confinement. Such particles carry a significant amount of energy from the burning plasma inside the tokamak to the divertor, damaging the divertor plate. The study requires in situ processing because of the fast-changing nature of the particle diffusion process. However, the in situ approach is challenging because the amount of data to be retained for the diffusion calculations increases over time, unlike in other in situ processing cases, where the amount of data to be processed is constant over time. Here we report our preliminary efforts to control memory usage while ensuring that the necessary analysis tasks are completed in a timely manner. Compared with an earlier naive attempt to compute the same diffusion displacements directly in the simulation code, this in situ version reduces the memory usage from particle information by nearly 60% and the computation time by about 20%.

    MGARD+: Optimizing Multilevel Methods for Error-Bounded Scientific Data Reduction

    Data reduction is becoming increasingly important in dealing with large amounts of scientific data. Existing multilevel compression algorithms offer a promising way to manage scientific data at scale but may suffer from relatively low performance and reduction quality. In this paper, we propose MGARD+, a multilevel data reduction and refactoring framework drawing on previous multilevel methods, to achieve high-performance data decomposition and high-quality error-bounded lossy compression. Our contributions are four-fold: 1) We propose a level-wise coefficient quantization method, which uses different error tolerances to quantize the multilevel coefficients. 2) We propose an adaptive decomposition method, which treats the multilevel decomposition as a preconditioner and terminates the decomposition process at an appropriate level. 3) We apply a set of algorithmic optimization strategies to significantly improve the performance of multilevel decomposition/recomposition. 4) We evaluate our proposed method using four real-world scientific datasets and compare it with several state-of-the-art lossy compressors. Experiments demonstrate that our optimizations improve the decomposition/recomposition performance of the existing multilevel method by up to 70×, and that the proposed compression method can improve the compression ratio by up to 2× compared with other state-of-the-art error-bounded lossy compressors under the same level of data distortion.
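
Level-wise quantization with a separate error tolerance per level, as named in contribution 1, can be illustrated with plain uniform scalar quantization. This is a sketch under stated assumptions, not the MGARD+ algorithm: the tolerances below are arbitrary example values, the abstract does not say how MGARD+ chooses them or how per-level errors combine into the overall bound, and all function names are hypothetical.

```python
def quantize(coeffs, tol):
    """Map each coefficient to an integer code using bins of width 2 * tol."""
    step = 2.0 * tol
    return [round(c / step) for c in coeffs]


def dequantize(codes, tol):
    """Reconstruct coefficients; each differs from the original by at most tol."""
    step = 2.0 * tol
    return [q * step for q in codes]


def quantize_levels(levels, tolerances):
    """Quantize every level's coefficients with that level's own tolerance."""
    return [quantize(coeffs, tol) for coeffs, tol in zip(levels, tolerances)]


def dequantize_levels(codes_per_level, tolerances):
    """Invert quantize_levels level by level."""
    return [dequantize(codes, tol) for codes, tol in zip(codes_per_level, tolerances)]


if __name__ == "__main__":
    # Hypothetical multilevel coefficients: one list per level.
    levels = [[0.93, -1.27], [0.051, -0.024, 0.3]]
    tolerances = [0.1, 0.01]  # assumed values, one per level
    codes = quantize_levels(levels, tolerances)
    restored = dequantize_levels(codes, tolerances)
    for coeffs, approx, tol in zip(levels, restored, tolerances):
        worst = max(abs(a - b) for a, b in zip(coeffs, approx))
        print(f"tolerance {tol}: worst-case error {worst:.4f}")
```

Rounding to the nearest multiple of `2 * tol` keeps each reconstructed coefficient within `tol` of its original value, up to floating-point rounding, which is what makes the per-level error bounded.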
