CAWL: A Cache-aware Write Performance Model of Linux Systems

Authors: Masoud Gholami, Florian Schintke
Publication date: 09/06/2023

The performance of data-intensive applications is often dominated by their input/output (I/O) operations, but the I/O stack of a system is complex and depends heavily on system-specific settings and hardware components. This makes generic performance optimisation challenging and costly for developers, who would have to run their application on a large variety of systems to evaluate their improvements. Simulation frameworks can help reduce this experimental overhead, but they typically model I/O in a rather coarse-grained way, which leads to significant inaccuracies in performance predictions. We propose a more accurate model of the write performance of Linux-based systems that takes into account different I/O methods and levels (via system calls, library calls, direct or indirect, etc.), the page cache, background writing, and the I/O throttling capabilities of the Linux kernel. For a random I/O scenario, for example, our model reduces the relative prediction error against real measurements of the simulated workload from 67 % with a standard I/O model included in SimGrid down to 10 %. In other scenarios the differences are even more pronounced. Comment: 22 pages, 9 figures, 1 table

arXiv.org e-Print Archive
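
The CAWL abstract above reports its accuracy as a relative prediction error against real measurements. As a minimal sketch of how such a figure is usually computed (the function name and the example run times are hypothetical and are not taken from the paper):

```python
def relative_error(predicted: float, measured: float) -> float:
    """Return |predicted - measured| / measured as a fraction."""
    if measured <= 0:
        raise ValueError("measured value must be positive")
    return abs(predicted - measured) / measured


# Hypothetical run times in seconds, chosen only so that the resulting
# errors match the 67 % and 10 % quoted in the abstract. They are NOT
# measurements from the paper.
measured_s = 100.0
coarse_model_s = 33.0
cache_aware_model_s = 90.0

coarse_error = relative_error(coarse_model_s, measured_s)
cache_aware_error = relative_error(cache_aware_model_s, measured_s)
```

Expressing the error relative to the measured value makes predictions for workloads of different sizes comparable.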

Autonomic Management of Large Clusters and Their Integration into the Grid

We present a framework for the co-ordinated, autonomic management of multiple clusters in a compute center and their integration into a Grid environment. Site autonomy and the automation of administrative tasks are prime aspects of this framework. The system behavior is continuously monitored in a steering cycle, and appropriate actions are taken to resolve any problems. All presented components have been implemented in the course of the EU project DataGrid: the Lemon monitoring components, the FT fault-tolerance mechanism, the quattor system for software installation and configuration, the RMS job and resource management system, and the Gridification scheme that integrates clusters into the Grid.

Open Access Repository

C3Grid als Werkzeug für das Datenmanagement in der Klimaforschung [C3Grid as a Tool for Data Management in Climate Research]

Authors: Bernadette Fritzsch, Stephan Kindermann, Florian Schintke
Publication date: 01/03/2012

C3Grid uses centralized data management to administer the holdings of distributed archives. In a collaborative workspace, users can process the data regardless of where it is stored. The workspace also builds a bridge to the data nodes of the Earth System Grid, which hold the CMIP5/IPCC AR5 data.

Electronic Publication Information Center

On the Cost of Reliability in Large Data Grids

Authors: Alexander Reinefeld, Florian Schintke
Publication date: 01/01/2002

Global grid environments provide not only massive aggregated computing power but also an unprecedented amount of distributed storage space. Unfortunately, dynamic changes caused by component failures, local decisions, and irregular data updates make it difficult to use this capacity efficiently. In this paper, we address the problem of improving data availability in the presence of unreliable components. We present an analytical model for determining an optimal combination of distributed replica catalogs, catalog sizes, and replica servers. Empirical simulation results confirm the accuracy of our theoretical analysis. Our model captures the characteristics of highly dynamic environments like peer-to-peer networks, but it can also be applied to more centralized, less dynamic grid environments like the European DataGrid.

CiteSeerX
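
The abstract above concerns data availability when components are unreliable. The paper's own analytical model is not reproduced in this listing. As a rough sketch of the underlying idea, the following uses the textbook independent-failure formula 1 - (1 - p)^n, where p is the probability that a single replica is reachable and n is the number of replicas. The independence assumption and both function names are assumptions made for illustration only.

```python
def availability(p: float, replicas: int) -> float:
    """Probability that at least one of `replicas` independent copies,
    each reachable with probability p, is available."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if replicas < 1:
        raise ValueError("at least one replica is required")
    return 1.0 - (1.0 - p) ** replicas


def replicas_needed(p: float, target: float) -> int:
    """Smallest replica count whose availability reaches `target`."""
    if not (0.0 < p < 1.0 and 0.0 < target < 1.0):
        raise ValueError("p and target must lie strictly between 0 and 1")
    n = 1
    # Availability grows monotonically with n, so a linear scan terminates.
    while availability(p, n) < target:
        n += 1
    return n
```

Under these assumptions, each additional replica multiplies the remaining unavailability by (1 - p), so the gain from each additional copy shrinks quickly.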

Executing and observing CFD applications on the Grid

Authors: Florian Schintke, Jan Wendler

We present the FlowGrid system, which allows Computational Fluid Dynamics (CFD) simulations to be executed in Grid environments. Using this system, users can observe the progress of their simulation online by looking at intermediate results, which are visualized in the graphical user interface. Several Grid centers across Europe currently use and validate the system with their CFD computations and build a 'CFD Virtual Organization' to share their resources and balance their processing load. We first describe the overall FlowGrid architecture, highlight its special features, and present the system along a typical job execution. The Grid infrastructure, i.e. FlowServe, is presented in detail, a description of the accounting system is given, and experiences with the FlowGrid testbed are provided. Finally, we provide evidence that the results can be used as a generic CFD Grid service. © 2004 Elsevier B.V. All rights reserved.

CiteSeerX

P2P Routing of Range Queries in Skewed Multidimensional Data Sets

Authors: Alexander Reinefeld, Florian Schintke, Thorsten Schütt

We present a middleware to store multidimensional data sets on Internet-scale distributed systems and to efficiently perform range queries on them. Our structured overlay network SONAR (Structured Overlay Network with Arbitrary Range queries) puts keys that are adjacent in the key space on logically adjacent nodes in the overlay and is thereby able to process multidimensional range queries with a single logarithmic data lookup and local forwarding. The specified ranges may have arbitrary shapes like rectangles, circles, spheres or polygons. Empirical results demonstrate the routing performance of SONAR on several data sets, ranging from real-world data to artificially constructed worst-case distributions. We study the quality of SONAR's routing structure, which is based on local knowledge only, and measure the indegree of the overlay nodes to find potential hot spots in the overlay. We show that SONAR's routing table is self-adjusting, even under extreme situations, always keeping a maximum of ⌈log N⌉ routing entries. Keywords: structured overlays, range queries, routing, multidimensional data set

CiteSeerX
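
The SONAR abstract above describes answering a range query with one logarithmic lookup followed by local forwarding between adjacent nodes. The following is a one-dimensional toy sketch of that pattern, not SONAR's multidimensional scheme. The function name and the `boundaries` list are hypothetical, and a binary search over a sorted list stands in for the overlay's routed lookup.

```python
from bisect import bisect_right


def nodes_for_range(boundaries: list[int], lo: int, hi: int) -> list[int]:
    """Return the indices of the nodes that hold keys in [lo, hi].

    boundaries[i] is the smallest key stored on node i; the list is sorted
    ascending, so neighbouring nodes hold neighbouring key intervals.
    """
    if not boundaries:
        raise ValueError("at least one node is required")
    if lo > hi:
        raise ValueError("lo must not exceed hi")

    # Step 1: a single logarithmic lookup finds the node responsible for lo.
    start = max(bisect_right(boundaries, lo) - 1, 0)

    # Step 2: local forwarding to the next neighbour for as long as that
    # neighbour's interval still begins inside the queried range.
    result = [start]
    node = start
    while node + 1 < len(boundaries) and boundaries[node + 1] <= hi:
        node += 1
        result.append(node)
    return result
```

Because adjacent keys live on adjacent nodes, the cost after the initial lookup grows only with the number of nodes that the range actually covers.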

Grid-Enabled Computational Fluid Dynamics using FlowGrid

Authors: Florian Schintke, Jan Wendler

We present an architecture for Computational Fluid Dynamics (CFD) applications that we developed for the FlowGrid project. FlowGrid revolutionizes the way CFD simulations are set up, executed and monitored. In this project, several Grid centers across Europe develop and validate their software for Grid-based CFD computations. The 'CFD Virtual Organization' of FlowGrid provides industrial end users with easy and flexible access to CFD resources.

CiteSeerX