34 research outputs found

    CAWL: A Cache-aware Write Performance Model of Linux Systems

    The performance of data-intensive applications is often dominated by their input/output (I/O) operations, but the I/O stack of a system is complex and depends heavily on system-specific settings and hardware components. This makes generic performance optimisation challenging and costly for developers, as they would have to run their application on a large variety of systems to evaluate their improvements. Simulation frameworks can help reduce the experimental overhead, but they typically model I/O rather coarsely, which leads to significant inaccuracies in performance predictions. We propose a more accurate model of the write performance of Linux-based systems that takes into account different I/O methods and levels (via system calls, library calls, direct or indirect, etc.), the page cache, background writing, and the I/O throttling capabilities of the Linux kernel. With our model, the relative prediction error against real measurements of the simulated workload drops, for example, from 67 % with a standard I/O model included in SimGrid to 10 % for a random I/O scenario. In other scenarios the differences are even more pronounced. Comment: 22 pages, 9 figures, 1 table

    C3Grid als Werkzeug für das Datenmanagement in der Klimaforschung (C3Grid as a Tool for Data Management in Climate Research)

    C3Grid uses centralized data management to administer the data holdings of distributed archives. In a collaborative workspace, users can work with the data regardless of where it is stored. It also provides a bridge to the data nodes of the Earth System Grid, which hold the CMIP5/IPCC AR5 data.

    On the Cost of Reliability in Large Data Grids

    Global grid environments provide not only massive aggregated computing power but also an unprecedented amount of distributed storage space. Unfortunately, dynamic changes caused by component failures, local decisions, and irregular data updates make it difficult to use this capacity efficiently. In this paper, we address the problem of improving data availability in the presence of unreliable components. We present an analytical model for determining an optimal combination of distributed replica catalogs, catalog sizes, and replica servers. Empirical simulation results confirm the accuracy of our theoretical analysis. Our model captures the characteristics of highly dynamic environments like peer-to-peer networks, but it can also be applied to more centralized, less dynamic grid environments like the European DataGrid.

    Executing and observing CFD applications on the Grid

    We present the FlowGrid system, which allows Computational Fluid Dynamics (CFD) simulations to be executed in Grid environments. Using this system, users can observe the progress of their simulation online by looking at intermediate results, which are visualized in the graphical user interface. Several Grid centers across Europe currently use and validate the system with their CFD computations and build a ‘CFD Virtual Organization’ to share their resources and balance their processing load. We first describe the overall FlowGrid architecture, highlight its special features and present the system along a typical job execution. The Grid infrastructure, i.e. FlowServe, is presented in detail, a description of the accounting system is given and experiences with the FlowGrid testbed are provided. Finally, we provide evidence that the results can be used as a generic CFD Grid service. © 2004 Elsevier B.V. All rights reserved.

    P2P Routing of Range Queries in Skewed Multidimensional Data Sets

    We present a middleware to store multidimensional data sets on Internet-scale distributed systems and to efficiently perform range queries on them. Our structured overlay network SONAR (Structured Overlay Network with Arbitrary Range queries) places keys that are adjacent in the key space on logically adjacent nodes in the overlay and is thereby able to process multidimensional range queries with a single logarithmic data lookup and local forwarding. The specified ranges may have arbitrary shapes like rectangles, circles, spheres or polygons. Empirical results demonstrate the routing performance of SONAR on several data sets, ranging from real-world data to artificially constructed worst-case distributions. We study the quality of SONAR’s routing structure, which is based on local knowledge only, and measure the indegree of the overlay nodes to find potential hot spots in the overlay. We show that SONAR’s routing table is self-adjusting, even under extreme situations, always keeping a maximum of ⌈log N⌉ routing entries. Key words: structured overlays, range queries, routing, multidimensional data set

    Grid-Enabled Computational Fluid Dynamics using FlowGrid

    We present an architecture for Computational Fluid Dynamics (CFD) applications that we developed for the FlowGrid project. FlowGrid revolutionizes the way CFD simulations are set up, executed and monitored. In this project, several Grid centers across Europe develop and validate their software for Grid-based CFD computations. The 'CFD Virtual Organization' of FlowGrid provides industrial end users with easy and flexible access to CFD resources.