
    Enabling Adaptive Grid Scheduling and Resource Management

    Wider adoption of the Grid concept has made a growing pool of federated computational, storage and visualisation resources available to scientists and researchers. The distributed and heterogeneous nature of these resources renders most legacy cluster monitoring and management approaches inappropriate, and poses new challenges for workflow scheduling on such systems. Effective monitoring of resource utilisation, with highly granular yet adaptive measurements, is a prerequisite for a more efficient Grid scheduler. We present a suite of measurement applications able to monitor per-process resource utilisation, and a customisable tool for emulating observed utilisation models. We also outline our future work on a predictive and probabilistic Grid scheduler. The research is undertaken as part of the UK e-Science EPSRC-sponsored project SO-GRM (Self-Organising Grid Resource Management) in cooperation with BT.

    Probabilistic grid scheduling based on job statistics and monitoring information

    This transfer thesis presents a novel, probabilistic approach to scheduling applications on computational Grids based on their historical behaviour, the current state of the Grid, and predictions of the future execution times and resource utilisation of such applications. The work lays a foundation for a more intuitive, user-friendly and effective scheduling technique termed deadline scheduling. Initial work established the motivation and requirements for a more efficient Grid scheduler, able to adaptively handle the dynamic nature of Grid resources and the submitted workload. Preliminary scheduler research identified the need for detailed monitoring of Grid resources at the process level, and for a tool to simulate the non-deterministic behaviour and statistical properties of Grid applications. A simulation tool, GridLoader, has been developed to model application loads similar to those of a number of typical Grid applications. GridLoader is able to simulate CPU utilisation, memory allocation and network transfers according to limits set through command line parameters or a configuration file. Its specific strength is in achieving set resource utilisation targets in a probabilistic manner, thus creating a dynamic environment suitable for testing the scheduler's adaptability and its prediction algorithm. To enable highly granular monitoring of Grid applications, a monitoring framework based on the Ganglia Toolkit was developed and tested. The suite is able to collect resource usage information of individual Grid applications, integrate it into a standard XML-based information flow, provide visualisation through a Web portal, and export data into a format suitable for off-line analysis. The thesis also presents an initial investigation of the utilisation of the University College London Central Computing Cluster facility running Sun Grid Engine middleware.
The feasibility of basic prediction concepts based on historical information and process meta-data has been successfully established, and possible scheduling improvements using such predictions have been identified. The thesis is structured as follows: Section 1 introduces Grid computing and its major concepts; Section 2 presents open research issues and the specific focus of the author's research; Section 3 gives a survey of the related literature, schedulers, monitoring tools and simulation packages; Section 4 presents the platform for the author's work, the Self-Organising Grid Resource Management project; Sections 5 and 6 give detailed accounts of the monitoring framework and simulation tool developed; Section 7 presents the initial data analysis while Section 8.4 concludes the thesis with appendices and references.
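
The idea of reaching a set utilisation target in a probabilistic manner can be illustrated with a minimal sketch. This is not GridLoader's actual implementation, which the abstract does not describe; the function names `utilisation_schedule` and `run_load`, the `jitter` parameter and the slice-based duty-cycle approach are all assumptions made for illustration. Each time slice receives a randomly perturbed busy fraction, and the next slice compensates for the accumulated deviation so that the long-run mean stays close to the target.

```python
import random
import time


def utilisation_schedule(target, slices, jitter=0.2, seed=None):
    """Return per-slice busy fractions whose mean stays close to `target`.

    Hypothetical helper: each slice is perturbed by uniform noise in
    [-jitter, jitter], and later slices correct the accumulated error.
    """
    rng = random.Random(seed)
    fractions = []
    busy_so_far = 0.0
    for i in range(slices):
        # Busy fraction that would bring the running total back on target.
        needed = target * (i + 1) - busy_so_far
        # Add randomness, then clamp to a valid fraction of the slice.
        frac = min(1.0, max(0.0, needed + rng.uniform(-jitter, jitter)))
        fractions.append(frac)
        busy_so_far += frac
    return fractions


def run_load(fractions, slice_seconds=0.1):
    """Burn CPU for the busy part of each slice and sleep for the rest."""
    for frac in fractions:
        busy_until = time.perf_counter() + frac * slice_seconds
        while time.perf_counter() < busy_until:
            pass  # busy-wait to consume CPU
        time.sleep((1.0 - frac) * slice_seconds)


if __name__ == "__main__":
    plan = utilisation_schedule(target=0.6, slices=50, seed=1)
    print("mean planned utilisation:", sum(plan) / len(plan))
    run_load(plan[:10])  # roughly one second of emulated load
```

Individual slices vary noticeably while the average converges on the target, which is the kind of dynamic but bounded behaviour a scheduler's prediction algorithm could be tested against. Memory allocation and network transfers, which the abstract also mentions, are left out of this sketch.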

    Performance analysis and optimisation of grid applications

    We investigated novel approaches to performance analysis and optimisation for the efficient execution of grid applications, especially workflows. We developed Mercury, a grid monitoring infrastructure, taking the special requirements of grid performance analysis into account. GRM, a performance monitor for parallel applications, has been integrated with R-GMA, a grid information system based on the relational data model, and with Mercury as well.
We developed versions of the Pulse and Prove visualisation tools that support grid performance analysis. We wrote a comprehensive state-of-the-art survey of grid performance analysis tools. We designed a workflow abstraction layer for the P-GRADE system, and built the P-GRADE portal on top of it. Using the portal, users can edit and execute workflow applications on the grid through a web browser. The portal supports multiple grid implementations and can collect information for performance analysis. We examined the integration of the portal with grid resource brokers and prepared it to recover from failed runs, giving it some degree of fault tolerance. Optimising execution may require migrating parts of the application to different resources, which in turn requires support for checkpointing. We therefore enhanced the checkpointing facilities of P-GRADE applications and coupled them to the Condor job scheduler. We also extended the system with a load balancer module that is able to migrate processes depending on the load.

    Self-organising management of Grid environments

    This paper presents basic concepts, architectural principles and algorithms for efficient resource and security management in cluster computing environments and the Grid. The work presented in this paper is funded by BTExacT and the EPSRC project SO-GRM (GR/S21939).

    Adaptive Grid Scheduling and Resource Management
