
    Self-adaptive Grid Resource Monitoring and Discovery

    The Grid provides a novel platform where the scientific and engineering communities can share data and computation across multiple administrative domains. There are several key services that must be offered by Grid middleware, one of these being the Grid Information Service (GIS). A GIS is a Grid middleware component which maintains information about the hardware, software, services and people participating in a virtual organisation (VO). There is an inherent need in these systems for the delivery of reliable performance. This thesis describes a number of approaches which detail the development and application of a suite of benchmarks for the prediction of the process of resource discovery and monitoring on the Grid. A series of experimental studies characterising performance using benchmarking is carried out. Several novel predictive algorithms are presented and evaluated in terms of their predictive error. Furthermore, predictive methods are developed which describe the behaviour of MDS2 for a variable number of user requests. The MDS is also extended to include job information from a local scheduler; this information is queried using requests of greatly varying complexity. The response of the MDS to these queries is then assessed in terms of several performance metrics. The dynamic nature of information within MDS3, the successor to MDS2 based on the Open Grid Services Architecture (OGSA), is also benchmarked, and the performance of both the pull and push query mechanisms is analysed. GridAdapt (Self-adaptive Grid Resource Monitoring) is a new system proposed here, built upon the Globus MDS3 benchmarking. It offers self-adaptation, autonomy and admission control at the Index Service, whilst ensuring that the MDS is not overloaded and can meet its quality of service, for example in terms of its average response time for servicing synchronous queries and the total number of queries returned per unit time.
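
    A minimal sketch (with hypothetical names, not the thesis's implementation) of the kind of admission-control decision described for the Index Service: incoming synchronous queries are admitted only while a moving average of recent response times stays under a quality-of-service target, so the MDS is not overloaded.

        import collections, time

        class AdmissionController:
            """Admits queries only while the observed average response time
            stays below a configured QoS target (illustrative only)."""

            def __init__(self, target_response_s=0.5, window=100):
                self.target = target_response_s
                self.samples = collections.deque(maxlen=window)

            def admit(self):
                # Admit if there is no history yet, or the moving average is within target.
                if not self.samples:
                    return True
                return sum(self.samples) / len(self.samples) < self.target

            def record(self, elapsed_s):
                self.samples.append(elapsed_s)

        controller = AdmissionController(target_response_s=0.5)
        if controller.admit():
            start = time.time()
            # ... service the synchronous query against the Index Service ...
            controller.record(time.time() - start)
        else:
            pass  # reject or queue the query to keep the average response time within the QoS target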

    Distributed Handler Architecture

    Thesis (PhD) - Indiana University, Computer Sciences, 2007. Over the last couple of decades, distributed systems have demonstrated an architectural evolution based on models including client/server, multi-tier, distributed objects, messaging and peer-to-peer. One recent evolutionary step is Service Oriented Architecture (SOA), whose goal is to achieve loose coupling among interacting software applications for scalability and interoperability. The SOA model is realized in Web Services, which provide software platforms for building applications as services and for creating seamless, loosely coupled interactions. Web Services utilize supporting functionalities such as security, reliability, monitoring, logging and so forth. These functionalities are typically provisioned as handlers, which incrementally add new capabilities to a service by building an execution chain. Even though handlers are very important to the service, how they are utilized is crucial to attaining their potential benefits. Every attempt to support a service with an additional functionality increases the chance of an overwhelmingly crowded chain, which makes the Web Service bloated. Moreover, a handler may become a bottleneck if it has a comparatively high processing time. In this dissertation, we present the Distributed Handler Architecture (DHArch) to provide an efficient, scalable and modular architecture for managing the execution of handlers. The system distributes the handlers by utilizing a Message Oriented Middleware and orchestrates their execution in an efficient fashion. We also present an empirical evaluation of the system to demonstrate the suitability of this architecture for coping with the issues that exist in conventional Web Service handler structures.
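
    A minimal Python sketch (hypothetical, not the DHArch code) of the distributed handler-chain idea: each handler transforms the message context, and handlers flagged as remote are dispatched through a message-oriented middleware (simulated here by a local queue) rather than executed in-process.

        import queue

        class Handler:
            """A Web Service handler: a named transformation over the message context."""
            def __init__(self, name, fn, remote=False):
                self.name, self.fn, self.remote = name, fn, remote

        def remote_worker(mom_queue, handlers_by_name):
            # Simulated remote worker: pull a task from the middleware and apply the handler.
            name, message = mom_queue.get()
            return handlers_by_name[name].fn(message)

        def run_chain(handlers, message, mom_queue):
            """Run the handler chain; remote handlers go through the (simulated) MOM."""
            by_name = {h.name: h for h in handlers}
            for h in handlers:
                if h.remote:
                    mom_queue.put((h.name, message))             # dispatch over the middleware
                    message = remote_worker(mom_queue, by_name)  # result returned by a worker
                else:
                    message = h.fn(message)                      # in-process execution
            return message

        chain = [
            Handler("logging",  lambda m: {**m, "logged": True}),
            Handler("security", lambda m: {**m, "verified": True}, remote=True),
        ]
        print(run_chain(chain, {"body": "invoke service"}, queue.Queue()))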

    Scheduling in Grid Computing Environment

    Scheduling in Grid computing has been an active area of research since its beginning. However, beginners find it very difficult to understand the related concepts due to the large learning curve of Grid computing. Thus, there is a need for a concise understanding of scheduling in the Grid computing area. This paper strives to present a concise understanding of scheduling and a related understanding of the Grid computing system. The paper describes the overall picture of Grid computing and discusses the important sub-systems that make Grid computing possible. Moreover, the paper discusses the concepts of resource scheduling and application scheduling and presents a classification of scheduling algorithms. Furthermore, the paper presents the methodology used for evaluating scheduling algorithms, including both real-system and simulation-based approaches. This work on scheduling in the Grid, containing concise treatments of the scheduling system, scheduling algorithms, and scheduling methodology, should be very useful to users and researchers. Comment: Fourth International Conference on Advanced Computing & Communication Technologies (ACCT), 201
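
    As an illustration of the kind of application-scheduling heuristic such classifications cover (a sketch for this listing, not an algorithm from the paper), a minimum-completion-time mapping assigns each task to the resource that would finish it earliest given that resource's speed and current load.

        def mct_schedule(task_lengths, resource_speeds):
            """Map each task to the resource with the earliest completion time.
            task_lengths: work per task; resource_speeds: work units per second."""
            ready_time = [0.0] * len(resource_speeds)   # when each resource next becomes free
            mapping = []
            for length in task_lengths:
                completion = [ready_time[r] + length / resource_speeds[r]
                              for r in range(len(resource_speeds))]
                best = min(range(len(resource_speeds)), key=completion.__getitem__)
                ready_time[best] = completion[best]
                mapping.append(best)
            return mapping, max(ready_time)             # task-to-resource assignment and makespan

        print(mct_schedule([10, 4, 6, 8], [1.0, 2.0]))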

    Master/worker parallel discrete event simulation

    The execution of parallel discrete event simulation across metacomputing infrastructures is examined. A master/worker architecture for parallel discrete event simulation is proposed, providing robust execution under a dynamic set of services with system-level support for fault tolerance, semi-automated client-directed load balancing, portability across heterogeneous machines, and the ability to run codes on idle or time-sharing clients without significant interaction by users. Research questions and challenges associated with the work distribution paradigm, the targeted computational domain, performance metrics, and the intended class of applications to be used in this context are analyzed and discussed. A portable web services approach to master/worker parallel discrete event simulation is proposed and evaluated, with subsequent optimizations to increase the efficiency of large-scale simulation execution through distributed master service design and intrinsic overhead reduction. New techniques are proposed and examined for addressing the challenges of optimistic parallel discrete event simulation across metacomputing infrastructures, such as rollbacks and message unsending, using an inherently different computation paradigm based on master services and time windows. Results indicate that a master/worker approach utilizing loosely coupled resources is a viable means for high-throughput parallel discrete event simulation, by enhancing existing computational capacity or providing alternate execution capability for less time-critical codes. Ph.D. Committee Chair: Fujimoto, Richard; Committee Member: Bader, David; Committee Member: Perumalla, Kalyan; Committee Member: Riley, George; Committee Member: Vuduc, Richard
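
    A minimal sketch (illustrative only, not the dissertation's implementation) of the master/worker time-window idea: the master repeatedly hands each worker a bounded simulation time window, waits for all workers to report, and only then advances the global clock, so loosely coupled workers never outrun one another by more than one window.

        def run_master(workers, end_time, window=10.0):
            """Advance simulation time in fixed windows; each worker processes
            only the events that fall inside the current window."""
            now = 0.0
            while now < end_time:
                horizon = min(now + window, end_time)
                # In a real system these calls go out over the network to worker services.
                reports = [w.process_until(horizon) for w in workers]
                now = horizon          # all workers finished the window; advance the clock
            return now

        class Worker:
            def __init__(self, events):          # events: timestamps of pending local events
                self.events, self.done = sorted(events), 0
            def process_until(self, horizon):
                while self.done < len(self.events) and self.events[self.done] <= horizon:
                    self.done += 1               # handle the event (details omitted)
                return self.done                 # report progress back to the master

        print(run_master([Worker([1.0, 12.5]), Worker([3.3, 25.0])], end_time=30.0))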

    An evaluation of galaxy and ruffus-scripting workflows system for DNA-seq analysis

    Magister Scientiae - MSc. Functional genomics determines the biological functions of genes on a global scale by using large volumes of data obtained through techniques including next-generation sequencing (NGS). The application of NGS in biomedical research is gaining momentum, and with its adoption becoming more widespread, there is an increasing need for access to customizable computational workflows that can simplify, and offer access to, computer-intensive analyses of genomic data. In this study, pipelines in the Galaxy and Ruffus frameworks were designed and implemented with a view to addressing the challenges faced in biomedical research. Galaxy, a graphical web-based framework, allows researchers to build a graphical NGS data analysis pipeline for accessible, reproducible, and collaborative data sharing. Ruffus, a UNIX command-line framework used by bioinformaticians as a Python library for writing scripts in an object-oriented style, allows a workflow to be built in terms of task dependencies and execution logic. In this study, a dual data analysis technique was explored which focuses on a comparative evaluation of the Galaxy and Ruffus frameworks used in composing analysis pipelines. To this end, we developed analysis pipelines in both Galaxy and Ruffus for the analysis of Mycobacterium tuberculosis sequence data. Preliminary analysis revealed that the pipeline in Galaxy displayed a higher percentage of load and store instructions, while pipelines in Ruffus tended to be CPU bound and memory intensive. The CPU usage, memory utilization, and runtime execution are graphically represented in this study. Our evaluation suggests that workflow frameworks have distinctly different features, from ease of use, flexibility, and portability to architectural design.
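
    To illustrate the task-dependency style of pipeline on which the Ruffus approach relies, here is a minimal plain-Python sketch (not the actual Ruffus API, and with hypothetical DNA-seq step names): each step declares its upstream dependency, and the runner executes steps in dependency order.

        # Hypothetical DNA-seq-style steps; a real pipeline would call external tools.
        PIPELINE = {
            "trim_reads":    (None,          lambda d: d + ["trimmed.fastq"]),
            "align_reads":   ("trim_reads",  lambda d: d + ["aligned.bam"]),
            "call_variants": ("align_reads", lambda d: d + ["variants.vcf"]),
        }

        def run(task, data, done=None):
            """Run a task after its (single) upstream dependency has completed."""
            done = set() if done is None else done
            upstream, action = PIPELINE[task]
            if upstream and upstream not in done:
                data = run(upstream, data, done)   # recurse into the dependency first
            done.add(task)
            return action(data)

        print(run("call_variants", []))   # ['trimmed.fastq', 'aligned.bam', 'variants.vcf']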

    A dynamic prediction and monitoring framework for distributed applications

    This research builds on an application performance prediction and characterisation environment (known as PACE), whose aim is to characterise the performance-critical elements of both an application and its target execution environment, and to deduce from this model a predicted behaviour of the application prior to its execution. Underlying the research presented in this thesis are a number of themes: the tasks involved in the performance characterisation of applications and how these might be semi-automated; the level of abstraction at which these characterisations are performed in order to maintain a sufficient predictive accuracy; the automated refinement of these characterisations from runtime performance data; and the extension of both the target programming languages and the class of application at which these techniques are aimed. In this thesis a number of novel extensions to PACE are described. These include: a new transaction-based performance characterisation language that provides a flexible framework for describing broader classes of application; a performance monitoring framework (based on an extension to the Open Group's Application Response Measurement (ARM) standard) for the runtime monitoring of an application's data-dependent components and the automated refinement of performance models; and an adaptation of this performance characterisation for the prediction of Java applications. These contributions are demonstrated through their application to a number of scientific kernels. This thesis also documents how these predictive results can be used in a real-time distributed runtime management environment, and how these techniques can be applied to non-scientific codes, in particular to an IBM request-driven distributed web services demonstrator.
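
    A minimal sketch (hypothetical names; not the PACE or ARM interfaces) of the monitoring-to-refinement loop described here: each data-dependent component is timed as a transaction at runtime, and the observed durations are folded back into the model's predicted cost for that component.

        import time

        class TransactionMonitor:
            """Times named transactions and refines a per-transaction predicted cost
            using an exponentially weighted moving average (alpha is illustrative)."""
            def __init__(self, alpha=0.3):
                self.alpha, self.predicted = alpha, {}

            def measure(self, name, fn, *args):
                start = time.perf_counter()
                result = fn(*args)
                observed = time.perf_counter() - start
                old = self.predicted.get(name, observed)
                # Blend the model's current prediction with the new observation.
                self.predicted[name] = (1 - self.alpha) * old + self.alpha * observed
                return result

        monitor = TransactionMonitor()
        monitor.measure("kernel_sum", sum, range(1_000_000))
        print(monitor.predicted)   # refined predicted runtime for the 'kernel_sum' component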

    A study in grid simulation and scheduling

    Grid computing is emerging as an essential tool for large-scale analysis and problem solving in scientific and business domains. Whilst the idea of stealing unused processor cycles is as old as the Internet, we are still far from reaching a position where many distributed resources can be seamlessly utilised on demand. One major issue preventing this vision is deciding how to effectively manage remote resources and how to schedule tasks amongst them. This thesis describes an investigation into Grid computing, specifically the problem of Grid scheduling. This complex problem has many unique features making it particularly difficult to solve, and as a result many current Grid systems employ simplistic, inefficient solutions. This work describes the development of a simulation tool, G-Sim, which can be used to test the effectiveness of potential Grid scheduling algorithms under realistic operating conditions. This tool is used to analyse the effectiveness of a simple, novel scheduling technique in numerous scenarios. The results are positive and show that it could be applied to current procedures to enhance performance and decrease the negative effect of resource failure. Finally, a conversion between the Grid scheduling problem and the classic computing problem SAT (Boolean satisfiability) is provided. Such a conversion allows for the possibility of applying sophisticated SAT-solving procedures to Grid scheduling, providing potentially effective solutions.
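
    As an illustration of the kind of conversion proposed (a simplified sketch, not the thesis's actual encoding), a tiny scheduling instance can be written as SAT clauses: a Boolean variable x[t][r] means "task t runs on resource r", with clauses forcing every task onto exactly one resource and forbidding hypothetical pairs of tasks that are too large to share a resource. The resulting clause list can be handed to an off-the-shelf SAT solver.

        from itertools import combinations

        def encode(num_tasks, num_resources, conflicts):
            """Return CNF clauses (lists of signed ints) for a toy scheduling instance.
            conflicts: pairs of tasks that cannot share any single resource."""
            var = lambda t, r: t * num_resources + r + 1      # DIMACS-style variable id
            clauses = []
            for t in range(num_tasks):
                clauses.append([var(t, r) for r in range(num_resources)])   # at least one resource
                for r1, r2 in combinations(range(num_resources), 2):
                    clauses.append([-var(t, r1), -var(t, r2)])              # at most one resource
            for a, b in conflicts:
                for r in range(num_resources):
                    clauses.append([-var(a, r), -var(b, r)])                # a and b cannot share r
            return clauses

        print(encode(num_tasks=3, num_resources=2, conflicts=[(0, 1)]))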