9 research outputs found

    A Toolkit for Simulation of Desktop Grid Environment

    Peer-to-peer systems, clusters, and grids combine heterogeneous distributed resources to solve problems in fields such as science, engineering, and commerce. Organizations within the world-wide grid environment offer geographically distributed resources administered by schedulers and policies. Studying resource behavior is time-consuming because each resource behaves uniquely, and in this type of environment it is nearly impossible to prove the effectiveness of a scheduling algorithm. Hence, the main objective of this study is to develop a desktop grid simulator toolkit for measuring and modeling scheduler algorithm performance. The application was developed using a prototyping methodology, with the prototypes implemented in Java combined with a MySQL database. The core functionality of the simulator comprises job generation, volunteer generation, simulation of scheduling algorithms, generation of graphical charts, and report generation. Java was chosen as the implementation language due to its wide popularity, and the final system was delivered after two successful prototypes. In addition to the core functionality of a desktop grid simulator, advanced features such as viewing real-time graphical charts, generating PDF reports of simulation results, and exporting final results as CSV files have also been included.
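The job-generation, volunteer-generation, and algorithm-simulation loop described in this abstract can be sketched roughly as below. All function names and parameters are hypothetical (the toolkit itself is written in Java with a MySQL backend); Python is used here purely for brevity, and the scheduling policy shown is a generic earliest-free-volunteer assignment, not necessarily one of the toolkit's algorithms.

```python
import random

def generate_jobs(n, max_len=100, seed=0):
    # Job lengths in abstract work units (illustrative distribution).
    rng = random.Random(seed)
    return [rng.randint(1, max_len) for _ in range(n)]

def generate_volunteers(n, max_speed=10, seed=1):
    # Volunteer speeds in work units per time tick (illustrative).
    rng = random.Random(seed)
    return [rng.randint(1, max_speed) for _ in range(n)]

def simulate_fcfs(jobs, volunteers):
    # Assign each job, in arrival order, to the volunteer that becomes
    # free earliest; return the makespan of the schedule.
    free_at = [0.0] * len(volunteers)
    for length in jobs:
        v = min(range(len(volunteers)), key=lambda i: free_at[i])
        free_at[v] += length / volunteers[v]
    return max(free_at)

jobs = generate_jobs(50)
vols = generate_volunteers(5)
print(f"makespan: {simulate_fcfs(jobs, vols):.2f}")
```

A report-generation step would then aggregate per-volunteer utilization and makespan figures from `free_at` into charts or CSV rows.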

    Automatic Methods for Predicting Machine Availability in Desktop Grid and Peer-to-peer Systems

    In this paper, we examine the problem of predicting machine availability in desktop and enterprise computing environments. Predicting the duration for which a machine will run until it restarts (the availability duration) is critically useful for application scheduling and resource characterization in federated systems. We describe one parametric model-fitting technique and two non-parametric prediction techniques, comparing their accuracy in predicting the quantiles of empirically observed machine availability distributions. We describe each method analytically and evaluate its precision using a synthetic trace of machine availability constructed from a known distribution. To detail their practical efficacy, we apply them to machine availability traces from three separate desktop and enterprise computing environments and evaluate each method in terms of the accuracy with which it predicts availability in a trace-driven simulation. Our results indicate that availability duration can be predicted with quantifiable confidence bounds and that these bounds can be used as conservative bounds on lifetime predictions. Moreover, a non-parametric method based on a binomial approach generates the most accurate estimates.
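One standard way to realize a binomial, non-parametric quantile bound of the kind this abstract mentions is via order statistics: choose the largest sorted sample that is, with the desired confidence, below the true quantile of the availability distribution. The sketch below illustrates that general idea and is not the authors' exact formulation.

```python
from math import comb

def binom_cdf(k, n, p):
    # P(X <= k) for X ~ Binomial(n, p), computed exactly.
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def conservative_quantile(samples, q, confidence=0.95):
    """Non-parametric lower confidence bound on the q-quantile.

    Returns the largest order statistic X_(k) such that
    P(X_(k) <= true q-quantile) >= confidence, using the fact that
    the rank of the true quantile among n samples is Binomial(n, q).
    """
    xs = sorted(samples)
    n = len(xs)
    best = None
    for k in range(1, n + 1):
        # P(at least k samples fall below the true quantile)
        if 1 - binom_cdf(k - 1, n, q) >= confidence:
            best = k
        else:
            break  # the probability is decreasing in k
    if best is None:
        raise ValueError("not enough samples for this confidence level")
    return xs[best - 1]
```

For scheduling, such a bound can serve as a conservative estimate of how long a machine will remain available: with the stated confidence, at least a fraction q of availability durations exceed the returned value.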

    Decentralized Resource Availability Prediction in Peer-to-Peer Desktop Grids

    Grid computing is a form of distributed computing used by an organization to handle its long-running computational tasks. Volunteer computing (desktop grid) is a type of grid computing that runs its tasks on idle CPU cycles donated voluntarily by users. In a desktop grid model, the resources are not dedicated: a job (computational task) is submitted for execution on a resource only when the resource is idle, and there is no guarantee that a job which has started to execute will complete without disruption from user activity (such as a keystroke or mouse movement). This problem becomes more challenging in a peer-to-peer (P2P) model of desktop grids, where there is no central server to decide whether to allocate a job to a resource. In this thesis we propose and implement a P2P desktop grid framework that performs resource availability prediction. We improve the predictability of the system by submitting jobs to machines that have a higher probability of being available at a given time. We benchmark our framework and provide an analysis of our results.

    Enhancing reliability with Latin Square redundancy on desktop grids.

    Computational grids are some of the largest computer systems in existence today. Unfortunately they are also, in many cases, the least reliable. This research examines the use of redundancy with permutation as a method of improving reliability in computational grid applications. Three primary avenues are explored: development of a new redundancy model, the Replication and Permutation Paradigm (RPP), for computational grids; development of grid simulation software for testing RPP against other redundancy methods; and, finally, running a program on a live grid using RPP. An important part of RPP involves distributing data and tasks across the grid in Latin Square fashion. Two theorems and their proofs regarding Latin Squares are developed. The theorems describe the changing position of symbols between the rows of a standard Latin Square: when a symbol is missing because a column has been removed, the theorems provide a basis for determining the next row and column where the missing symbol can be found. Interesting in their own right, the theorems also have implications for redundancy: they allow one to state the maximum makespan in the face of missing computational hosts when using Latin Square redundancy. The simulator software was developed and used to compare different data and task distribution schemes on a simulated grid. It clearly showed the advantage of running RPP, which resulted in faster completion times in the face of computational host failures. The Latin Square method also fails gracefully: even with massive node failures, jobs complete, albeit with increased makespan. Finally, an Inductive Logic Program (ILP) for pharmacophore search was executed, using Latin Square redundancy, on a Condor grid in the Dahlem Lab at the University of Louisville Speed School of Engineering. All jobs completed, even in the face of large numbers of randomly generated computational host failures.
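The redundancy property underlying this scheme can be illustrated with a standard cyclic Latin Square: each task (symbol) appears exactly once in every host (column), so when f of the n hosts fail, every task loses exactly f of its n replicas and retains n - f. The sketch below demonstrates that counting argument with hypothetical names; it is not the dissertation's simulator or its theorems.

```python
def latin_square(n):
    # Standard cyclic Latin Square: row i is the rotation
    # (i, i+1, ..., i+n-1) mod n, so every symbol appears exactly
    # once per row and once per column.
    return [[(i + j) % n for j in range(n)] for i in range(n)]

def surviving_copies(square, failed_cols):
    # Count how many replicas of each symbol (task) survive when the
    # given columns (hosts) fail.
    n = len(square)
    counts = {s: 0 for s in range(n)}
    for row in square:
        for j, s in enumerate(row):
            if j not in failed_cols:
                counts[s] += 1
    return counts

sq = latin_square(5)
print(surviving_copies(sq, {1, 3}))  # every task keeps 5 - 2 = 3 replicas
```

Because every task keeps the same number of surviving replicas regardless of which hosts fail, one can bound the worst-case makespan as a function of the failure count alone, which is the practical content of the redundancy claim above.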

    Prediction of available computing capacities for a more efficient use of Grid resources

    Although computer hardware keeps getting faster, the available computing capacity is not sufficient for all problem types; especially in research and development departments, the demand for computing power is nearly unlimited. At the same time, large amounts of computing capacity sit idle. Such idle capacity can be found in computer pools, on office workstations, and even on home PCs, which are rarely fully utilized even while in use. Consequently, one of the goals of so-called "grid computing" is to use underutilized resources for the execution of compute-intensive tasks. The motivation behind this idea is not primarily the high utilization of all resources; instead, the goal is a reduction of costs compared with the alternative of purchasing additional hardware. Hence, a first contribution of this thesis is an analysis and quantification of this potential cost advantage. 
    The analysis quantifies the relevant cost factors and compares different usage scenarios. It delivers concrete figures for the costs arising in each scenario and, consequently, for the potential cost savings from using idle computing capacity. However, realizing these savings is hampered (above all when single machines execute long-running programs) by the strong variability of the available computing capacity over time: the progress of a computation can be slowed or even lost through sudden increases in resource utilization or, worse, through shutdowns or crashes. Accurate predictions of the computing capacity that will be available in the near future could alleviate this problem. Such predictions would be useful for several purposes, e.g., scheduling or choosing checkpoint intervals, whereas in most cases the prediction of a single value (for example, the expected value) is of limited use. Therefore, this thesis deals exclusively with predictions of probability distributions. First, the problem of assessing prediction methods that forecast probability distributions is discussed in detail; the main shortcomings of existing assessment criteria are identified, and more useful criteria are proposed. Second, the prediction problem itself is analyzed: conventional methods from the literature are examined and enhanced, and the modified methods are compared with the conventional ones using the proposed assessment criteria and data from real-world computers. The results clearly show the advantage of the methods proposed in this thesis.
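A concrete example of a criterion for assessing forecasts of entire probability distributions, rather than point forecasts, is the continuous ranked probability score (CRPS), which scores an empirical forecast distribution against an observed outcome. CRPS is a standard proper scoring rule shown here only for illustration; it is not necessarily among the criteria the thesis proposes.

```python
def empirical_crps(samples, observed):
    # CRPS of an empirical forecast distribution (given as samples)
    # against a single observed value, via the identity
    # CRPS(F, y) = E|X - y| - 0.5 * E|X - X'|; lower is better.
    n = len(samples)
    term1 = sum(abs(x - observed) for x in samples) / n
    term2 = sum(abs(a - b) for a in samples for b in samples) / (2 * n * n)
    return term1 - term2

# A forecast concentrated on the true value scores 0; forecasts
# centered away from the truth score progressively worse.
print(empirical_crps([5, 5, 5], 5))
print(empirical_crps([4, 5, 6], 5))
print(empirical_crps([7, 8, 9], 5))
```

Applied to predicted distributions of available computing capacity, such a score rewards forecasts that place probability mass near the capacity that actually materializes, which is exactly what scheduling and checkpoint-interval decisions need.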

    Using Workload Prediction and Federation to Increase Cloud Utilization

    The widespread adoption of cloud computing has changed how large-scale computing infrastructure is built and managed. Infrastructure-as-a-Service (IaaS) clouds consolidate separate workloads onto a shared platform and provide a consistent quality of service by overprovisioning capacity. This additional capacity, however, remains idle for extended periods of time and represents a drag on system efficiency. The smaller scale of private IaaS clouds compared to public clouds exacerbates overprovisioning inefficiencies, as opportunities for workload consolidation in private clouds are limited. Federation and cycle-harvesting capabilities from computational grids help to improve efficiency, but to date have seen only limited adoption in the cloud due to a fundamental mismatch between the usage models of grids and clouds: computational grids provide high throughput of queued batch jobs on a best-effort basis and enforce user priorities through dynamic job preemption, while IaaS clouds provide immediate feedback to user requests and make ahead-of-time guarantees about resource availability. We present a novel method to enable workload federation across IaaS clouds that overcomes this mismatch between grid and cloud usage models and improves system efficiency while also offering availability guarantees. We develop a new method for faster-than-realtime simulation of IaaS clouds to make predictions about system utilization and leverage this method to estimate the future availability of preemptible resources in the cloud. We then use these estimates to perform careful admission control and provide ahead-of-time bounds on the preemption probability of federated jobs executing on preemptible resources. Finally, we build an end-to-end prototype that addresses practical issues of workload federation and evaluate its efficacy using real-world traces from big data and compute-intensive production workloads.
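The admission-control step described above can be sketched as follows: given idle-capacity samples produced by a (hypothetical) faster-than-realtime simulation of the cloud, a preemptible federated job is admitted only if its estimated preemption probability stays within the advertised bound. All names here are illustrative and not the prototype's actual API.

```python
def admit_federated_job(predicted_idle, job_demand, max_preempt_prob):
    # predicted_idle: simulated idle-capacity samples covering the job's
    # runtime. The job is at risk of preemption whenever idle capacity
    # drops below its demand; the empirical frequency of such shortfalls
    # estimates the preemption probability.
    shortfalls = sum(1 for idle in predicted_idle if idle < job_demand)
    preempt_prob = shortfalls / len(predicted_idle)
    return preempt_prob <= max_preempt_prob, preempt_prob

# A job needing 4 units is admitted when only 1 of 4 simulated
# intervals falls short, but rejected when shortfalls dominate.
print(admit_federated_job([10, 10, 10, 2], 4, 0.3))
print(admit_federated_job([1, 1, 10], 4, 0.3))
```

The returned probability doubles as the ahead-of-time bound advertised to the federated workload, reconciling the grid's best-effort preemption model with the cloud's guarantee-oriented one.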