    The ISQoS Grid Broker for Temporal and Budget Guarantees

    We introduce our Grid broker, which uses SLAs in job submission with the aim of ensuring that jobs are computed on time and on budget. We demonstrate the broker's ability to perform negotiation and to preferentially select higher-priority jobs in a tender market, and we discuss the architecture that makes this possible. We additionally show the effects of rescheduling and how careful consideration is required in order to avoid price instability. We therefore make recommendations on how to maintain this stability when rescheduling is used.

    A job response time prediction method for production Grid computing environments

    A major obstacle to the widespread adoption of Grid computing in both the scientific community and the industry sector is the difficulty of knowing in advance the running cost of a job submission, which is needed to plan a correct allocation of resources. Traditional distributed computing solutions take advantage of homogeneous and open environments to propose prediction methods that use a detailed analysis of the hardware and software components. However, production Grid computing environments, which are large and use a complex and dynamic set of resources, present a different challenge. In Grid computing the source code of applications, program libraries, and third-party software is not always available. In addition, Grid security policies may not permit running hardware or software analysis tools to generate models of Grid components. The objective of this research is the prediction of a job's response time in production Grid computing environments. The solution is inspired by the concept of predicting future Grid behaviours based on previous experiences learned from heterogeneous Grid workload trace data. The research objective was selected with the aim of improving the usability of Grid resources and the administration of Grid environments. The predicted data can be used to allocate resources in advance and to report forecast finishing times and running costs before submission. The proposed Grid Computing Response Time Prediction (GRTP) method implements several internal stages in which the workload traces are mined to produce a response time prediction for a given job. In addition, the GRTP method assesses the predicted result against the target job’s actual response time to infer information that is used to tune the method's setting parameters. The GRTP method was implemented and tested using a cross-validation technique to assess how the proposed solution generalises to independent data sets. 
The training set was taken from the Grid environment DAS (Distributed ASCI Supercomputer). The two testing sets were taken from the AuverGrid and Grid5000 Grid environments. Three consecutive tests were carried out: one assuming stable jobs, one assuming unstable jobs, and one using a job type method to select the most appropriate prediction function. The tests showed a significant increase in prediction performance for data mining based methods applied in Grid computing environments. For instance, in Grid5000 the GRTP method answered 77 percent of job prediction requests with an error of less than 10 percent, whereas in the same environment the most effective and accurate method using workload traces was only able to predict 32 percent of the cases within the same range of error. The GRTP method was able to handle unexpected changes in resources and services that affect job response time trends, and it was able to adapt to new scenarios. The tests showed that the proposed GRTP method is capable of predicting job response times and that it improves prediction quality when compared to other current solutions.
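
    The accuracy figures quoted above (the share of prediction requests answered with an error below 10 percent) can be illustrated with a minimal sketch. The abstract does not define the error measure, so this example assumes relative error against the actual response time; the function name and the sample values are hypothetical and are not taken from the GRTP evaluation.

```python
def fraction_within_error(predicted, actual, tolerance=0.10):
    """Share of predictions whose relative error is below `tolerance`.

    Assumption: error = |predicted - actual| / actual, with actual > 0.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one job is required")
    hits = 0
    for p, a in zip(predicted, actual):
        # Count the prediction as a hit when it is within the tolerance.
        if abs(p - a) / a < tolerance:
            hits += 1
    return hits / len(actual)


# Hypothetical response times in seconds (illustrative only).
predicted = [100.0, 210.0, 95.0, 400.0]
actual = [105.0, 200.0, 150.0, 410.0]
share = fraction_within_error(predicted, actual)
print(f"{share:.0%} of predictions are within 10% of the actual value")
```

    With these made-up numbers, three of the four predictions fall within the tolerance.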

    A generic scheduling architecture for service oriented distributed computing infrastructures

    In state-of-the-art distributed computing infrastructures, different kinds of resources are combined to offer complex services to customers. Today, service-oriented middleware stacks are the workhorses that connect resources and their users and implement all functions needed to provide those services. An analysis of the functionality of prominent middleware stacks shows that common challenges exist, such as scalability, manageability, efficiency, reliability, security, and complexity, and that they constitute major research areas in information and communication technologies in general and distributed systems in particular. One core issue, touching all of the aforementioned challenges, is the question of how to distribute units of work in a distributed computing infrastructure, a task generally referred to as scheduling. Integrating a variety of resources and services while complying with well-defined business objectives makes the development of scheduling strategies and services a difficult venture. For service-oriented distributed computing infrastructures, it translates to the assignment of services to activities over time with the aim of optimising multiple, potentially competing, quality-of-service criteria. Many concepts, methods, and tools for scheduling in distributed computing infrastructures exist, the majority of which are dedicated to providing algorithmic solutions and schedulers. We approach the problem from another angle and offer a more general answer to the question of ‘how to design an automated scheduling process and an architecture supporting it’. In doing so, we pay special attention to the service-oriented nature of the systems we consider and to the integration of our solutions into IT service management processes. Our answer comprises a number of assets that form a comprehensive scheduling solution for distributed computing infrastructures. 
Based on a requirements analysis of application scenarios, we provide a concept consisting of an automated scheduling process and the generic scheduling architecture supporting it. Process and architecture are based on four core models: a model to describe the activities to be executed, an information model to capture the capabilities of the infrastructure, a model to handle the life-cycle of service level agreements, which are the foundation for elaborate service management solutions, and a scheduling model capturing the specifics of state-of-the-art distributed systems. In addition to concept and models, we deliver realisations of our solutions that demonstrate their applicability in different application scenarios, spanning grid-like academic infrastructures as well as financial service infrastructures. Last but not least, we evaluate our scheduling model through simulations of artificial as well as realistic workload traces, thus showing the feasibility of the approach and the implications of its usage. The work at hand therefore offers a blueprint for developers of scheduling solutions for state-of-the-art distributed computing infrastructures. It contributes essential building blocks for realising such solutions and provides an important step towards integrating them into IT service management solutions.

    Negotiated resource brokering for quality of service provision of grid applications

    Grid Computing is a distributed computing paradigm in which many computers, often drawn from different organisations, work together so that their computing power may be aggregated. Grids are often heterogeneous, and resources vary significantly in CPU power, available RAM, disk space, OS, architecture, installed software, and so on. Added to this lack of uniformity is the fact that best-effort services are usually offered, as opposed to services that offer guarantees on completion time via the use of Service Level Agreements (SLAs). The lack of guarantees means the uptake of Grids is stifled. The challenge tackled here is to add such guarantees and thereby make users more willing to use the Grid, given their obvious reluctance to pay or contribute if the quality of the services returned lacks any guarantee. Grid resources are also finite in nature, hence priorities need to be established in order to best meet any guarantees placed on the limited resources available. An economic approach is hence adopted to ensure that end users reveal their true priorities for jobs, whilst also adding an incentive for provisioning services via a service charge. An economically oriented model is therefore proposed that provides SLAs with bicriteria constraints on time and cost. This model is tested via discrete event simulation, and a simulator capable of testing the model is presented. An architecture is then established that was developed to utilise the economic model for negotiating SLAs. Finally, experimentation is reported from the use of the developed software when it was deployed on a testbed, including admission control and steering of jobs within the Grid. Results are presented that show the interactions and relationship between the time and cost constraints within the model, including transitions between the dominance of one constraint over the other, as well as the effects of rescheduling on the market.
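
    The idea of an SLA with bicriteria constraints on time and cost can be sketched as a simple feasibility check over provider offers. This is only an illustration under stated assumptions: the `Offer` type, its field names, and the cheapest-feasible selection rule are hypothetical and are not described in the abstract, which does not specify how the broker ranks offers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Offer:
    """Hypothetical provider offer for running one job."""
    provider: str
    completion_time: float  # estimated completion time (arbitrary units)
    price: float            # quoted service charge (arbitrary units)


def select_offer(offers: List[Offer], deadline: float,
                 budget: float) -> Optional[Offer]:
    """Return the cheapest offer that meets both constraints, or None.

    Assumption: an offer is acceptable only if it satisfies the time
    constraint AND the cost constraint; ties on price go to the faster offer.
    """
    feasible = [o for o in offers
                if o.completion_time <= deadline and o.price <= budget]
    if not feasible:
        return None  # the job cannot be admitted under these constraints
    return min(feasible, key=lambda o: (o.price, o.completion_time))


# Illustrative offers only; the values are made up.
offers = [Offer("a", 50.0, 12.0), Offer("b", 80.0, 8.0), Offer("c", 120.0, 5.0)]
chosen = select_offer(offers, deadline=100.0, budget=10.0)
print(chosen)
```

    Here offer "a" exceeds the budget and offer "c" misses the deadline, so only offer "b" satisfies both constraints.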

    Orchestration of resources in distributed, heterogeneous grid environments using dynamic service level agreements

    In recent decades, the acceptance of the internet and the increase in network capacity have made it possible to transfer huge amounts of data efficiently and reliably between different computing systems worldwide. This enables new paradigms in the provision and use of distributed IT resources. Grid computing is one such well-known paradigm, in which computing resources owned by various institutions and organizations are used in a coordinated way in order to solve scientific and economic problems. Besides computing resources, data, storage, and software resources are also provided. The quality with which these resources are provided is becoming increasingly important. Quality properties include, for example, the minimum availability of computing resources, the maximum access time of a data store, or the maximum response time of a web-based application. 
For resource providers, offering resources with a defined quality means implementing specific processes to assure the quality of the provisioning process. On the other hand, resource providers can offer their services at different quality levels. Services with a lower quality can be offered more cheaply than those with a higher quality. Service consumers can therefore select the service with the appropriate service level in terms of their requirements and budget. This provides both parties, service provider and consumer, with more flexibility during the service provisioning process. Service level agreements (SLAs) are an accepted approach to realizing contracts for IT services and service qualities. They describe the functional and the non-functional requirements of IT services. Additionally, they define the compensation for delivering services that meet the defined requirements and the penalties for failing to meet them. This thesis examines methods for the negotiation and management of SLAs in distributed systems based on the WS-Agreement standard. The focus is on methods for SLA declaration, automated SLA negotiation and creation processes, monitoring of SLA guarantees, and the application of SLAs for coordinated IT resource provisioning. To this end, a protocol for the dynamic negotiation or renegotiation of SLAs is developed as an extension to the WS-Agreement specification. This includes the definition of a negotiation model for the exchange of offers between the negotiating partners. The subsequent SLA creation is an automated process in distributed systems. Since SLAs are a kind of electronic contract, a mechanism for validating the integrity of SLA offers was developed and is presented in detail. In addition, automatic methods for SLA guarantee evaluation are described. Finally, an orchestration service for co-allocating arbitrary resources such as computing and network resources is presented. 
The resource orchestration process has been realized using SLAs. The architecture of this service is evaluated and, based on the evaluation result, an advanced orchestration service architecture is conceived.

    Multi-Agent Systems

    A multi-agent system (MAS) is a system composed of multiple interacting intelligent agents. Multi-agent systems can be used to solve problems that are difficult or impossible for an individual agent or a monolithic system to solve. Agent systems are open and extensible systems that allow for the deployment of autonomous and proactive software components. Multi-agent systems have been proposed and used in several application domains.

    Dezentrales grid scheduling mittels computational intelligence (Decentralized grid scheduling using computational intelligence)

    The ever-growing need for universally available computing and storage capacity is increasingly being satisfied by new architectures for networked interaction between users and providers of computing resources. 
Today, research and industry have realized an infrastructure for the coordinated use of globally distributed computing resources. This so-called grid computing infrastructure is expected to become an integral part of the future global resource landscape. In such an environment, submitted computing jobs are no longer bound to a location but can be flexibly migrated between different resource providers. In the context of community grids, different users are organized into virtual organizations, which typically share, in addition to a joint research interest, various computing resources. However, cooperation among different virtual organizations currently occurs only very rarely because, besides technical issues, it requires more advanced decentralized grid scheduling concepts. In order to fully utilize the capabilities of a federated grid, it is essential to promote cross-community cooperation in the sense of a global grid infrastructure. At the same time, however, it is most important that local autonomy is maintained, as no virtual organization would voluntarily cede control of its local resources. Thus, the interaction among different communities in the grid brings both opportunities and challenges: the newly gained flexibility makes users even more demanding with respect to computing result delivery and wait times. The efficient operation of future computational grids depends heavily on the development of viable architectures and strategies for scheduling, i.e. the allocation of jobs to resources, and on powerful methods for the negotiation of job migrations. In this work, we therefore develop distributed scheduling strategies for computational grids using various methods from computational intelligence. Different virtual organizations are seen as autonomous entities that decide whether to accept or decline jobs. Here, jobs can be offered by other schedulers or by the local user communities. 
The authority and security of virtual organizations are maintained by following a restrictive information policy that strongly limits the exchange of system state information. The fully decentralized grid structure ensures that the scheduling concepts are also applicable in large-scale environments. Initially, several decentralized strategies are developed and tested by extensive simulations with real-world workload traces. The results give first insights into the dynamics and characteristics of the assumed grid federations. Based on these findings, the mechanisms for decision-making are refined and implemented in a newly designed modular scheduling architecture. Using an evolutionary fuzzy system, the decision-making and interaction between virtual organizations are realized and further optimized. In a last step, a coevolutionary algorithm is also applied to improve the scheduling decisions. The evaluation based on real workload recordings reveals that it is possible to achieve significantly shorter response times for all respective user communities. At the same time, we demonstrate the strong robustness of the procedures, both to changing grid environments and to changed user behavior. The results can be seen as a motivation for increased cross-community cooperation in terms of a global computational grid, as this results in a win-win situation for all participants.

    Computer Science and Technology Series : XV Argentine Congress of Computer Science. Selected papers

    CACIC'09 was the fifteenth Congress in the CACIC series. It was organized by the School of Engineering of the National University of Jujuy. The Congress included 9 Workshops with 130 accepted papers, 1 main Conference, 4 invited tutorials, several meetings related to Computer Science Education (Professors, PhD students, Curricula) and an International School with 5 courses. CACIC 2009 was organized following the traditional Congress format, with 9 Workshops covering a diversity of dimensions of Computer Science Research. Each topic was supervised by a committee of three chairs from different universities. The call for papers attracted a total of 267 submissions. An average of 2.7 review reports were collected for each paper, for a grand total of 720 review reports that involved about 300 different reviewers. A total of 130 full papers were accepted and 20 of them were selected for this book. Red de Universidades con Carreras en Informática (RedUNCI)