Search CORE

202 research outputs found

Autonomic Management using Self-Stabilization for Hierarchical and Distributed Middleware

Author: Caron Eddy
Faye Maurice-Djibril
Thiare Ousmane
Publication venue: IEEE Computer Society Press
Publication date: 26/10/2015
Field of study

International audienceDynamic nature of distributed architecture is a major challenge to avail the benefits of distributed computing. An effective solution to deal with this dynamic nature is to implement a self-adaptive mechanism to sustain the distributed architecture. Self-adaptive systems can autonomously modify their behavior at run-time in response to changes in their environment. This capability may be included in the software systems at design time or later by external mechanisms. Our paper describes the self- adaptive algorithm that we developed for an existing middleware. Once the middleware is deployed, it can detect a set of events which indicate an unstable deployment state. When an event is detected, some instructions are executed to handle the event. We have designed a simulator to have a deeper insights of our proposed self-adaptive algorithm. Results of our simulated experiments validate the safe convergence of the algorithm

HAL-ENS-LYON

Crossref

INRIA a CCSD electronic archive server

Hal-Diderot

1st Workshop on Model-driven Software Adaptation : M-ADAPT'07 at ECOOP 2007 (Proceedings)

Author
Publication venue
Publication date: 01/01/2007
Field of study

DepositOnce

Monitoring and Optimization of ATLAS Tier 2 Center GoeGrid

Author: Magradze Erekle
Publication venue: University Goettingen Repository
Publication date: 11/01/2016
Field of study

The demand on computational and storage resources is growing along with the amount of infor- mation that needs to be processed and preserved. In order to ease the provisioning of the digital services to the growing number of consumers, more and more distributed computing systems and platforms are actively developed and employed. The building block of the distributed computing infrastructure are single computing centers, similar to the Worldwide LHC Computing Grid, Tier 2 centre GoeGrid. The main motivation of this thesis was the optimization of GoeGrid perfor- mance by efficient monitoring. The goal has been achieved by means of the GoeGrid monitoring information analysis. The data analysis approach was based on the adaptive-network-based fuzzy inference system (ANFIS) and machine learning algorithm such as Linear Support Vector Machine (SVM). The main object of the research was the digital service, since availability, reliability and ser- viceability of the computing platform can be measured according to the constant and stable provisioning of the services. Due to the widely used concept of the service oriented architecture (SOA) for large computing facilities, in advance knowing of the service state as well as the quick and accurate detection of its disability allows to perform the proactive management of the com- puting facility. The proactive management is considered as a core component of the computing facility management automation concept, such as Autonomic Computing. Thus in time as well as in advance and accurate identification of the provided service status can be considered as a contribution to the computing facility management automation, which is directly related to the provisioning of the stable and reliable computing resources. Based on the case studies, performed using the GoeGrid monitoring data, consideration of the approaches as generalized methods for the accurate and fast identification and prediction of the service status is reasonable. Simplicity and low consumption of the computing resources allow to consider the methods in the scope of the Autonomic Computing component

Georg-August-University Göttingen

CERN Document Server

Engineering Self-Adaptive Collective Processes for Cyber-Physical Ecosystems

Author: Casadei Roberto <1990>
Publication venue: Alma Mater Studiorum - Università di Bologna
Publication date: 02/04/2020
Field of study

The pervasiveness of computing and networking is creating significant opportunities for building valuable socio-technical systems. However, the scale, density, heterogeneity, interdependence, and QoS constraints of many target systems pose severe operational and engineering challenges. Beyond individual smart devices, cyber-physical collectives can provide services or solve complex problems by leveraging a “system effect” while coordinating and adapting to context or environment change. Understanding and building systems exhibiting collective intelligence and autonomic capabilities represent a prominent research goal, partly covered, e.g., by the field of collective adaptive systems. Therefore, drawing inspiration from and building on the long-time research activity on coordination, multi-agent systems, autonomic/self-* systems, spatial computing, and especially on the recent aggregate computing paradigm, this thesis investigates concepts, methods, and tools for the engineering of possibly large-scale, heterogeneous ensembles of situated components that should be able to operate, adapt and self-organise in a decentralised fashion. The primary contribution of this thesis consists of four main parts. First, we define and implement an aggregate programming language (ScaFi), internal to the mainstream Scala programming language, for describing collective adaptive behaviour, based on field calculi. Second, we conceive of a “dynamic collective computation” abstraction, also called aggregate process, formalised by an extension to the field calculus, and implemented in ScaFi. Third, we characterise and provide a proof-of-concept implementation of a middleware for aggregate computing that enables the development of aggregate systems according to multiple architectural styles. Fourth, we apply and evaluate aggregate computing techniques to edge computing scenarios, and characterise a design pattern, called Self-organising Coordination Regions (SCR), that supports adjustable, decentralised decision-making and activity in dynamic environments.Con lo sviluppo di informatica e intelligenza artificiale, la diffusione pervasiva di device computazionali e la crescente interconnessione tra elementi fisici e digitali, emergono innumerevoli opportunità per la costruzione di sistemi socio-tecnici di nuova generazione. Tuttavia, l'ingegneria di tali sistemi presenta notevoli sfide, data la loro complessità—si pensi ai livelli, scale, eterogeneità, e interdipendenze coinvolti. Oltre a dispositivi smart individuali, collettivi cyber-fisici possono fornire servizi o risolvere problemi complessi con un “effetto sistema” che emerge dalla coordinazione e l'adattamento di componenti fra loro, l'ambiente e il contesto. Comprendere e costruire sistemi in grado di esibire intelligenza collettiva e capacità autonomiche è un importante problema di ricerca studiato, ad esempio, nel campo dei sistemi collettivi adattativi. Perciò, traendo ispirazione e partendo dall'attività di ricerca su coordinazione, sistemi multiagente e self-*, modelli di computazione spazio-temporali e, specialmente, sul recente paradigma di programmazione aggregata, questa tesi tratta concetti, metodi, e strumenti per l'ingegneria di ensemble di elementi situati eterogenei che devono essere in grado di lavorare, adattarsi, e auto-organizzarsi in modo decentralizzato. Il contributo di questa tesi consiste in quattro parti principali. In primo luogo, viene definito e implementato un linguaggio di programmazione aggregata (ScaFi), interno al linguaggio Scala, per descrivere comportamenti collettivi e adattativi secondo l'approccio dei campi computazionali. In secondo luogo, si propone e caratterizza l'astrazione di processo aggregato per rappresentare computazioni collettive dinamiche concorrenti, formalizzata come estensione al field calculus e implementata in ScaFi. Inoltre, si analizza e implementa un prototipo di middleware per sistemi aggregati, in grado di supportare più stili architetturali. Infine, si applicano e valutano tecniche di programmazione aggregata in scenari di edge computing, e si propone un pattern, Self-Organising Coordination Regions, per supportare, in modo decentralizzato, attività decisionali e di regolazione in ambienti dinamici

AMS Tesi di Dottorato

Découverte et allocation des ressources pour le traitement de requêtes dans les systèmes grilles

Author: Cokuslu Deniz
Publication venue
Publication date: 02/11/2012
Field of study

De nos jours, les systèmes Grille, grâce à leur importante capacité de calcul et de stockage ainsi que leur disponibilité, constituent l'un des plus intéressants environnements informatiques. Dans beaucoup de différents domaines, on constate l'utilisation fréquente des facilités que les environnements Grille procurent. Le traitement des requêtes distribuées est l'un de ces domaines où il existe de grandes activités de recherche en cours, pour transférer l'environnement sous-jacent des systèmes distribués et parallèles à l'environnement Grille. Dans le cadre de cette thèse, nous nous concentrons sur la découverte des ressources et des algorithmes d'allocation de ressources pour le traitement des requêtes dans les environnements Grille. Pour ce faire, nous proposons un algorithme de découverte des ressources pour le traitement des requêtes dans les systèmes Grille en introduisant le contrôle de topologie auto-stabilisant et l'algorithme de découverte des ressources dirigé par l'élection convergente. Ensuite, nous présentons un algorithme d'allocation des ressources, qui réalise l'allocation des ressources pour les requêtes d'opérateur de jointure simple par la génération d'un espace de recherche réduit pour les nœuds candidats et en tenant compte des proximités des candidats aux sources de données. Nous présentons également un autre algorithme d'allocation des ressources pour les requêtes d'opérateurs de jointure multiple. Enfin, on propose un algorithme d'allocation de ressources, qui apporte une tolérance aux pannes lors de l'exécution de la requête par l'utilisation de la réplication passive d'opérateurs à état. La contribution générale de cette thèse est double. Premièrement, nous proposons un nouvel algorithme de découverte de ressource en tenant compte des caractéristiques des environnements Grille. Nous nous adressons également aux problèmes d'extensibilité et de dynamicité en construisant une topologie efficace sur l'environnement Grille et en utilisant le concept d'auto-stabilisation, et par la suite nous adressons le problème de l'hétérogénéité en proposant l'algorithme de découverte de ressources dirigé par l'élection convergente. La deuxième contribution de cette thèse est la proposition d'un nouvel algorithme d'allocation des ressources en tenant compte des caractéristiques de l'environnement Grille. Nous abordons les problèmes causés par la grande échelle caractéristique en réduisant l'espace de recherche pour les ressources candidats. De ce fait nous réduisons les coûts de communication au cours de l'exécution de la requête en allouant des nœuds au plus près des sources de données. Et enfin nous traitons la dynamicité des nœuds, du point de vue de leur existence dans le système, en proposant un algorithme d'affectation des ressources avec une tolérance aux pannes.Grid systems are today's one of the most interesting computing environments because of their large computing and storage capabilities and their availability. Many different domains profit the facilities of grid environments. Distributed query processing is one of these domains in which there exists large amounts of ongoing research to port the underlying environment from distributed and parallel systems to the grid environment. In this thesis, we focus on resource discovery and resource allocation algorithms for query processing in grid environments. For this, we propose resource discovery algorithm for query processing in grid systems by introducing self-stabilizing topology control and converge-cast based resource discovery algorithms. Then, we propose a resource allocation algorithm, which realizes allocation of resources for single join operator queries by generating a reduced search space for the candidate nodes and by considering proximities of candidates to the data sources. We also propose another resource allocation algorithm for queries with multiple join operators. Lastly, we propose a fault-tolerant resource allocation algorithm, which provides fault-tolerance during the execution of the query by the use of passive replication of stateful operators. The general contribution of this thesis is twofold. First, we propose a new resource discovery algorithm by considering the characteristics of the grid environments. We address scalability and dynamicity problems by constructing an efficient topology over the grid environment using the self-stabilization concept; and we deal with the heterogeneity problem by proposing the converge-cast based resource discovery algorithm. The second main contribution of this thesis is the proposition of a new resource allocation algorithm considering the characteristics of the grid environment. We tackle the scalability problem by reducing the search space for candidate resources. We decrease the communication costs during the query execution by allocating nodes closer to the data sources. And finally we deal with the dynamicity of nodes, in terms of their existence in the system, by proposing the fault-tolerant resource allocation algorithm

Thèses en ligne de l'Université Toulouse III - Paul Sabatier

Distributed computing practice for large-scale science and engineering applications

Author: Aldinucci
Allan
Altintas
Anderson
Anderson
Andrews
Berman
Bhat
Bogden
Boghosian
Carilli
Chang
Coddington
Cole
Cooke
Cummings
Dean
Deelman
Docan
Docan
Ellis
Frey
Gallichio
Geyer
Goodman
Grimshaw
Jacob
Jha
Karonis
Kepner
Klasky
Lee
Luckow
MacLaren
Manke
Mattson
Mechoso
Nguyen
Oinn
Pallickara
Parashar
Park
Rabhi
Sarmenta
Scheidegger
Scott
Soh
Swendsen
Taylor
Trinder
van der Aalst
Zaroubi
Zhang
Zhang
Publication venue: 'Wiley'
Publication date: 01/01/2013
Field of study

It is generally accepted that the ability to develop large-scale distributed applications has lagged seriously behind other developments in cyberinfrastructure. In this paper, we provide insight into how such applications have been developed and an understanding of why developing applications for distributed infrastructure is hard. Our approach is unique in the sense that it is centered around half a dozen existing scientific applications; we posit that these scientific applications are representative of the characteristics, requirements, as well as the challenges of the bulk of current distributed applications on production cyberinfrastructure (such as the US TeraGrid). We provide a novel and comprehensive analysis of such distributed scientific applications. Specifically, we survey existing models and methods for large-scale distributed applications and identify commonalities, recurring structures, patterns and abstractions. We find that there are many ad hoc solutions employed to develop and execute distributed applications, which result in a lack of generality and the inability of distributed applications to be extensible and independent of infrastructure details. In our analysis, we introduce the notion of application vectors: a novel way of understanding the structure of distributed applications. Important contributions of this paper include identifying patterns that are derived from a wide range of real distributed applications, as well as an integrated approach to analyzing applications, programming systems and patterns, resulting in the ability to provide a critical assessment of the current practice of developing, deploying and executing distributed applications. Gaps and omissions in the state of the art are identified, and directions for future research are outlined

Crossref

Online Research @ Cardiff

Edinburgh Research Explorer

Self-management for large-scale distributed systems

Author: Al-Shishtawy Ahmad
Publication venue: 'Breakthrough Institute, Rockefeller Philanthropy Advisors'
Publication date: 01/01/2012
Field of study

Autonomic computing aims at making computing systems self-managing by using autonomic managers in order to reduce obstacles caused by management complexity. This thesis presents results of research on self-management for large-scale distributed systems. This research was motivated by the increasing complexity of computing systems and their management. In the first part, we present our platform, called Niche, for programming self-managing component-based distributed applications. In our work on Niche, we have faced and addressed the following four challenges in achieving self-management in a dynamic environment characterized by volatile resources and high churn: resource discovery, robust and efficient sensing and actuation, management bottleneck, and scale. We present results of our research on addressing the above challenges. Niche implements the autonomic computing architecture, proposed by IBM, in a fully decentralized way. Niche supports a network-transparent view of the system architecture simplifying the design of distributed self-management. Niche provides a concise and expressive API for self-management. The implementation of the platform relies on the scalability and robustness of structured overlay networks. We proceed by presenting a methodology for designing the management part of a distributed self-managing application. We define design steps that include partitioning of management functions and orchestration of multiple autonomic managers. In the second part, we discuss robustness of management and data consistency, which are necessary in a distributed system. Dealing with the effect of churn on management increases the complexity of the management logic and thus makes its development time consuming and error prone. We propose the abstraction of Robust Management Elements, which are able to heal themselves under continuous churn. Our approach is based on replicating a management element using finite state machine replication with a reconfigurable replica set. Our algorithm automates the reconfiguration (migration) of the replica set in order to tolerate continuous churn. For data consistency, we propose a majority-based distributed key-value store supporting multiple consistency levels that is based on a peer-to-peer network. The store enables the tradeoff between high availability and data consistency. Using majority allows avoiding potential drawbacks of a master-based consistency control, namely, a single-point of failure and a potential performance bottleneck. In the third part, we investigate self-management for Cloud-based storage systems with the focus on elasticity control using elements of control theory and machine learning. We have conducted research on a number of different designs of an elasticity controller, including a State-Space feedback controller and a controller that combines feedback and feedforward control. We describe our experience in designing an elasticity controller for a Cloud-based key-value store using state-space model that enables to trade-off performance for cost. We describe the steps in designing an elasticity controller. We continue by presenting the design and evaluation of ElastMan, an elasticity controller for Cloud-based elastic key-value stores that combines feedforward and feedback control

Publikationer från KTH

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

Software institutes' Online Digital Archive

Engineering of IT Management Automation along Task Analysis, Loops, Function Allocation, Machine Capabilities

Author: König Ralf
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 26/04/2010
Field of study

Digitale Hochschulschriften der LMU

Autonomous and Autonomic Systems With Applications to NASA Intelligent Spacecraft Operations and Exploration Systems

Author: Hallock H.L.
Hinchey M.G.
Karlin J.
Rash J.
Rouff C.
Sterritt Roy
Truszkowski W.
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 04/12/2009
Field of study

Ulster University's Research Portal

StarMX: A Framework for Developing Self-Managing Software Systems

Author: Asadollahi Reza
Publication venue: 'University of Waterloo'
Publication date: 01/01/2009
Field of study

The scale of computing systems has extensively grown over the past few decades in order to satisfy emerging business requirements. As a result of this evolution, the complexity of these systems has increased significantly, which has led to many difficulties in managing and administering them. The solution to this problem is to build systems that are capable of managing themselves, given high-level objectives. This vision is also known as Autonomic Computing. A self-managing system is governed by a closed control loop, which is responsible for dynamically monitoring the underlying system, analyzing the observed situation, planning the recovering actions, and executing the plan to maintain the system equilibrium. The realization of such systems poses several developmental and operational challenges, including: developing their architecture, constructing the control loop, and creating services that enable dynamic adaptation behavior. Software frameworks are effective in addressing these challenges: they can simplify the development of such systems by reducing design and implementation efforts, and they provide runtime services for supporting self-managing behavior. This dissertation presents a novel software framework, called StarMX, for developing adaptive and self-managing Java-based systems. It is a generic configurable framework based on standards and well-established principles, and provides the required features and facilities for the development of such systems. It extensively supports Java Management Extensions (JMX) and is capable of integrating with different policy engines. This allows the developer to incorporate and use these techniques in the design of a control loop in a flexible manner. The control loop is created as a chain of entities, called processes, such that each process represents one or more functions of the loop (monitoring, analyzing, planning, and executing). A process is implemented by either a policy language or the Java language. At runtime, the framework invokes the chain of processes in the control loop, providing each one with the required set of objects for monitoring and effecting. An open source Java-based Voice over IP system, called CC2, is selected as the case study used in a set of experiments that aim to capture a solid understanding of the framework suitability for developing adaptive systems and to improve its feature set. The experiments are also used to evaluate the performance overhead incurred by the framework at runtime. The performance analysis results show the execution time spent in different components, including the framework itself, the policy engine, and the sensors/effectors. The results also reveal that the time spent in the framework is negligible, and it has no considerable impact on the system's overall performance

University of Waterloo's Institutional Repository