2,687 research outputs found

    Reliability of Heterogeneous Distributed Computing Systems in the Presence of Correlated Failures

    While the reliability of distributed-computing systems (DCSs) has been widely studied under the assumption that computing elements (CEs) fail independently, the impact of correlated failures of CEs on the reliability remains an open question. Here, the problem of modeling and assessing the impact of stochastic, correlated failures on the service reliability of applications running on DCSs is tackled. The service reliability is modeled using an integrated analytical and Monte-Carlo (MC) approach. The analytical component of the model generalizes a previously developed model for the reliability of non-Markovian DCSs to a setting where specific patterns of simultaneous failures in CEs are allowed. The analytical model is complemented by an MC-based procedure to draw correlated-failure patterns using the recently reported concept of probabilistic shared risk groups (PSRGs). The reliability model is further utilized to develop and optimize a novel class of dynamic task reallocation (DTR) policies that maximize the reliability of DCSs in the presence of correlated failures. Theoretical predictions, MC simulations, and results from an emulation testbed show that the reliability can be improved when DTR policies correctly account for correlated failures. The impact of correlated failures of CEs on the reliability and the key dependence of DTR policies on the type of correlated failures are also investigated.
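The idea of drawing correlated-failure patterns from probabilistic shared risk groups can be illustrated with a minimal sketch. This is not the paper's model: it ignores the non-Markovian timing and task reallocation, and it assumes a simplified PSRG in which each group has an event probability and each member CE fails with a conditional probability once the event occurs. The names `sample_failed` and `estimate_survival`, and the data layout, are hypothetical.

```python
import random


def sample_failed(groups, rng):
    """Draw one correlated-failure pattern.

    groups: list of (event_prob, {ce_name: conditional_failure_prob}).
    This layout is an assumption made for illustration only.
    """
    failed = set()
    for event_prob, members in groups:
        # The shared risk event strikes the whole group with probability event_prob.
        if rng.random() < event_prob:
            for ce, p_fail in members.items():
                # Given the event, each member fails with its own probability.
                if rng.random() < p_fail:
                    failed.add(ce)
    return failed


def estimate_survival(ces, groups, min_alive, trials=20000, seed=0):
    """Monte-Carlo estimate of P(at least min_alive CEs remain working)."""
    rng = random.Random(seed)
    ce_set = set(ces)
    successes = 0
    for _ in range(trials):
        failed = sample_failed(groups, rng) & ce_set
        if len(ce_set) - len(failed) >= min_alive:
            successes += 1
    return successes / trials
```

Because both CEs of a group can fail in the same draw, the estimate differs from what independent per-CE failure probabilities would give, which is the effect the abstract highlights.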

    An Agent Based Transaction Manager for Multidatabase Systems

    A multidatabase system (MDBMS) is a facility that allows users to access data located in multiple autonomous database management systems (DBMSs) at different sites. A reliable global atomic commitment protocol is one way to ensure global atomicity for multidatabase transactions. In this protocol, a centralized transaction manager (TM) receives global transactions and submits subtransactions to the appropriate sites via agents. An agent is a component of the MDBMS that runs on each site; after receiving subtransactions from the TM, the agents execute them and send the results back to the TM. We present a proof-of-concept Java application for an Agent Based Transaction Manager that preserves global atomicity. It provides a user-friendly interface through which a reliable atomic commitment protocol for global transaction execution in a multidatabase environment can be visualized. We demonstrate how the protocol works with three different test case scenarios. This is useful for further research in this area, where the atomicity of transactions can be verified for protocol correctness.
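The interaction between the transaction manager and the per-site agents can be sketched as follows. The abstract does not name the commitment protocol, so this sketch assumes a plain two-phase commit as a stand-in; the classes `Agent` and `TransactionManager` and their methods are hypothetical and do not reproduce the Java application described above.

```python
class Agent:
    """Hypothetical per-site agent that executes a subtransaction locally."""

    def __init__(self, site, can_commit=True):
        self.site = site
        self.can_commit = can_commit  # simulates local success or failure
        self.state = "idle"

    def prepare(self, subtransaction):
        # Phase 1: run the subtransaction and vote on the outcome.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


class TransactionManager:
    """Hypothetical centralized TM that coordinates the agents."""

    def __init__(self, agents):
        self.agents = {agent.site: agent for agent in agents}

    def run(self, global_transaction):
        # global_transaction maps a site name to its subtransaction.
        involved = [self.agents[site] for site in global_transaction]
        votes = [a.prepare(global_transaction[a.site]) for a in involved]
        # Phase 2: commit everywhere only if every site voted yes.
        if all(votes):
            for agent in involved:
                agent.commit()
            return "committed"
        for agent in involved:
            agent.abort()
        return "aborted"
```

Global atomicity here means that the sites involved in a global transaction all end in the same state, either all committed or all aborted.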

    Dynamic Change of Server Assignments in Distributed Workflow Management Systems

    Workflow management systems (WfMS) offer a promising approach for realizing process-oriented information systems. Central WfMS, with a single server controlling all workflow (WF) instances, however, may quickly become overloaded. Many approaches in the literature therefore suggest using a multi-server WfMS with distributed WF control. In such a distributed WfMS, the concrete WF server that controls a particular WF activity is usually defined by an associated server assignment. With such a partitioning approach, problems may occur if components (WF servers, subnets, or gateways) become overloaded or break down. As we know from other fields of computer science, a favorable approach to handling such cases may be to change the hardware assignment dynamically. This corresponds to the dynamic change of server assignments in WfMS. This paper analyses to what extent this approach is reasonable in such situations.

    Adaptable Mobile Transactions and Environment Awareness

    Mobile environments are characterized by high variability (e.g. variable bandwidth, disconnections, different communication prices) as well as by limited mobile host resources. Such characteristics lead to high rates of transaction failures and unpredictable execution costs. This paper introduces an Adaptable Mobile Transaction model (AMT) that allows defining transactions with several execution alternatives, each associated with a particular context. The principal goal is to adapt transaction execution to context variations. An analytical study shows that using AMTs increases commit probabilities and that it is possible to choose the way transactions will be executed according to their costs. In addition, the middleware TransMobi is proposed. It manages environment awareness and implements the AMT model with suitable protocols.
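The notion of a transaction with several context-dependent execution alternatives can be illustrated with a small sketch. It is only an illustration under stated assumptions, not the AMT model or the TransMobi API: the `Alternative` type, the context keys, and the rule "pick the cheapest alternative whose context condition holds" are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Alternative:
    """One way of executing a transaction (hypothetical structure)."""
    name: str
    cost: float                       # assumed estimated execution cost
    applies: Callable[[dict], bool]   # context condition for this alternative


def choose_alternative(alternatives, context) -> Optional[Alternative]:
    """Return the cheapest alternative usable in the current context, if any."""
    usable = [alt for alt in alternatives if alt.applies(context)]
    return min(usable, key=lambda alt: alt.cost) if usable else None


# Example alternatives; the context keys are assumptions for illustration.
ALTERNATIVES = [
    Alternative("online", 5.0,
                lambda c: c["connected"] and c["bandwidth_kbps"] >= 100),
    Alternative("compressed", 3.0, lambda c: c["connected"]),
    Alternative("deferred", 1.0, lambda c: not c["connected"]),
]
```

Calling `choose_alternative(ALTERNATIVES, {"connected": False, "bandwidth_kbps": 0})` selects the `deferred` alternative, which shows how a disconnection leads to a different execution path instead of a failed transaction.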

    DeepPR: Progressive Recovery for Interdependent VNFs with Deep Reinforcement Learning

    The increasing reliance upon cloud services entails more flexible networks that are realized by virtualized network equipment and functions. When such advanced network systems face a massive failure caused by natural disasters or attacks, the recovery of the entire system may be conducted progressively due to limited repair resources. The prioritization of network equipment in the recovery phase influences the interim computation and communication capability of systems, since the systems are operated under partial functionality. Hence, finding the best recovery order is a critical problem, which is further complicated by virtualization due to dependency among network nodes and layers. This paper deals with a progressive recovery problem under limited resources in networks with VNFs, where some dependent network layers exist. We prove the NP-hardness of the progressive recovery problem and approach the optimum solution by introducing DeepPR, a progressive recovery technique based on Deep Reinforcement Learning (Deep RL). Our simulation results indicate that DeepPR can achieve near-optimal solutions in certain networks and is more robust to adversarial failures than a baseline heuristic algorithm. Comment: Technical Report, 12 pages