444 research outputs found

    Resilient network dimensioning for optical grid/clouds using relocation

    In this paper we address the problem of dimensioning infrastructure, comprising both network and server resources, for large-scale decentralized distributed systems such as grids or clouds. We will provide an overview of our work in this area, and in particular focus on how to design the resulting grid/cloud to be resilient against network link and/or server site failures. To this end, we will exploit relocation: under failure conditions, a request may be sent to a destination other than the one used under failure-free conditions. We will provide a comprehensive overview of related work in this area, and focus in some detail on our own most recent work. The latter comprises a case study where traffic has a known origin, but we assume a degree of freedom as to where it ends up being processed, which is typically the case for, e.g., grid applications of the bag-of-tasks (BoT) type or for providing cloud services. In particular, we will provide in this paper a new integer linear programming (ILP) formulation to solve the resilient grid/cloud dimensioning problem using failure-dependent backup routes. Our algorithm will simultaneously decide on server and network capacity. We find that in the anycast routing problem we address, the benefit of using failure-dependent (FD) rerouting is limited compared to failure-independent (FID) backup routing. We confirm our earlier findings in terms of network capacity savings achieved by relocation compared to not exploiting relocation (order of 6-10% in the current case studies).

    Spare capacity allocation using shared backup path protection for dual link failures

    This paper extends the spare capacity allocation (SCA) problem from single link failure [1] to dual link failures on mesh-like IP or WDM networks. The SCA problem pre-plans traffic flows with one working and two backup paths that are mutually disjoint, using the shared backup path protection (SBPP) scheme. The aggregated spare provision matrix (SPM) is used to capture the spare capacity sharing for dual link failures. Compared to previous work by He and Somani [2], this method has better scalability and flexibility. The SCA problem is formulated as a non-linear integer programming model and partitioned into two sequential linear sub-models: one finds all primary backup paths first, and the other finds all secondary backup paths next. The results on five networks show that the network redundancy using dedicated 1+1+1 is in the range of 313-400%. It drops to 96-181% in 1:1:1 without loss of dual-link resiliency, but with the trade-off of using the complicated spare capacity sharing among backup paths. The hybrid 1+1:1 provides an intermediate redundancy ratio of 187-310% with moderate complexity. We also compare the passive/active approaches, which consider spare capacity sharing after/during the backup path routing process. The active sharing approaches always achieve lower redundancy values than the passive ones. These reduction percentages are about 12% for 1+1:1 and 25% for 1:1:1, respectively.
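    The gap between dedicated and shared redundancy reported in this abstract can be illustrated with a toy calculation. The sketch below is not the paper's SPM formulation. It is a simplified illustration under stated assumptions: each flow has one working and two backup paths given as sets of link names, a flow switches to its first backup path that survives a dual link failure, and the function name `spare_capacity` and the flow dictionary keys are hypothetical. Dedicated protection reserves the demand on every backup link, while shared protection reserves only the worst-case load over all dual-failure scenarios.

```python
from itertools import combinations


def spare_capacity(links, flows):
    """Return (dedicated_total, shared_total) spare capacity.

    links: list of link names.
    flows: list of dicts with keys 'demand', 'working', 'backup1',
           'backup2' (hypothetical layout; the paths are sets of links).
    """
    dedicated = {l: 0 for l in links}
    shared = {l: 0 for l in links}

    # Dedicated protection: every backup link reserves the full demand.
    for f in flows:
        for l in f["backup1"] | f["backup2"]:
            dedicated[l] += f["demand"]

    # Shared protection: reserve the worst case over all dual link failures.
    for pair in combinations(links, 2):
        failed = set(pair)
        load = {l: 0 for l in links}
        for f in flows:
            if not (f["working"] & failed):
                continue  # working path survives, no backup needed
            if not (f["backup1"] & failed):
                active = f["backup1"]
            elif not (f["backup2"] & failed):
                active = f["backup2"]
            else:
                continue  # flow cannot be restored in this scenario
            for l in active:
                load[l] += f["demand"]
        for l in links:
            shared[l] = max(shared[l], load[l])

    return sum(dedicated.values()), sum(shared.values())


flows = [
    {"demand": 1, "working": {"a"}, "backup1": {"b"}, "backup2": {"c"}},
    {"demand": 1, "working": {"d"}, "backup1": {"b"}, "backup2": {"c"}},
]
result = spare_capacity(["a", "b", "c", "d"], flows)
```

    In this toy instance the two flows never need link `c` at the same time, so sharing lowers the total reserved spare capacity relative to dedicated protection.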

    Spare capacity allocation using partially disjoint paths for dual link failure protection

    A shared backup path protection (SBPP) scheme can be used to protect against dual link failures by pre-planning each traffic flow with one working and two backup paths that are mutually disjoint, while minimizing the network overbuild. However, many existing backbone networks are bi-connected without three fully disjoint paths between all node pairs. Hence, in practice, partially disjoint paths (PDP) have been used for backup paths instead of fully disjoint ones. This paper studies the minimum spare capacity allocation (SCA) problem using PDP within an optimization framework. This is an extension of the spare provision matrix (SPM) method for PDP. The integer linear programming (ILP) model is formulated and an approximation algorithm, Successive Survivable Routing (SSR), is extended and used in the numerical study. © 2013 Scientific Assoc for infocom

    Joint dimensioning of server and network infrastructure for resilient optical grids/clouds

    We address the dimensioning of infrastructure, comprising both network and server resources, for large-scale decentralized distributed systems such as grids or clouds. We design the resulting grid/cloud to be resilient against network link or server failures. To this end, we exploit relocation: Under failure conditions, a grid job or cloud virtual machine may be served at an alternate destination (i.e., different from the one under failure-free conditions). We thus consider grid/cloud requests to have a known origin, but assume a degree of freedom as to where they end up being served, which is the case for grid applications of the bag-of-tasks (BoT) type or hosted virtual machines in the cloud case. We present a generic methodology based on integer linear programming (ILP) that: 1) chooses a given number of sites in a given network topology at which to install server infrastructure; and 2) determines the amount of both network and server capacity to cater for both the failure-free scenario and failures of links or nodes. For the latter, we consider either failure-independent (FID) or failure-dependent (FD) recovery. Case studies on European-scale networks show that relocation allows considerable reduction of the total amount of network and server resources, especially in sparse topologies and for higher numbers of server sites. Adopting a failure-dependent backup routing strategy does lead to lower resource dimensions, but only when we adopt relocation (especially for a high number of server sites): Without exploiting relocation, potential savings of FD versus FID are not meaningful.

    A robust optimization approach to backup network design with random failures

    This paper presents a scheme in which a dedicated backup network is designed to provide protection from random link failures. Upon a link failure in the primary network, traffic is rerouted through a preplanned path in the backup network. We introduce a novel approach for dealing with random link failures, in which probabilistic survivability guarantees are provided to limit capacity over-provisioning. We show that the optimal backup routing strategy in this respect depends on the reliability of the primary network. Specifically, as primary links become less likely to fail, the optimal backup networks employ more resource sharing amongst backup paths. We apply results from the field of robust optimization to formulate an ILP for the design and capacity provisioning of these backup networks. We then propose a simulated annealing heuristic to solve this problem for large-scale networks, and present simulation results that verify our analysis and approach. Funding: National Science Foundation (U.S.) (grant CNS-0626781); National Science Foundation (U.S.) (grant CNS-0830961); United States. Defense Threat Reduction Agency (grant HDTRA1-07-1-0004); United States. Defense Threat Reduction Agency (grant HDTRA-09-1-005

    Efficient Protection of Many-to-One Communications

    The dependability of a network is its ability to cope with failures, i.e., to maintain established connections even in case of failures. IP routing protocols (such as OSPF and RIP) do not fit the dependability objectives of today's applications. Moreover, forwarding techniques based on destination address (like IP) induce many-to-one connections. If a dependable connection is needed, all primary paths and protections having the same destination must be established in a coordinated way. In this paper, we propose a fault recovery scheme for many-to-one connections based on a cold (preplanned) protection. The main advantage of our approach is that the recovery in case of failures is achieved within a short delay. Additionally, with respect to other approaches, the dependability of the routing scheme is increased in that it statistically copes with many failures. The algorithm we propose computes an efficient backup for an arbitrary primary tree using an improved multi-tree algorithm.

    Fundamental schemes to determine disjoint paths for multiple failure scenarios

    Disjoint path routing approaches can be used to cope with multiple failure scenarios. This can be achieved using a set of k (k>2) link- (or node-) disjoint path pairs (in single-cost and multi-cost networks). Alternatively, if Shared Risk Link Groups (SRLGs) information is available, the calculation of an SRLG-disjoint path pair (or of a set of such paths) can protect a connection against the joint failure of the set of links in any single SRLG. Paths traversing disaster-prone regions should be disjoint, but in safe regions it may be acceptable for the paths to share links or even nodes for a quicker recovery. Auxiliary algorithms for obtaining the shortest path from a source to a destination are also presented in detail, followed by the illustrated description of Bhandari’s and Suurballe’s algorithms for obtaining a pair of paths of minimal total additive cost. These algorithms are instrumental for some of the presented schemes to determine disjoint paths for multiple failure scenarios.
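    The pair-of-paths computation mentioned in this abstract can be sketched in a few lines. The code below is a minimal sketch of a Bhandari-style edge-disjoint path pair on an undirected graph with positive edge costs, not the text of the cited work. The function names `bellman_ford` and `disjoint_pair` and the edge-list input format are assumptions made for illustration. It finds a shortest path, removes its arcs in the forward direction and negates them in the reverse direction, finds a second shortest path in the modified graph, and then cancels any edge the two paths traverse in opposite directions.

```python
def bellman_ford(nodes, arcs, src, dst):
    """Shortest path that tolerates negative arc costs (no negative cycles)."""
    dist = {n: float("inf") for n in nodes}
    pred = {}
    dist[src] = 0
    for _ in range(len(nodes) - 1):
        changed = False
        for (u, v), cost in arcs.items():
            if dist[u] + cost < dist[v]:
                dist[v] = dist[u] + cost
                pred[v] = u
                changed = True
        if not changed:
            break
    if dist[dst] == float("inf"):
        return None
    path = [dst]
    while path[-1] != src:
        path.append(pred[path[-1]])
    return path[::-1]


def disjoint_pair(edges, src, dst):
    """Two edge-disjoint src-dst paths of minimal total cost, or None.

    edges: list of (u, v, cost) undirected edges with positive costs.
    """
    nodes = {n for u, v, _ in edges for n in (u, v)}
    arcs = {}
    for u, v, cost in edges:
        arcs[(u, v)] = cost
        arcs[(v, u)] = cost

    p1 = bellman_ford(nodes, arcs, src, dst)
    if p1 is None:
        return None
    p1_arcs = list(zip(p1, p1[1:]))

    # Remove forward arcs of the first path, negate the reverse arcs.
    modified = dict(arcs)
    for u, v in p1_arcs:
        del modified[(u, v)]
        modified[(v, u)] = -arcs[(u, v)]

    p2 = bellman_ford(nodes, modified, src, dst)
    if p2 is None:
        return None

    # Cancel edges used in opposite directions by the two paths.
    chosen = set(p1_arcs)
    for u, v in zip(p2, p2[1:]):
        if (v, u) in chosen:
            chosen.remove((v, u))
        else:
            chosen.add((u, v))

    # Walk the remaining arcs from src to dst twice.
    succ = {}
    for u, v in chosen:
        succ.setdefault(u, []).append(v)
    paths = []
    for _ in range(2):
        path = [src]
        while path[-1] != dst:
            path.append(succ[path[-1]].pop())
        paths.append(path)
    return paths


edges = [("A", "B", 1), ("B", "C", 1), ("C", "D", 1),
         ("A", "C", 3), ("B", "D", 3)]
pair = disjoint_pair(edges, "A", "D")
```

    In this example the shortest path A-B-C-D blocks any second disjoint path, which is why a simple "remove the first path and search again" approach fails here while the arc-negation step still recovers the pair A-B-D and A-C-D.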