404 research outputs found

    Dimensioning backbone networks for multi-site data centers: exploiting anycast routing for resilience

    In the current era of big data, applications increasingly rely on powerful computing infrastructure residing in large data centers (DCs), often adopting cloud computing technology. Clearly, this necessitates efficient and resilient networking infrastructure to connect the users of these applications with the data centers hosting them. In this paper, we focus on backbone network infrastructure on large geographical scales (i.e., the so-called wide area networks), which typically adopts optical network technology. In particular, we study the problem of dimensioning such backbone networks: what bandwidth should each of the links provide for the traffic, originating at known sources, to reach the data centers? And possibly even: how many such DCs should we deploy, and at what locations? More concretely, we summarize our recent work that essentially addresses the following fundamental research questions: (1) Does the anycast routing strategy influence the amount of required network resources? (2) Can we exploit anycast routing for resilience purposes, i.e., relocate to a different DC under failure conditions, to reduce resource capacity requirements? (3) Is it advantageous to change anycast request destinations from one DC location to another, from one time period to the next, if service requests vary over time?
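As a rough illustration of the anycast idea in research questions (1) and (2), the following Python sketch ranks candidate DCs by shortest-path distance from a traffic source and keeps the closest one as primary and the next one as a relocation target. The topology, the link weights, and the function names `shortest_distances` and `pick_data_centers` are hypothetical and are not taken from the paper, which solves a full dimensioning problem rather than a per-request nearest-DC rule.

```python
import heapq

def shortest_distances(graph, source):
    """Dijkstra: weighted distance from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, weight in graph[node].items():
            nd = d + weight
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist

def pick_data_centers(graph, source, dcs):
    """Anycast choice (assumed rule): nearest DC is primary, second nearest is backup."""
    dist = shortest_distances(graph, source)
    ranked = sorted((dc for dc in dcs if dc in dist), key=lambda dc: (dist[dc], dc))
    primary = ranked[0] if ranked else None
    backup = ranked[1] if len(ranked) > 1 else None
    return primary, backup

# Hypothetical five-node backbone; weights are illustrative link costs.
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 1, "D": 5},
    "C": {"A": 4, "B": 1, "E": 1},
    "D": {"B": 5, "E": 1},
    "E": {"C": 1, "D": 1},
}
primary, backup = pick_data_centers(graph, "A", ["D", "E"])
```

Here the source is fixed at node A while the destination is free among the DC sites D and E, which is the degree of freedom that anycast routing exploits.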

    Exploring the benefit of rerouting multi-period traffic to multi-site data centers

    In cloud-like scenarios, demand is served at one of multiple possible data center (DC) destinations. Usually, the exact DC that is used can be freely chosen, which leads to an anycast routing problem. Furthermore, the demand volume is expected to change over time, e.g., following a diurnal pattern. Given that virtually all application domains today rely heavily on cloud-like services, it is important that the backbone networks connecting users to the DCs are resilient against failures. In this paper, we consider the problem of resiliently routing multi-period traffic: we need to find routes to both a primary DC and a backup DC (to be used in the case of failure of the primary one, or of the network connection to it), and also account for synchronization traffic between the primary and backup DCs. We formulate this as an optimization problem and adopt column generation, using a path formulation in two sub-problems: the (restricted) master problem selects "configurations" to use for each demand in each of the time epochs it lasts, while the pricing problem (PP) constructs a new "configuration" that can lead to lower overall costs (which we express as the number of network resources, i.e., bandwidth, required to serve the demand). Here, a "configuration" is defined by the network paths followed from the demand source to each of the two selected DCs, as well as that of the synchronization traffic in between the DCs. Our decomposition allows for PPs to be solved in parallel, for which we quantitatively explore the reduction in the time required to solve the overall routing problem. 
The key question that we address with our model is the potential benefit of rerouting traffic from one time epoch to the next: we compare several (re)routing strategies, allowing traffic that spans multiple time periods to i) not be rerouted in different periods, ii) only change the backup DC and routes, or iii) freely change both primary and backup DC choices and the routes toward them.
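To make the notion of a "configuration" and its bandwidth cost concrete, here is a minimal Python sketch. It assumes, as a simplification not stated in the abstract, that the primary, backup, and synchronization paths each reserve the full demand bandwidth and that every link is dimensioned for its peak load over all epochs. The names `links_of` and `link_capacities` and the toy data are hypothetical; this is not the column generation model itself.

```python
from collections import Counter

def links_of(path):
    """Undirected links traversed by a node path such as ["s", "a", "dc1"]."""
    return [tuple(sorted(hop)) for hop in zip(path, path[1:])]

def link_capacities(plan):
    """plan maps epoch -> list of (configuration, bandwidth).

    Assumption: every path of a configuration reserves the full bandwidth,
    and each link is dimensioned for its peak load over all epochs.
    """
    capacity = Counter()
    for served in plan.values():
        load = Counter()
        for config, bandwidth in served:
            for role in ("primary", "backup", "sync"):
                for link in links_of(config[role]):
                    load[link] += bandwidth
        for link, value in load.items():
            capacity[link] = max(capacity[link], value)
    return capacity

# One demand from source "s" lasting two epochs; the roles of the two DCs are swapped in epoch 1.
fixed = {"primary": ["s", "a", "dc1"], "backup": ["s", "b", "dc2"], "sync": ["dc1", "dc2"]}
moved = {"primary": ["s", "b", "dc2"], "backup": ["s", "a", "dc1"], "sync": ["dc1", "dc2"]}
plan = {0: [(fixed, 2)], 1: [(moved, 3)]}
capacity = link_capacities(plan)
total = sum(capacity.values())
```

Comparing `total` across plans that keep, partially change, or freely change the configuration per epoch mirrors strategies i) to iii), under the stated assumptions.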

    Resilience options for provisioning anycast cloud services with virtual optical networks

    Optical networks are crucial to support increasingly demanding cloud services. Delivering the requested quality of service (in particular latency) is key to successfully provisioning end-to-end services in clouds. Therefore, as for traditional optical network services, it is of utmost importance to guarantee that clouds are resilient to any failure of either network infrastructure (links and/or nodes) or data centers. A crucial concept in establishing cloud services is that of network virtualization: the physical infrastructure is logically partitioned into separate virtual networks. To guarantee end-to-end resilience for cloud services in such a set-up, we need to simultaneously route the services and map the virtual network, in such a way that an alternate routing in case of physical resource failures is always available. Note that combined control of the network and data center resources is exploited, and the anycast routing concept applies: we can choose the data center to provide server resources requested by the customer to optimize resource usage and/or resiliency. This paper investigates the design of scalable optimization models to perform the virtual network mapping resiliently. We compare various resilience options, and analyze their trade-off between bandwidth requirements and resiliency quality.

    Selecting the best locations for data centers in resilient optical grid/cloud dimensioning

    For optical grid/cloud scenarios, the dimensioning problem comprises not only deciding on the network dimensions (i.e., link bandwidths), but also choosing appropriate locations to install server infrastructure (i.e., data centers), as well as determining the amount of required server resources (for storage and/or processing). Given that users of such grid/cloud systems in general do not care about the exact physical locations of the server resources, a degree of freedom arises in choosing for each of their requests the most appropriate server location. We will exploit this anycast routing principle (i.e., the source of traffic is given, but the destination can be chosen rather freely) also to provide resilience: traffic may be relocated to alternate destinations in case of network/server failures. In this study, we propose to jointly optimize the link dimensioning and the location of the servers in an optical grid/cloud, where the anycast principle is applied for resiliency against either link or server node failures. While the data center location problem has some resemblance to either the classical p-center or k-means location problems, the anycast principle makes it much more difficult due to the requirement of link-disjoint paths for ensuring grid resiliency.

    Virtual-Mobile-Core Placement for Metro Network

    Traditional highly-centralized mobile core networks (e.g., Evolved Packet Core (EPC)) need to be constantly upgraded both in their network functions and backhaul links, to meet increasing traffic demands. Network Function Virtualization (NFV) is being investigated as a potential cost-effective solution for this upgrade. A virtual mobile core (here, virtual EPC, vEPC) provides deployment flexibility and scalability while reducing costs, network-resource consumption and application delay. Moreover, a distributed deployment of vEPC is essential for emerging paradigms like Multi-Access Edge Computing (MEC). In this work, we show that a significant reduction in network-resource consumption can be achieved as a result of optimal placement of vEPC functions in a metro area. Further, we show that not all vEPC functions need to be distributed. In our study, for the first time, we account for vEPC interactions in both data and control planes (Non-Access Stratum (NAS) signaling procedure Service Chains (SCs) with application latency requirements) using a detailed mathematical model.

    Service Chain (SC) Mapping with Multiple SC Instances in a Wide Area Network

    Network Function Virtualization (NFV) aims to simplify deployment of network services by running Virtual Network Functions (VNFs) on commercial off-the-shelf servers. Service deployment involves placement of VNFs and in-sequence routing of traffic flows through VNFs comprising a Service Chain (SC). The joint VNF placement and traffic routing is usually referred to as SC mapping. In a Wide Area Network (WAN), a situation may arise where several traffic flows, generated by many distributed node pairs, require the same SC; in that case, one single instance (or occurrence) of that SC might not be enough. SC mapping with multiple SC instances for the same SC turns out to be a very complex problem, since the sequential traversal of VNFs has to be maintained while accounting for traffic flows in various directions. Our study is the first to deal with SC mapping with multiple SC instances to minimize network resource consumption. Exact mathematical modeling of this problem results in a quadratic formulation. We propose a two-phase column-generation-based model and solution in order to get results over large network topologies within reasonable computational times. Using such an approach, we observe that an appropriate choice of only a small set of SC instances can lead to a solution very close to the minimum bandwidth consumption.

    High-Resolution Road Vehicle Collision Prediction for the City of Montreal

    Road accidents are an important issue in our modern societies, responsible for millions of deaths and injuries every year in the world. In Quebec alone, in 2018, road accidents were responsible for 359 deaths and 33 thousand injuries. In this paper, we show how one can leverage open datasets of a city like Montreal, Canada, to create high-resolution accident prediction models, using big data analytics. Compared to other studies in road accident prediction, we have a much higher prediction resolution, i.e., our models predict the occurrence of an accident within an hour, on road segments defined by intersections. Such models could be used in the context of road accident prevention, but also to identify key factors that can lead to a road accident, and consequently, help elaborate new policies. We tested various machine learning methods to deal with the severe class imbalance inherent to accident prediction problems. In particular, we implemented the Balanced Random Forest algorithm, a variant of the Random Forest machine learning algorithm, in Apache Spark. Interestingly, we found that in our case, Balanced Random Forest does not perform significantly better than Random Forest. Experimental results show that 85% of road vehicle collisions are detected by our model with a false positive rate of 13%. The examples identified as positive are likely to correspond to high-risk situations. In addition, we identify the most important predictors of vehicle collisions for the area of Montreal: the count of accidents on the same road segment during previous years, the temperature, the day of the year, the hour, and the visibility.
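The sampling step that distinguishes Balanced Random Forest from plain Random Forest, and the two metrics quoted above, can be sketched in a few lines of plain Python. This is an illustrative sketch of the commonly described procedure (bootstrap the minority class, then draw the same number of majority examples with replacement for each tree), not the authors' Apache Spark implementation; the function names and toy labels are assumptions.

```python
import random

def balanced_bootstrap(labels, rng):
    """Indices for one tree: bootstrap the minority class, then draw
    the same number of majority-class examples with replacement."""
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    n = len(minority)
    return ([rng.choice(minority) for _ in range(n)]
            + [rng.choice(majority) for _ in range(n)])

def recall_and_fpr(y_true, y_pred):
    """Detection rate TP/(TP+FN) and false positive rate FP/(FP+TN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp / (tp + fn), fp / (fp + tn)

# Toy imbalanced data: 5 accident examples among 100 (illustrative only).
labels = [1] * 5 + [0] * 95
sample = balanced_bootstrap(labels, random.Random(0))
positives_in_sample = sum(labels[i] for i in sample)
recall, fpr = recall_and_fpr([1, 1, 0, 0], [1, 0, 1, 0])
```

Each tree therefore trains on a class-balanced sample, while recall and false positive rate are still evaluated on the original, imbalanced data.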

    Joint dimensioning of server and network infrastructure for resilient optical grids/clouds

    We address the dimensioning of infrastructure, comprising both network and server resources, for large-scale decentralized distributed systems such as grids or clouds. We design the resulting grid/cloud to be resilient against network link or server failures. To this end, we exploit relocation: Under failure conditions, a grid job or cloud virtual machine may be served at an alternate destination (i.e., different from the one under failure-free conditions). We thus consider grid/cloud requests to have a known origin, but assume a degree of freedom as to where they end up being served, which is the case for grid applications of the bag-of-tasks (BoT) type or hosted virtual machines in the cloud case. We present a generic methodology based on integer linear programming (ILP) that: 1) chooses a given number of sites in a given network topology where to install server infrastructure; and 2) determines the amount of both network and server capacity to cater for both the failure-free scenario and failures of links or nodes. For the latter, we consider either failure-independent (FID) or failure-dependent (FD) recovery. Case studies on European-scale networks show that relocation allows considerable reduction of the total amount of network and server resources, especially in sparse topologies and for higher numbers of server sites. Adopting a failure-dependent backup routing strategy does lead to lower resource dimensions, but only when we adopt relocation (especially for a high number of server sites): Without exploiting relocation, potential savings of FD versus FID are not meaningful