
    Compact Oblivious Routing

    Oblivious routing is an attractive paradigm for large distributed systems in which centralized control and frequent reconfigurations are infeasible or undesired (e.g., too costly). Over almost two decades, much progress has been made both in devising oblivious routing schemes that guarantee close-to-optimal load and in designing algorithms that construct such schemes efficiently. However, a common drawback of existing oblivious routing schemes is that they are not compact: they require large routing tables (of polynomial size), which does not scale. This paper presents the first oblivious routing scheme that guarantees close-to-optimal load and is compact at the same time, requiring routing tables of only polylogarithmic size. Our algorithm maintains the polylogarithmic competitive ratio of existing algorithms and is hence particularly well suited for emerging large-scale networks.
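
    The compact scheme itself is not described in the abstract; as background, the sketch below (a minimal, illustrative example, not the paper's algorithm) shows Valiant-style two-phase oblivious routing on a hypercube, where every path choice depends only on source, destination, and randomness, never on the traffic matrix. The function names and the bit-fixing subroutine are assumptions for illustration.

```python
import random

def bit_fixing_path(src, dst, dim):
    """Bit-fixing path from src to dst on a dim-dimensional hypercube:
    correct differing address bits from lowest to highest, one hop per bit."""
    path, cur = [src], src
    for b in range(dim):
        if (cur ^ dst) & (1 << b):
            cur ^= 1 << b
            path.append(cur)
    return path

def valiant_oblivious_route(src, dst, dim):
    """Two-phase Valiant-style oblivious routing (textbook illustration, not
    the paper's compact scheme): route to a uniformly random intermediate
    node, then on to the destination, independently of the traffic matrix."""
    mid = random.randrange(1 << dim)        # traffic-oblivious random choice
    return bit_fixing_path(src, mid, dim) + bit_fixing_path(mid, dst, dim)[1:]

print(valiant_oblivious_route(src=0b0011, dst=0b1100, dim=4))
```

    The point of the example is the obliviousness property: routing decisions are fixed independently of any demand, which is the property the paper's scheme retains while shrinking the per-node routing state to polylogarithmic size.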

    Measuring and Understanding Throughput of Network Topologies

    High throughput is of particular interest in data center and HPC networks. Although myriad network topologies have been proposed, a broad head-to-head comparison across topologies and across traffic patterns is absent, and the right way to compare worst-case throughput performance is a subtle problem. In this paper, we develop a framework to benchmark the throughput of network topologies, using a two-pronged approach. First, we study performance on a variety of synthetic and experimentally measured traffic matrices (TMs). Second, we show how to measure worst-case throughput by generating a near-worst-case TM for any given topology. We apply the framework to study these TMs across a wide range of network topologies, revealing insights into how performance scales, how robust it is across TMs, and the effect of scattered workload placement. Our evaluation code is freely available.
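
    The benchmarking framework itself is not shown here; a minimal way to see how a topology and a traffic matrix (TM) jointly constrain throughput is the standard cut bound: no routing can scale the TM beyond the capacity crossing a cut divided by the demand crossing it. The sketch below (an illustrative upper bound, not the paper's tool; the function name and toy instance are assumptions) enumerates all cuts of a tiny graph.

```python
from itertools import combinations

def cut_upper_bound(nodes, cap, demand):
    """Cut-based upper bound on throughput: for every bipartition (S, rest),
    the achievable scaling of the traffic matrix is at most
    (capacity crossing the cut) / (demand crossing the cut)."""
    nodes = list(nodes)
    best = float("inf")
    for k in range(1, len(nodes)):
        for S in combinations(nodes, k):
            S = set(S)
            cut_cap = sum(c for (u, v), c in cap.items() if (u in S) != (v in S))
            cut_dem = sum(d for (s, t), d in demand.items() if (s in S) != (t in S))
            if cut_dem > 0:
                best = min(best, cut_cap / cut_dem)
    return best

# Toy instance: a 4-node ring with unit-capacity links and all-to-all unit demand.
ring_cap = {(0, 1): 1, (1, 2): 1, (2, 3): 1, (3, 0): 1}
all_to_all = {(i, j): 1 for i in range(4) for j in range(4) if i != j}
print(cut_upper_bound(range(4), ring_cap, all_to_all))   # -> 0.25
```

    On the 4-node ring with all-to-all demand, the binding cut splits the ring into two adjacent pairs: 2 units of capacity cross it while 8 units of demand do, giving the 0.25 bound.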

    Towards a better approximation for sparsest cut?

    We give a new $(1+\epsilon)$-approximation for the Sparsest Cut problem on graphs where small sets expand significantly more than the sparsest cut (sets of size $n/r$ expand by a factor $\sqrt{\log n \log r}$ bigger, for some small $r$; this condition holds for many natural graph families). We give two different algorithms. One involves Guruswami-Sinop rounding on the level-$r$ Lasserre relaxation. The other is combinatorial and involves a new notion called Small Set Expander Flows (inspired by the expander flows of ARV), which we show exist in the input graph. Both algorithms run in time $2^{O(r)} \mathrm{poly}(n)$. We also show similar approximation algorithms for graphs of genus $g$ with an analogous local expansion condition. This is the first algorithm we know of that achieves a $(1+\epsilon)$-approximation on such a general family of graphs.
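
    For concreteness, the quantity being approximated can be stated via a brute-force computation of the uniform (expansion) variant on a toy graph. This is exponential-time illustration only, not anything from the paper; the function names and the example graph are assumptions.

```python
from itertools import combinations

def edge_expansion(edges, S):
    """Edge expansion of a vertex set S: edges leaving S divided by |S|."""
    S = set(S)
    crossing = sum(1 for u, v in edges if (u in S) != (v in S))
    return crossing / len(S)

def sparsest_cut_bruteforce(edges, nodes):
    """Exhaustive search for the set of minimum edge expansion (uniform
    sparsest cut in its expansion form). Exponential time; the paper's
    algorithms approximate this value without enumeration."""
    nodes = list(nodes)
    best_val, best_set = float("inf"), None
    for k in range(1, len(nodes) // 2 + 1):
        for S in combinations(nodes, k):
            val = edge_expansion(edges, S)
            if val < best_val:
                best_val, best_set = val, set(S)
    return best_val, best_set

# Two triangles joined by a single bridge edge: the bridge is the sparsest cut.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(sparsest_cut_bruteforce(edges, range(6)))   # -> (0.333..., {0, 1, 2})
```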

    Sparsest Cut on Bounded Treewidth Graphs: Algorithms and Hardness Results

    We give a 2-approximation algorithm for Non-Uniform Sparsest Cut that runs in time $n^{O(k)}$, where $k$ is the treewidth of the graph. This improves on the previous $2^{2^k}$-approximation in time $\mathrm{poly}(n)\, 2^{O(k)}$ due to Chlamtáč et al. To complement this algorithm, we show the following hardness results: if the Non-Uniform Sparsest Cut problem has a $\rho$-approximation for series-parallel graphs (where $\rho \geq 1$), then the Max Cut problem has an algorithm with approximation factor arbitrarily close to $1/\rho$. Hence, even for such restricted graphs (which have treewidth 2), the Sparsest Cut problem is NP-hard to approximate better than $17/16 - \epsilon$ for $\epsilon > 0$; assuming the Unique Games Conjecture, the hardness becomes $1/\alpha_{GW} - \epsilon$. For graphs with large (but constant) treewidth, we show a hardness result of $2 - \epsilon$ assuming the Unique Games Conjecture. Our algorithm rounds a linear program based on (a subset of) the Sherali-Adams lift of the standard Sparsest Cut LP. We show that even for treewidth-2 graphs, the LP has an integrality gap close to 2 even after polynomially many rounds of Sherali-Adams. Hence our approach cannot be improved even on such restricted graphs without using a stronger relaxation.
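
    For reference, the non-uniform objective that the 2-approximation targets is the minimum over vertex sets $S$ of (capacity crossing $S$) / (demand separated by $S$). The sketch below evaluates it by exhaustive search on a toy series-parallel instance; it is purely illustrative and does not reflect the paper's Sherali-Adams-based rounding (names and the instance are assumptions).

```python
from itertools import combinations

def nonuniform_sparsity(cap_edges, demands, S):
    """Non-Uniform Sparsest Cut objective for a vertex set S:
    (capacity of edges crossing S) / (total demand separated by S)."""
    S = set(S)
    cut = sum(c for (u, v), c in cap_edges.items() if (u in S) != (v in S))
    sep = sum(d for (s, t), d in demands.items() if (s in S) != (t in S))
    return cut / sep if sep else float("inf")

def sparsest_cut_nonuniform(cap_edges, demands, nodes):
    """Exhaustive minimisation over all vertex sets; exponential time and
    purely illustrative -- the paper's 2-approximation instead rounds a
    Sherali-Adams lift and runs in n^{O(treewidth)} time."""
    nodes = list(nodes)
    best = (float("inf"), None)
    for k in range(1, len(nodes)):
        for S in combinations(nodes, k):
            val = nonuniform_sparsity(cap_edges, demands, S)
            if val < best[0]:
                best = (val, set(S))
    return best

# A 4-cycle (series-parallel, treewidth 2) with two crossing demand pairs.
caps = {(0, 1): 1, (1, 2): 1, (2, 3): 1, (3, 0): 1}
dems = {(0, 2): 1, (1, 3): 1}
print(sparsest_cut_nonuniform(caps, dems, range(4)))   # -> (1.0, {0, 1})
```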

    Models and Algorithms for Robust Network Design with Several Traffic Scenarios

    We consider a robust network design problem in which optimum integral capacities need to be installed on the edges of a network such that the supplies and demands in each of the explicitly known traffic scenarios are satisfied by a single-commodity flow. In Buchheim et al. (LNCS 6701, pp. 7-17, 2011), an integer-programming (IP) formulation of polynomial size was given that uses both flow and capacity variables. In this work, we introduce an IP formulation that uses only capacity variables and exponentially many constraints that can be separated in polynomial time. We argue that the latter formulation has advantageous features when used within branch-and-cut, and we report preliminary computational results for the root-node bounds. We introduce a class of instances that is difficult for IP-based solution approaches. We design and implement a heuristic solution approach based on the definition and exploration of large neighborhoods of carefully selected size. The performance of the heuristic is evaluated on the difficult class of instances. The results are encouraging and give a good understanding of the trade-off between solution quality and neighborhood size.
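
    As a sketch of the two formulations described above (our reading of the abstract; the notation is assumed, not taken from the paper), the flow-based model uses both flow and capacity variables, while the capacity-only model replaces the flows by cut-type feasibility constraints of Gale-Hoffman form, one per vertex set and scenario, each class of which can be separated by a min-cut computation:

```latex
% Sketch only (our reading; notation is assumed): scenarios k with node
% balances b^k_v, edge costs c_e, integral capacities u_e, and, in the first
% model, scenario flows f^k_e on an arbitrary fixed orientation of the edges.

% Flow-and-capacity formulation (polynomial size, Buchheim et al.):
\min \sum_{e \in E} c_e u_e \quad \text{s.t.}\quad
  \sum_{e \in \delta^+(v)} f^k_e - \sum_{e \in \delta^-(v)} f^k_e = b^k_v
    \;\;\forall v \in V,\ \forall k, \qquad
  -u_e \le f^k_e \le u_e \;\;\forall e \in E,\ \forall k, \qquad
  u_e \in \mathbb{Z}_{\ge 0}.

% Capacity-only formulation (exponentially many cut constraints; for each
% scenario, a violated set S can be found with a max-flow/min-cut check):
\min \sum_{e \in E} c_e u_e \quad \text{s.t.}\quad
  \sum_{e \in \delta(S)} u_e \;\ge\; \Bigl|\sum_{v \in S} b^k_v\Bigr|
    \;\;\forall S \subsetneq V,\ \forall k, \qquad
  u_e \in \mathbb{Z}_{\ge 0}.
```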

    Compact Oblivious Routing in Weighted Graphs


    Efficient All-to-All Collective Communication Schedules for Direct-Connect Topologies

    The all-to-all collective communications primitive is widely used in machine learning (ML) and high performance computing (HPC) workloads, and optimizing its performance is of interest to both the ML and HPC communities. All-to-all is a particularly challenging workload that can severely strain the underlying interconnect bandwidth at scale. This is mainly because of the quadratic scaling in the number of messages that must be serviced simultaneously, combined with large message sizes. This paper takes a holistic approach to optimizing the performance of all-to-all collective communications on supercomputer-scale direct-connect interconnects. We address several algorithmic and practical challenges: developing efficient and bandwidth-optimal all-to-all schedules for any topology, lowering the schedules to various backends and fabrics that may or may not expose additional forwarding bandwidth, establishing an upper bound on all-to-all throughput, and exploring novel topologies that deliver near-optimal all-to-all performance.
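
    The paper's topology-aware, bandwidth-optimal schedules are not reproduced here; as a baseline for what an all-to-all schedule looks like, the classic linear-shift exchange on a fully connected group of n nodes finishes in n - 1 steps, with every node sending exactly one chunk per step (a textbook pattern, not the paper's method; the function name is illustrative).

```python
def linear_shift_all_to_all(n):
    """Classic linear-shift all-to-all schedule for n fully connected nodes:
    in step t, node i sends the chunk destined for node (i + t) % n directly
    to it, so after n - 1 steps every node has exchanged a chunk with every
    other node."""
    schedule = []
    for t in range(1, n):
        step = [(i, (i + t) % n) for i in range(n)]   # (sender, receiver)
        schedule.append(step)
    return schedule

for t, step in enumerate(linear_shift_all_to_all(4), start=1):
    print(f"step {t}: {step}")
```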