
    On distributed scheduling in wireless networks exploiting broadcast and network coding

    In this paper, we consider cross-layer optimization in wireless networks with the wireless broadcast advantage, focusing on the problem of distributed scheduling of broadcast links. The wireless broadcast advantage is most useful in multicast scenarios, so we include network coding in our design to exploit the throughput gain that network coding brings to multicasting. We derive a subgradient algorithm for joint rate control, network coding and scheduling, which, however, requires centralized link scheduling. Under the primary interference model, the link scheduling problem is equivalent to a maximum weighted hypergraph matching problem, which is NP-complete. To solve the scheduling problem in a distributed manner, locally greedy and randomized approximation algorithms are proposed and shown to have bounded worst-case performance. With random network coding, we obtain a fully distributed cross-layer design. Numerical results show promising throughput gains with the proposed algorithms and, surprisingly, in some cases even lower complexity than a cross-layer design without the broadcast advantage.
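    The locally greedy scheduler is not spelled out in this excerpt, so the following is only a minimal, centralized sketch of the underlying idea: treat each broadcast link as a hyperedge (sender plus receivers) and greedily keep the heaviest hyperedges that share no node, which mimics the primary interference constraint. The function name and data layout are illustrative assumptions, not taken from the paper.

```python
def greedy_hypergraph_matching(hyperedges):
    """Greedily pick a heavy set of vertex-disjoint hyperedges.

    hyperedges: list of (weight, vertex_set) pairs, where each hyperedge
    models one broadcast link (sender plus its intended receivers).
    Returns the selected hyperedges; no two share a vertex, which mimics
    the primary interference constraint.
    """
    chosen, used = [], set()
    for weight, verts in sorted(hyperedges, key=lambda e: e[0], reverse=True):
        if used.isdisjoint(verts):
            chosen.append((weight, verts))
            used |= set(verts)
    return chosen

# Example: three broadcast links competing for nodes 1..5.
links = [(3.0, {1, 2, 3}), (2.5, {3, 4}), (1.0, {4, 5})]
print(greedy_hypergraph_matching(links))  # keeps {1,2,3} and {4,5}
```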

    Optimal Networks from Error Correcting Codes

    To address growth challenges facing large data centers and supercomputing clusters, a new construction is presented for scalable, high-throughput, low-latency networks. The resulting networks require 1.5-5 times fewer switches and 2-6 times fewer cables, and have 1.2-2 times lower latency and correspondingly lower congestion and packet losses than the best present or proposed networks providing the same number of ports at the same total bisection. These advantage ratios increase with network size. The key new ingredient is the exact equivalence discovered between the problem of maximizing network bisection for large classes of practically interesting Cayley graphs and the problem of maximizing codeword distance for linear error-correcting codes. The resulting translation recipe converts existing optimal error-correcting codes into optimal-throughput networks. (Comment: 14 pages, accepted at the ANCS 2013 conference.)
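    The coding-side quantity being maximized is the minimum distance of a linear error-correcting code. As a hedged illustration (not the paper's construction or translation recipe), the sketch below brute-forces that distance from a generator matrix, using the standard [7,4] Hamming code as an example.

```python
import itertools
import numpy as np

def min_distance(G):
    """Brute-force minimum Hamming distance of a binary linear code
    given its k x n generator matrix G (entries in {0, 1})."""
    k, _ = G.shape
    best = None
    for msg in itertools.product([0, 1], repeat=k):
        if not any(msg):
            continue  # skip the all-zero codeword
        codeword = np.mod(np.array(msg) @ G, 2)
        w = int(codeword.sum())
        best = w if best is None else min(best, w)
    return best

# [7,4] Hamming code generator matrix; its minimum distance is 3.
G = np.array([[1, 0, 0, 0, 0, 1, 1],
              [0, 1, 0, 0, 1, 0, 1],
              [0, 0, 1, 0, 1, 1, 0],
              [0, 0, 0, 1, 1, 1, 1]])
print(min_distance(G))  # -> 3
```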

    An efficient task-based all-reduce for machine learning applications

    All-Reduce is a collective-combine operation frequently utilised in synchronous parameter updates in parallel machine learning algorithms. The performance of this operation - and subsequently of the algorithm itself - is heavily dependent on its implementation, its configuration and the supporting hardware on which it is run. Given the pivotal role of all-reduce, a failure in any of these regards will significantly impact the resulting scientific output. In this research we explore the performance of alternative all-reduce algorithms in data-flow graphs and compare these to the commonly used reduce-broadcast approach. We present an architecture and interface for all-reduce in task-based frameworks, and a parallelization scheme for object serialization and computation. We present a concrete, novel application of a butterfly all-reduce algorithm on the Apache Spark framework on a high-performance compute cluster, and demonstrate the effectiveness of the new butterfly algorithm with a logarithmic speed-up with respect to the vector length compared with the original reduce-broadcast method - a 9x speed-up is observed for vector lengths of the order of 10^8. This improvement comprises both algorithmic changes (65%) and parallel-processing optimization (35%). The effectiveness of the new butterfly all-reduce is demonstrated using real-world neural network applications with the Spark framework. For the model-update operation we observe significant speed-ups using the new butterfly algorithm compared with the original reduce-broadcast, for both smaller (CIFAR and MNIST) and larger (ImageNet) datasets.
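    The abstract describes a butterfly (recursive-doubling) all-reduce. The sketch below is a single-process simulation of that exchange pattern, not the paper's Spark implementation; worker counts, vector sizes, and function names are illustrative assumptions.

```python
import numpy as np

def butterfly_allreduce(vectors):
    """Simulated recursive-doubling (butterfly) all-reduce.

    vectors: list whose length is a power of two; entry i is worker i's
    local vector.  After log2(p) exchange stages every worker holds the
    element-wise sum of all inputs.
    """
    p = len(vectors)
    assert p & (p - 1) == 0, "worker count must be a power of two"
    bufs = [np.asarray(v, dtype=float).copy() for v in vectors]
    stride = 1
    while stride < p:
        # At each stage, worker i exchanges and combines with worker i XOR stride.
        bufs = [bufs[i] + bufs[i ^ stride] for i in range(p)]
        stride *= 2
    return bufs

workers = [np.arange(4) * (i + 1) for i in range(8)]  # 8 workers, length-4 vectors
result = butterfly_allreduce(workers)
assert all(np.array_equal(r, result[0]) for r in result)
print(result[0])  # element-wise sum over all workers
```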

    Building Fault Tolerance within Clouds at Network Level

    Cloud computing technologies and infrastructure facilities are growing rapidly, making it cost effective for users to implement their IT-based solutions and run their business economically. Many intricate issues, however, have cropped up that must be addressed before clouds can be used for the purpose for which they are designed and implemented. Among these, fault tolerance and securing the data stored on the clouds are the most important. Continuous availability of services depends on many factors. Faults are bound to happen within the network, software, platform or infrastructure used to establish the cloud. The network that connects the various servers, devices, peripherals, etc. has to be fault tolerant to start with, so that the intended, uninterrupted services can be made available to the user. A novel network design method that achieves high availability of the network, and thereby of the cloud itself, is presented in this paper.

    Wireless Inter-Session Network Coding - An Approach Using Virtual Multicasts

    This paper addresses the problem of inter-session network coding to maximize throughput for multiple communication sessions in wireless networks. We introduce virtual multicast connections, which can extract packets from the original sessions and code them together. Random linear network codes can be used for these virtual multicasts. The problem can be stated as a flow-based convex optimization problem with side constraints. The proposed formulation provides a rate region which is at least as large as the region without inter-session network coding. We show the benefits of our technique for several scenarios by means of simulation. (United States Defense Advanced Research Projects Agency, Subcontract 18870740-37362-C)
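    A rough illustration of the random linear coding that can drive such virtual multicasts: coded packets are random GF(2) combinations of the originals and are decoded by Gaussian elimination once the received coefficient vectors have full rank. The paper does not specify this implementation; the field size, packet sizes, and function names here are assumptions (over GF(2) some random mixtures may be linearly dependent).

```python
import numpy as np

rng = np.random.default_rng(0)

def rlnc_encode(packets, n_coded):
    """Random linear network coding over GF(2): each coded packet is the
    XOR (mod-2 sum) of a random subset of the original packets."""
    k, _ = packets.shape
    coeffs = rng.integers(0, 2, size=(n_coded, k), dtype=np.uint8)
    coded = (coeffs @ packets) % 2
    return coeffs, coded

def rlnc_decode(coeffs, coded):
    """Gaussian elimination over GF(2); recovers the originals when the
    coefficient matrix has full column rank, otherwise returns None."""
    A = np.concatenate([coeffs, coded], axis=1).astype(np.uint8)
    n, k = coeffs.shape
    row = 0
    for col in range(k):
        pivot = next((r for r in range(row, n) if A[r, col]), None)
        if pivot is None:
            return None  # not yet decodable, wait for more mixtures
        A[[row, pivot]] = A[[pivot, row]]      # move pivot into place
        for r in range(n):
            if r != row and A[r, col]:
                A[r] ^= A[row]                 # eliminate this column elsewhere
        row += 1
    return A[:k, k:]

packets = rng.integers(0, 2, size=(3, 8), dtype=np.uint8)  # 3 packets, 8 bits each
coeffs, coded = rlnc_encode(packets, n_coded=6)            # 6 random mixtures
decoded = rlnc_decode(coeffs, coded)
if decoded is None:
    print("mixtures not full rank over GF(2); request more coded packets")
else:
    print("decoded correctly:", bool(np.array_equal(decoded, packets)))
```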

    Optimal performance of distributed simulation programs

    This paper describes a technique to analyze the potential speedup of distributed simulation programs. A distributed simulation strategy is proposed which minimizes execution time through the use of an oracle to control the simulation. Because the strategy relies on an oracle, it cannot be used for practical simulations. However, the strategy facilitates performance evaluation of distributed simulation strategies by providing a useful point of comparison, and it can be used to determine the suitability of specific applications for implementation on a parallel computer. Based on the proposed strategy, a tool has been developed to determine the maximum performance which can be achieved by a distributed simulation program. In this paper we describe the technique and its use in evaluating the parallelism available in distributed simulators of parallel computer systems.
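    In spirit, such an oracle bound is a critical-path argument: with an ideal scheduler the runtime cannot beat the longest dependency chain among simulation events. The sketch below computes that kind of lower bound for a hypothetical event-precedence graph; it illustrates the idea only and is not the paper's tool or its oracle definition.

```python
from functools import lru_cache

# Hypothetical event-precedence graph: each event has a processing cost and
# a list of events it depends on.  The critical-path length is a lower bound
# on the runtime of any distributed execution of this workload, which is the
# kind of bound an oracle-driven strategy approaches.
costs = {"a": 2.0, "b": 3.0, "c": 1.0, "d": 4.0, "e": 2.0}
deps = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"], "e": ["c"]}

@lru_cache(maxsize=None)
def finish_time(event):
    """Earliest time this event can complete with unlimited processors."""
    return costs[event] + max((finish_time(d) for d in deps[event]), default=0.0)

sequential = sum(costs.values())                      # one-processor runtime
critical = max(finish_time(e) for e in costs)         # oracle-style lower bound
print(f"max speedup <= {sequential / critical:.2f}")  # 12.0 / 9.0 -> 1.33
```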

    A Linear Network Code Construction for General Integer Connections Based on the Constraint Satisfaction Problem

    The problem of finding network codes for general connections is inherently difficult in capacity-constrained networks. Resource minimization for general connections with network coding is further complicated. Existing methods for identifying solutions mainly rely on highly restricted classes of network codes, and are almost all centralized. In this paper, we introduce linear network mixing coefficients for code constructions of general connections that generalize random linear network coding (RLNC) for multicast connections. For such code constructions, we pose the problem of cost minimization for the subgraph involved in the coding solution and relate this minimization to a path-based Constraint Satisfaction Problem (CSP) and an edge-based CSP. While CSPs are NP-complete in general, we present a path-based probabilistic distributed algorithm and an edge-based probabilistic distributed algorithm with almost-sure convergence in finite time, obtained by applying Communication Free Learning (CFL). Our approach allows fairly general coding across flows, guarantees no greater cost than routing, and admits a distributed implementation. Numerical results illustrate the performance improvement of our approach over existing methods. (Comment: submitted to TON; conference version published at IEEE GLOBECOM 2015.)
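    As a loose illustration of the Communication Free Learning flavour of the proposed algorithms, the sketch below runs a simplified CFL-style decentralized search on a stand-in CSP (graph coloring): each agent senses only its own constraint violations, keeps its value when satisfied, and mostly resamples otherwise. This is not the exact CFL update rule, nor the paper's path- or edge-based CSP; all parameters and names are assumptions.

```python
import random

random.seed(1)

def cfl_coloring(edges, n_nodes, n_colors, keep_prob=0.2, max_rounds=10_000):
    """Simplified CFL-style decentralized solver for a toy CSP (graph
    coloring).  Nodes only observe whether their own constraints hold:
    satisfied nodes repeat their value, unsatisfied nodes mostly resample.
    """
    color = [random.randrange(n_colors) for _ in range(n_nodes)]
    for _ in range(max_rounds):
        unsat = set()
        for u, v in edges:
            if color[u] == color[v]:
                unsat.update((u, v))
        if not unsat:
            return color  # every local constraint is satisfied
        for node in unsat:
            if random.random() > keep_prob:
                color[node] = random.randrange(n_colors)
    return None  # no satisfying assignment found within the round budget

# 5-cycle, 3 colors: a small satisfiable instance.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(cfl_coloring(edges, n_nodes=5, n_colors=3))
```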

    Broadcasting in Hyper-cylinder graphs

    Broadcasting in computer networking means the dissemination of information, initially known only at some nodes, to all network members. The goal is to inform every node in the minimum possible time. There are a few models of broadcasting; the simplest and oldest is called the Classical model. In the Classical model, dissemination happens in synchronous rounds, in each of which a node may inform only one of its neighbours. The broadcast question is: what is the minimum number of rounds needed for broadcasting, and which broadcast scheme achieves it? For general graphs, these questions are NP-hard, and the problem is known to be inapproximable within a factor of 3 - ε for any real ε > 0. Even for some very restricted classes of graphs the questions remain NP-hard. Little is known about broadcasting in restricted graphs, and only a few classes have a polynomial solution. Parallel and distributed computing is one of the important domains that relies on efficient broadcasting. The hypercube and the torus are the most widely used network topologies in this domain; their widespread use is due not only to their simplicity but also to their efficiency and high robustness (e.g., fault tolerance) while keeping an acceptable number of links. In this thesis, it is observed that the Cartesian product of a number of path and cycle graphs produces a valuable set of topologies, which we call hyper-cylinders, and which contains the hypercube and the torus as special cases. Any hyper-cylinder shares many of the beneficial features of the hypercube and the torus and might be a suitable substitute in some cases. Some hyper-cylinders are also similar to other practically used topologies such as cube-connected cycles. This thesis studies the effect of the Cartesian product on broadcasting, and broadcasting in hyper-cylinders under the Classical and Messy models. This adds a valuable class of graphs to the limited set of classes with a polynomially computable broadcast time. Finally, the relation between worst-case originators and diameters in trees is studied, which may help extend the broadcast study to a larger class of graphs in which any tree is allowed instead of a path in the Cartesian product.
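    As an illustration of the objects studied, the sketch below builds a small hyper-cylinder as the Cartesian product of a path and a cycle and simulates a greedy Classical-model broadcast from one originator. The greedy schedule only gives an upper bound on the broadcast time (the thesis is concerned with exact values), and the function names are illustrative, not from the thesis.

```python
import itertools
import random

random.seed(0)

def path(n):
    return [(i, i + 1) for i in range(n - 1)]

def cycle(n):
    return [(i, (i + 1) % n) for i in range(n)]

def cartesian_product(factor_sizes, factor_edges):
    """Vertices are tuples of factor coordinates; two vertices are adjacent
    when they differ in exactly one coordinate and that pair of coordinates
    is an edge of the corresponding factor (path or cycle)."""
    vertices = list(itertools.product(*[range(n) for n in factor_sizes]))
    adj = {v: set() for v in vertices}
    for d, edges in enumerate(factor_edges):
        for u, w in edges:
            for v in vertices:
                if v[d] == u:
                    nb = v[:d] + (w,) + v[d + 1:]
                    adj[v].add(nb)
                    adj[nb].add(v)
    return adj

def greedy_broadcast_rounds(adj, origin):
    """Classical-model simulation: per round, every informed vertex calls at
    most one uninformed neighbour.  Greedy, so this is only an upper bound
    on the true broadcast time."""
    informed, rounds = {origin}, 0
    while len(informed) < len(adj):
        newly = set()
        for v in informed:
            choices = [u for u in adj[v] if u not in informed and u not in newly]
            if choices:
                newly.add(random.choice(choices))
        informed |= newly
        rounds += 1
    return rounds

# A small hyper-cylinder: P4 x C5 (a 4 x 5 cylinder with 20 vertices).
adj = cartesian_product([4, 5], [path(4), cycle(5)])
print(greedy_broadcast_rounds(adj, origin=(0, 0)))
```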