
    On Comparing the Performance of Dynamic Multi-Network Optimizations

    Abstract: With a large variety of wireless access technologies available, multi-homed devices may strongly improve the performance and reliability of communication by using multiple networks simultaneously. A key question for the practical application of multi-path strategies is the granularity at which traffic streams should be dispersed among the available networks. This level of granularity can be expected to have a major impact on both the efficiency and the complexity of practical realizations. Motivated by this, we compare two dynamic strategies that operate at different levels of granularity. The first strategy, which we call network selection, requires little operational complexity and dynamically assigns an arriving application data transfer to the network that delivers the highest expected performance. Our second strategy, which we call traffic splitting, is of higher complexity and aims to optimally split individual data transfers among the available networks. To this end, we (1) develop quantitative models that describe the performance of both strategies, (2) determine (near-)optimal algorithms for both strategies, and (3) validate the efficiency and practical usefulness of the algorithms via extensive network simulations and experiments in a real-life testbed environment. These experimental results show that the optimal strategies obtained from the theoretical models lead to extremely well-performing solutions in practical circumstances. Moreover, the results show that splitting data transfers, which is easy to embed in the network and requires no information on the number of flows in the system, leads to much better performance than dynamic network selection.
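The contrast between the two strategies in this abstract can be sketched in a few lines. The bandwidth figures, file size, and function names below are illustrative assumptions, not values or interfaces from the paper; the sketch only shows why proportional splitting pools capacity while selection is limited to the single best network.

```python
# Toy comparison of the two multi-path strategies: "network selection"
# assigns a whole transfer to one network, while "traffic splitting"
# spreads it across all networks in proportion to their bandwidth.

def network_selection(size_mb, bandwidths_mbps):
    """Send the whole transfer over the single fastest network."""
    best = max(bandwidths_mbps)
    return size_mb * 8 / best  # completion time in seconds

def traffic_splitting(size_mb, bandwidths_mbps):
    """Split the transfer in proportion to each network's bandwidth,
    so all parts finish at the same time and capacity is pooled."""
    total = sum(bandwidths_mbps)
    return size_mb * 8 / total  # completion time in seconds

# Example: a 100 MB transfer over WiFi (50 Mbps) and LTE (30 Mbps).
select_t = network_selection(100, [50, 30])  # 16.0 s on WiFi alone
split_t = traffic_splitting(100, [50, 30])   # 10.0 s over both
```

In this toy setting splitting always wins because it uses the aggregate bandwidth; the paper's harder question, which the sketch ignores, is how the comparison plays out with many competing flows and dynamically varying network conditions.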

    Dynamic Control Flow in Large-Scale Machine Learning

    Many recent machine learning models rely on fine-grained dynamic control flow for training and inference. In particular, models based on recurrent neural networks and on reinforcement learning depend on recurrence relations, data-dependent conditional execution, and other features that call for dynamic control flow. These applications benefit from the ability to make rapid control-flow decisions across a set of computing devices in a distributed system. For performance, scalability, and expressiveness, a machine learning system must support dynamic control flow in distributed and heterogeneous environments. This paper presents a programming model for distributed machine learning that supports dynamic control flow. We describe the design of the programming model and its implementation in TensorFlow, a distributed machine learning system. Our approach extends the use of dataflow graphs to represent machine learning models, offering several distinctive features. First, the branches of conditionals and bodies of loops can be partitioned across many machines to run on a set of heterogeneous devices, including CPUs, GPUs, and custom ASICs. Second, programs written in our model support automatic differentiation and distributed gradient computations, which are necessary for training machine learning models that use control flow. Third, our choice of non-strict semantics enables multiple loop iterations to execute in parallel across machines, and to overlap compute and I/O operations. We have done our work in the context of TensorFlow, and it has been used extensively in research and production. We evaluate it using several real-world applications, and demonstrate its performance and scalability. Comment: Appeared in EuroSys 2018. 14 pages, 16 figures.
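The functional shape of the control-flow operators described here (TensorFlow exposes them as `tf.cond` and `tf.while_loop`) can be sketched with a plain-Python stand-in. This is only a minimal illustration of the programming model's form; it has none of TensorFlow's dataflow-graph machinery, device partitioning, or non-strict parallel iteration.

```python
# Framework-free sketch of functional control-flow operators:
# branches and loop bodies are passed as functions, and loop state is
# threaded explicitly -- the form that lets a dataflow system
# differentiate through control flow and partition it across devices.

def cond(pred, true_fn, false_fn):
    """Data-dependent conditional: only the selected branch runs."""
    return true_fn() if pred else false_fn()

def while_loop(cond_fn, body_fn, state):
    """Iterate body_fn while cond_fn holds over explicit loop state."""
    while cond_fn(*state):
        state = body_fn(*state)
    return state

# Example: sum the integers 1..5 in the functional loop form.
final_i, final_acc = while_loop(
    lambda i, acc: i <= 5,
    lambda i, acc: (i + 1, acc + i),
    (1, 0),
)
# final_acc == 15
```

Because the loop body is an ordinary function of its state rather than imperative code mutating variables, a system can record it in a graph once and then replay, differentiate, or distribute its iterations.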

    DeSyRe: on-Demand System Reliability

    The DeSyRe project builds on-demand adaptive and reliable Systems-on-Chip (SoCs). As fabrication technology scales down, chips are becoming less reliable, incurring increased power and performance costs for fault tolerance. To make matters worse, power density is becoming a significant limiting factor in SoC design in general. In the face of such changes in the technological landscape, current solutions for fault tolerance are expected to introduce excessive overheads in future systems. Moreover, attempting to design and manufacture a totally defect- and fault-free system would heavily, even prohibitively, impact the design, manufacturing, and testing costs, as well as system performance and power consumption. In this context, DeSyRe delivers a new generation of systems that are reliable by design at well-balanced power, performance, and design costs. In our attempt to reduce the overheads of fault tolerance, only a small fraction of the chip is built to be fault-free. This fault-free part is then employed to manage the remaining fault-prone resources of the SoC. The DeSyRe framework is applied to two medical systems with high safety requirements (measured using the IEC 61508 functional safety standard) and tight power and performance constraints.
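The central architectural idea above, a small trusted part of the chip managing the fault-prone rest, can be sketched abstractly. The class and method names below are illustrative assumptions, not DeSyRe's actual interfaces; the sketch only shows the manager's role of tracking faults and remapping work onto the resources still believed healthy.

```python
# Hedged sketch of a fault-free manager steering fault-prone cores:
# faults are reported to the manager, which excludes the affected
# cores and remaps tasks onto the remaining healthy ones.

class SoCManager:
    def __init__(self, num_cores):
        self.cores = list(range(num_cores))
        self.faulty = set()

    def report_fault(self, core):
        """Record that a core has been detected as faulty."""
        self.faulty.add(core)

    def healthy_cores(self):
        return [c for c in self.cores if c not in self.faulty]

    def assign(self, tasks):
        """Round-robin tasks over the cores still believed healthy."""
        healthy = self.healthy_cores()
        if not healthy:
            raise RuntimeError("no healthy cores left")
        return {t: healthy[i % len(healthy)] for i, t in enumerate(tasks)}

mgr = SoCManager(num_cores=4)
mgr.report_fault(2)
mapping = mgr.assign(["t0", "t1", "t2"])
# core 2 is avoided: t0 -> core 0, t1 -> core 1, t2 -> core 3
```

The real framework must of course do this under tight power and performance budgets and within a functional-safety process; the sketch captures only the division between a small fault-free control part and the managed fault-prone fabric.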