
    SHADHO: Massively Scalable Hardware-Aware Distributed Hyperparameter Optimization

    Computer vision is experiencing an AI renaissance, in which machine learning models are expediting important breakthroughs in academic research and commercial applications. Effectively training these models, however, is not trivial due in part to hyperparameters: user-configured values that control a model's ability to learn from data. Existing hyperparameter optimization methods are highly parallel but make no effort to balance the search across heterogeneous hardware or to prioritize searching high-impact spaces. In this paper, we introduce a framework for massively Scalable Hardware-Aware Distributed Hyperparameter Optimization (SHADHO). Our framework calculates the relative complexity of each search space and monitors performance on the learning task over all trials. These metrics are then used as heuristics to assign hyperparameters to distributed workers based on their hardware. We first demonstrate that our framework achieves double the throughput of a standard distributed hyperparameter optimization framework by optimizing SVM for MNIST using 150 distributed workers. We then conduct model search with SHADHO over the course of one week using 74 GPUs across two compute clusters to optimize U-Net for a cell segmentation task, discovering 515 models that achieve a lower validation loss than standard U-Net.
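
    To make the assignment heuristic concrete, below is a minimal sketch of pairing search spaces with heterogeneous workers by relative complexity. It is an illustration under assumed data structures (the "dims"/"choices" fields, the throughput-sorted worker list, and the round-robin pairing are all hypothetical), not SHADHO's actual API.

        # Hypothetical sketch of hardware-aware search-space assignment.
        def complexity(space):
            # Stand-in for relative complexity: multiply per-dimension
            # sizes, treating continuous dimensions as 10 "slots".
            size = 1
            for dim in space["dims"]:
                size *= dim.get("choices", 10)
            return size

        def assign(spaces, workers):
            # Pair the most complex spaces with the fastest workers so
            # high-impact spaces receive the most trials per unit time.
            spaces = sorted(spaces, key=complexity, reverse=True)
            workers = sorted(workers, key=lambda w: w["throughput"], reverse=True)
            # Round-robin over workers when spaces outnumber them.
            return [(s, workers[i % len(workers)]) for i, s in enumerate(spaces)]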

    Hyperdrive: A Multi-Chip Systolically Scalable Binary-Weight CNN Inference Engine

    Deep neural networks have achieved impressive results in computer vision and machine learning. Unfortunately, state-of-the-art networks are extremely compute and memory intensive, which makes them unsuitable for mW-devices such as IoT end-nodes. Aggressive quantization of these networks dramatically reduces their computation and memory footprint. Binary-weight neural networks (BWNs) follow this trend, pushing weight quantization to the limit. Hardware accelerators for BWNs presented up to now have focused on core efficiency, disregarding I/O bandwidth and system-level efficiency that are crucial for deployment of accelerators in ultra-low power devices. We present Hyperdrive: a BWN accelerator that dramatically reduces I/O bandwidth through a novel binary-weight streaming approach. The approach supports arbitrarily sized convolutional neural network architectures and input resolutions by exploiting the natural scalability of the compute units at both chip level and system level: Hyperdrive chips are arranged systolically in a 2D mesh and process the entire feature map in parallel. Hyperdrive achieves 4.3 TOp/s/W system-level efficiency (i.e., including I/Os), 3.1x higher than state-of-the-art BWN accelerators, even though its core uses resource-intensive FP16 arithmetic for increased robustness.
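
    To illustrate why binary weights cut compute so drastically, the following sketch performs a 2D convolution whose weights are restricted to {-1, +1}, so every multiply-accumulate collapses into an add or a subtract. This is a NumPy illustration of the arithmetic only, not Hyperdrive's hardware datapath or streaming scheme.

        import numpy as np

        def bwn_conv2d(x, w_sign):
            # x: (H, W) feature map; w_sign: (k, k) weights in {-1, +1}.
            # With binary weights, each MAC is just an add or subtract.
            k = w_sign.shape[0]
            out = np.zeros((x.shape[0] - k + 1, x.shape[1] - k + 1))
            for i in range(out.shape[0]):
                for j in range(out.shape[1]):
                    patch = x[i:i + k, j:j + k]
                    out[i, j] = patch[w_sign > 0].sum() - patch[w_sign < 0].sum()
            return out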

    Fast emergency paths schema to overcome transient link failures in OSPF routing

    A reliable network infrastructure must be able to sustain traffic flows even when a failure occurs and changes the network topology. When a failure occurs, routing protocols such as OSPF take from hundreds of milliseconds to several seconds to converge. During this convergence period, packets might traverse a longer path or even a loop; even worse, packets may be dropped although their destinations are reachable. In this context, this paper describes a proactive fast rerouting approach, named Fast Emergency Paths Schema (FEP-S), to overcome problems caused by transient link failures in OSPF routing. Extensive experiments were conducted using several network topologies of varying dimensions. Results show that the recovery paths obtained by FEP-S are shorter than those from other rerouting approaches and can improve network reliability by reducing the packet loss rate during the failure-induced convergence of the routing protocol.
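
    The core idea of proactive fast rerouting can be sketched as precomputing, for every link of the primary shortest path, a backup path that avoids exactly that link. The sketch below uses plain Dijkstra over an adjacency dictionary; the data structures are illustrative assumptions, not FEP-S's actual schema.

        import heapq

        def dijkstra(adj, src, dst, banned=frozenset()):
            # Shortest path over {u: {v: cost}}, skipping banned links.
            dist, prev = {src: 0}, {}
            pq = [(0, src)]
            while pq:
                d, u = heapq.heappop(pq)
                if u == dst:
                    path = [dst]
                    while path[-1] != src:
                        path.append(prev[path[-1]])
                    return path[::-1]
                if d > dist.get(u, float("inf")):
                    continue
                for v, c in adj[u].items():
                    if (u, v) in banned or (v, u) in banned:
                        continue
                    if d + c < dist.get(v, float("inf")):
                        dist[v], prev[v] = d + c, u
                        heapq.heappush(pq, (d + c, v))
            return None

        def emergency_paths(adj, src, dst):
            # One precomputed backup path per primary-path link, each
            # avoiding exactly the link whose failure it protects against.
            primary = dijkstra(adj, src, dst)
            backups = {}
            for u, v in zip(primary, primary[1:]):
                backups[(u, v)] = dijkstra(adj, src, dst, banned={(u, v)})
            return primary, backups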

    High-performance low-Vcc in-order core

    Power density grows in new technology nodes, requiring Vcc to scale, especially in mobile platforms where energy is critical. This paper presents a novel approach to decreasing Vcc while keeping the operating frequency high. Our mechanism is referred to as immediate read after write (IRAW) avoidance. We propose an implementation of the mechanism for an Intel® Silverthorne™ in-order core. Furthermore, we show that our mechanism can be adapted dynamically to provide the highest performance and lowest energy-delay product (EDP) at each Vcc level. Results show that IRAW avoidance increases operating frequency by 57% at 500mV and 99% at 400mV with negligible area and power overhead (below 1%), which translates into large speedups (48% at 500mV and 90% at 400mV) and EDP reductions (0.61 EDP at 500mV and 0.33 at 400mV).
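
    As a software analogy for the mechanism (the paper implements it in hardware), the sketch below models an in-order issue stream and stalls any instruction that would read a register written in the immediately preceding cycle. The trace format and the window parameter are hypothetical, chosen only to illustrate the idea of IRAW avoidance.

        def schedule_with_iraw_avoidance(trace, window=1):
            # trace: list of (writes, reads) register-name tuples.
            # Delay a read of a register written within the last
            # `window` issue slots by inserting a bubble (None), so an
            # entry is never read immediately after being written.
            recent, out = [], []
            for writes, reads in trace:
                while any(r in s for s in recent[-window:] for r in reads):
                    out.append(None)      # pipeline bubble
                    recent.append(set())  # a bubble writes nothing
                out.append((writes, reads))
                recent.append(set(writes))
            return out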

    What lies beneath? The role of informal and hidden networks in the management of crises

    Crisis management research traditionally focuses on the role of formal communication networks in the escalation and management of organisational crises. Here, we consider instead informal and unobservable networks. The paper explores how hidden informal exchanges can impact organisational decision-making and performance, particularly around inter-agency working: knowledge distributed across and shared between organisations often travels through informal channels and is not captured effectively by formal decision-making processes. Early warnings and weak signals about potential risks and crises are therefore often missed. We consider the implications of these dynamics for crisis avoidance and crisis management.