SHADHO: Massively Scalable Hardware-Aware Distributed Hyperparameter Optimization
Computer vision is experiencing an AI renaissance, in which machine learning
models are expediting important breakthroughs in academic research and
commercial applications. Effectively training these models, however, is not
trivial due in part to hyperparameters: user-configured values that control a
model's ability to learn from data. Existing hyperparameter optimization
methods are highly parallel but make no effort to balance the search across
heterogeneous hardware or to prioritize searching high-impact spaces. In this
paper, we introduce a framework for massively Scalable Hardware-Aware
Distributed Hyperparameter Optimization (SHADHO). Our framework calculates the
relative complexity of each search space and monitors performance on the
learning task over all trials. These metrics are then used as heuristics to
assign hyperparameters to distributed workers based on their hardware. We first
demonstrate that our framework achieves double the throughput of a standard
distributed hyperparameter optimization framework by optimizing SVM for MNIST
using 150 distributed workers. We then conduct model search with SHADHO over
the course of one week using 74 GPUs across two compute clusters to optimize
U-Net for a cell segmentation task, discovering 515 models that achieve a lower
validation loss than standard U-Net.
Comment: 10 pages, 6 figures
Hyperdrive: A Multi-Chip Systolically Scalable Binary-Weight CNN Inference Engine
Deep neural networks have achieved impressive results in computer vision and
machine learning. Unfortunately, state-of-the-art networks are extremely
compute- and memory-intensive, which makes them unsuitable for mW-devices such as
IoT end-nodes. Aggressive quantization of these networks dramatically reduces
the computation and memory footprint. Binary-weight neural networks (BWNs)
follow this trend, pushing weight quantization to the limit. Hardware
accelerators for BWNs presented up to now have focused on core efficiency,
disregarding I/O bandwidth and system-level efficiency that are crucial for
deployment of accelerators in ultra-low power devices. We present Hyperdrive, a
BWN accelerator that dramatically reduces I/O bandwidth through a novel
binary-weight streaming approach. It supports convolutional neural network
architectures and input resolutions of arbitrary size by exploiting the natural
scalability of the compute units at both chip level and system level:
Hyperdrive chips are arranged systolically in a 2D mesh and process the entire
feature map together in parallel. Hyperdrive achieves 4.3 TOp/s/W system-level
efficiency (i.e., including I/Os), 3.1x higher than state-of-the-art BWN
accelerators, even though its core uses resource-intensive FP16 arithmetic for
increased robustness.
Fast emergency paths schema to overcome transient link failures in OSPF routing
A reliable network infrastructure must be able to sustain traffic flows, even
when a failure occurs and changes the network topology. After a failure,
routing protocols such as OSPF take from hundreds of milliseconds to several
seconds to converge. During this convergence period,
packets might traverse a longer path or even a loop. An even worse transient
behaviour is that packets are dropped even though destinations are reachable.
In this context, this paper describes a proactive fast rerouting approach,
named Fast Emergency Paths Schema (FEP-S), to overcome problems originating
from transient link failures in OSPF routing. Extensive experiments were run
on several network topologies with different dimensionality degrees. Results
show that the recovery paths obtained by FEP-S are shorter than those from
other rerouting approaches and can improve network reliability by reducing
the packet loss rate during the routing protocol convergence that follows a
failure.
Comment: 18 pages
High-performance low-Vcc in-order core
Power density grows in new technology nodes, thus requiring Vcc to scale, especially in mobile platforms where energy is critical. This paper presents a novel approach to decrease Vcc while keeping operating frequency high. Our mechanism is referred to as immediate read after write (IRAW) avoidance. We propose an implementation of the mechanism for an Intel® Silverthorne™ in-order core. Furthermore, we show that our mechanism can be adapted dynamically to provide the highest performance and lowest energy-delay product (EDP) at each Vcc level. Results show that IRAW avoidance increases operating frequency by 57% at 500mV and 99% at 400mV with negligible area and power overhead (below 1%), which translates into large speedups (48% at 500mV and 90% at 400mV) and EDP reductions (0.61 EDP at 500mV and 0.33 at 400mV).
Peer Reviewed. Postprint (published version)
What lies beneath? The role of informal and hidden networks in the management of crises
Crisis management research traditionally focuses on the role of formal communication networks in the escalation and management of organisational crises. Here, we consider instead informal and unobservable networks. The paper explores how hidden informal exchanges can affect organisational decision-making and performance, particularly around inter-agency working, because knowledge distributed across and between organisations is often shared through informal means and is not captured effectively by formal decision-making processes. Early warnings and weak signals about potential risks and crises are therefore often missed. We consider the implications of these dynamics in terms of crisis avoidance and crisis management.