Neural Networks Architecture Evaluation in a Quantum Computer
In this work, we propose a quantum algorithm to evaluate neural network
architectures, named Quantum Neural Network Architecture Evaluation (QNNAE). The
proposed algorithm is based on a quantum associative memory and the learning
algorithm for artificial neural networks. Unlike conventional algorithms for
evaluating neural network architectures, QNNAE does not depend on
initialization of weights. The proposed algorithm has a binary output and
returns 0 with probability proportional to the performance of the network.
Its computational cost is equal to the computational cost of training a neural
network.
An Energy and Performance Exploration of Network-on-Chip Architectures
In this paper, we explore the designs of a circuit-switched router, a wormhole router, a quality-of-service (QoS) supporting virtual channel router and a speculative virtual channel router and accurately evaluate the energy-performance tradeoffs they offer. Power results from the designs placed and routed in a 90-nm CMOS process show that all the architectures dissipate significant idle state power. The additional energy required to route a packet through the router is then shown to be dominated by the data path. This leads to the key result that, if this trend continues, the use of more elaborate control can be justified and will not be immediately limited by the energy budget. A performance analysis also shows that dynamic resource allocation leads to the lowest network latencies, while static allocation may be used to meet QoS goals. Combining the power and performance figures then allows an energy-latency product to be calculated to judge the efficiency of each of the networks. The speculative virtual channel router was shown to have a very similar efficiency to the wormhole router while providing better performance, supporting its use for general purpose designs. Finally, area metrics are also presented to allow a comparison of implementation costs.
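The energy-latency product mentioned in the abstract above can be illustrated with a minimal sketch. The router names, units, and numbers below are hypothetical placeholders, not figures from the paper; the only idea taken from the abstract is that energy and latency are multiplied to give a single efficiency metric, where lower is better.

```python
from dataclasses import dataclass


@dataclass
class RouterFigures:
    name: str
    energy_per_packet_pj: float  # hypothetical energy to route one packet (pJ)
    latency_ns: float            # hypothetical average network latency (ns)


def energy_latency_product(router):
    """Return energy multiplied by latency; a lower value means higher efficiency."""
    return router.energy_per_packet_pj * router.latency_ns


def rank_by_efficiency(routers):
    """Sort routers from most to least efficient by energy-latency product."""
    return sorted(routers, key=energy_latency_product)


# Placeholder values chosen only to exercise the functions.
routers = [
    RouterFigures("router_a", 10.0, 50.0),
    RouterFigures("router_b", 12.0, 40.0),
]
ranked = rank_by_efficiency(routers)
for r in ranked:
    print(r.name, energy_latency_product(r))
```

In this made-up example, router_b spends more energy per packet but still ranks first because its lower latency gives the smaller product.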
Mitigating Architectural Mismatch During the Evolutionary Synthesis of Deep Neural Networks
Evolutionary deep intelligence has recently shown great promise for producing
small, powerful deep neural network models via the organic synthesis of
increasingly efficient architectures over successive generations. Existing
evolutionary synthesis processes, however, have allowed the mating of parent
networks independent of architectural alignment, resulting in a mismatch of
network structures. We present a preliminary study into the effects of
architectural alignment during evolutionary synthesis using a gene tagging
system. Surprisingly, the network architectures synthesized using the gene
tagging approach resulted in slower decreases in performance accuracy and
storage size; however, the resultant networks were comparable in size and
performance accuracy to the non-gene tagging networks. Furthermore, we
speculate that there is a noticeable decrease in network variability for
networks synthesized with gene tagging, indicating that enforcing a
like-with-like mating policy potentially restricts the exploration of the
search space of possible network architectures.
Comment: 5 pages
SuperNeurons: Dynamic GPU Memory Management for Training Deep Neural Networks
Going deeper and wider in neural architectures improves accuracy, but the
limited GPU DRAM places an undesired restriction on the network design
domain. Deep Learning (DL) practitioners must either switch to less desirable
network architectures or nontrivially dissect a network across multiple GPUs.
Both options distract DL practitioners from concentrating on their original
machine learning tasks. We present SuperNeurons, a dynamic GPU memory
scheduling runtime that enables network training far beyond the GPU DRAM
capacity. SuperNeurons features three memory optimizations: Liveness Analysis,
Unified Tensor Pool, and Cost-Aware Recomputation. Together they reduce the
network-wide peak memory usage down to the maximal memory usage among layers.
We also address the performance issues in those memory-saving techniques. Given
the limited GPU DRAM, SuperNeurons not only provisions the necessary memory for
training, but also dynamically allocates memory for convolution workspaces to
achieve high performance. Evaluations against Caffe, Torch, MXNet and
TensorFlow have demonstrated that SuperNeurons trains networks at least 3.2432
times deeper than current ones with leading performance. In particular,
SuperNeurons can train ResNet2500 that has basic network layers on a 12GB K40c.
Comment: PPoPP '2018: 23rd ACM SIGPLAN Symposium on Principles and Practice of
Parallel Programming
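As a rough illustration of the general idea behind liveness analysis (freeing a tensor once nothing later needs it), the sketch below compares peak memory with and without freeing. It is a simplified, forward-only model with made-up tensor sizes and step indices, not SuperNeurons' actual runtime; the function names and the `(size, first_step, last_step)` tuple layout are assumptions for this example.

```python
def peak_memory_with_liveness(tensors):
    """Peak memory when each tensor is held only between its first and last use.

    tensors: list of (size, first_step, last_step) with inclusive step indices.
    """
    if not tensors:
        return 0
    final_step = max(last_step for _, _, last_step in tensors)
    peak = 0
    for step in range(final_step + 1):
        # Sum the sizes of all tensors that are live at this step.
        live = sum(size for size, first, last_step in tensors
                   if first <= step <= last_step)
        peak = max(peak, live)
    return peak


def peak_memory_without_freeing(tensors):
    """Peak memory when every tensor stays allocated until the end."""
    return sum(size for size, _, _ in tensors)


# Hypothetical 4-layer chain: each output is consumed only by the next layer.
chain = [(4, 0, 1), (8, 1, 2), (2, 2, 3), (6, 3, 3)]
print(peak_memory_with_liveness(chain))    # live sets per step: 4, 12, 10, 8
print(peak_memory_without_freeing(chain))  # all four tensors held at once
```

During training, forward activations are usually needed again in the backward pass, so liveness alone frees less than this toy chain suggests; the abstract lists it alongside two further optimizations.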
Semantic Video CNNs through Representation Warping
In this work, we propose a technique to convert CNN models for semantic
segmentation of static images into CNNs for video data. We describe a warping
method that can be used to augment existing architectures with very little
extra computational cost. This module is called NetWarp and we demonstrate its
use for a range of network architectures. The main design principle is to use
optical flow of adjacent frames for warping internal network representations
across time. A key insight of this work is that fast optical flow methods can
be combined with many different CNN architectures for improved performance and
end-to-end training. Experiments validate that the proposed approach incurs
little extra computational cost while improving performance when video
streams are available. We achieve new state-of-the-art results on the CamVid
and Cityscapes benchmark datasets and show consistent improvements over
different baseline networks. Our code and models will be available at
http://segmentation.is.tue.mpg.de
Comment: ICCV 201
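The core operation the abstract describes, warping a representation from an adjacent frame using optical flow, can be sketched generically as follows. This is not the NetWarp module itself: the flow convention (per-pixel `(dy, dx)` offsets pointing from the current frame into the previous one), border clamping, and single-channel list-of-lists layout are all assumptions for this example, and how the warped and current representations are combined is not covered.

```python
import math


def bilinear_sample(feat, y, x):
    """Sample a 2D feature map at fractional (y, x), clamping to the border."""
    h, w = len(feat), len(feat[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    top = feat[y0][x0] * (1 - wx) + feat[y0][x1] * wx
    bottom = feat[y1][x0] * (1 - wx) + feat[y1][x1] * wx
    return top * (1 - wy) + bottom * wy


def warp(prev_feat, flow):
    """Warp the previous frame's features into the current frame.

    flow[y][x] = (dy, dx): assumed offset from pixel (y, x) in the current
    frame to its source location in the previous frame.
    """
    h, w = len(prev_feat), len(prev_feat[0])
    return [[bilinear_sample(prev_feat, y + flow[y][x][0], x + flow[y][x][1])
             for x in range(w)]
            for y in range(h)]


prev_feat = [[0.0, 1.0, 2.0],
             [3.0, 4.0, 5.0]]
shift_right = [[(0.0, 1.0)] * 3 for _ in range(2)]  # every pixel looks one column right
warped = warp(prev_feat, shift_right)
print(warped)
```

Bilinear sampling is used here because it handles fractional flow values; a real implementation would operate on multi-channel tensors on the GPU.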