Learning Compact Recurrent Neural Networks with Block-Term Tensor Decomposition
Recurrent Neural Networks (RNNs) are powerful sequence modeling tools.
However, when dealing with high-dimensional inputs, training RNNs
becomes computationally expensive due to the large number of model parameters.
This hinders RNNs from solving many important computer vision tasks, such as
Action Recognition in Videos and Image Captioning. To overcome this problem, we
propose a compact and flexible structure, namely Block-Term tensor
decomposition, which greatly reduces the parameters of RNNs and improves their
training efficiency. Compared with alternative low-rank approximations, such as
tensor-train RNN (TT-RNN), our method, Block-Term RNN (BT-RNN), is not only
more concise (when using the same rank), but also able to attain a better
approximation to the original RNNs with far fewer parameters. On three
challenging tasks, including Action Recognition in Videos, Image Captioning and
Image Generation, BT-RNN outperforms TT-RNN and the standard RNN in terms of
both prediction accuracy and convergence rate. Specifically, BT-LSTM utilizes
17,388 times fewer parameters than the standard LSTM to achieve an accuracy
improvement of more than 15.6% in the Action Recognition task on the UCF11 dataset.
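The underlying construction, an input-to-hidden weight matrix refactored as a sum of small Tucker terms, can be illustrated with a short NumPy sketch. The 3-mode reshaping, the dimensions I and J, the ranks R and the number of terms below are illustrative assumptions, not the configuration reported in the paper.

```python
import numpy as np

# Minimal sketch of a Block-Term (sum-of-Tucker) factorised weight matrix.
# All shapes, ranks and names here are illustrative, not the paper's setup.
I = (4, 8, 8)      # input  dimension 256 = 4*8*8, split into 3 modes
J = (4, 4, 4)      # hidden dimension  64 = 4*4*4, split into 3 modes
R = (2, 2, 2)      # Tucker ranks of each block term
N_TERMS = 3        # number of block terms in the sum

rng = np.random.default_rng(0)

def random_term():
    # One Tucker term: a small core plus one factor matrix per fused (I_k, J_k) mode.
    core = rng.standard_normal(R)
    factors = [rng.standard_normal((I[k] * J[k], R[k])) for k in range(3)]
    return core, factors

def term_to_matrix(core, factors):
    # Contract each mode of the core with its factor matrix (Tucker reconstruction),
    # then split the fused modes and regroup them into a dense (256, 64) matrix.
    t = np.einsum('abc,ia,jb,kc->ijk', core, *factors)   # (I1*J1, I2*J2, I3*J3)
    t = t.reshape(I[0], J[0], I[1], J[1], I[2], J[2])
    t = t.transpose(0, 2, 4, 1, 3, 5)                    # input modes first, hidden modes last
    return t.reshape(np.prod(I), np.prod(J))

terms = [random_term() for _ in range(N_TERMS)]
W = sum(term_to_matrix(c, f) for c, f in terms)          # dense equivalent of the BT weight

bt_params = sum(c.size + sum(f.size for f in fs) for c, fs in terms)
print('dense parameters:', np.prod(I) * np.prod(J))      # 16384
print('block-term parameters:', bt_params)               # 504 with these toy shapes
```

With these toy shapes the block-term factors hold 504 parameters in place of the 16,384 entries of the dense matrix, which is the kind of compression the abstract describes; in a real BT-RNN the factors are trained directly and the dense matrix is never materialised.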
Compressing Recurrent Neural Networks with Tensor Ring for Action Recognition
Recurrent Neural Networks (RNNs) and their variants, such as Long Short-Term
Memory (LSTM) networks and Gated Recurrent Unit (GRU) networks, have achieved
promising performance in sequential data modeling. The hidden layers in RNNs
can be regarded as memory units that help store information from sequential
contexts. However, when dealing with high-dimensional input data,
such as video and text, the input-to-hidden linear transformation in RNNs
incurs high memory usage and huge computational cost, making the training
of RNNs difficult to scale. To address this challenge, we propose a novel
compact LSTM model, named TR-LSTM, which uses the low-rank tensor ring
decomposition (TRD) to reformulate the input-to-hidden transformation. Compared
with other tensor decomposition methods, TR-LSTM is more stable. In addition,
TR-LSTM supports end-to-end training and also provides a fundamental
building block for RNNs in handling large input data. Experiments on real-world
action recognition datasets have demonstrated the promising performance of the
proposed TR-LSTM compared with the tensor train LSTM and other state-of-the-art
competitors.
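The reformulated input-to-hidden transformation can be sketched briefly in NumPy: a closed ring of three-way cores whose traced product reconstructs the weight matrix. The mode sizes, ring ranks and variable names below are illustrative assumptions rather than the paper's configuration.

```python
import numpy as np

# Small sketch of a Tensor-Ring factorised input-to-hidden matrix.
# Shapes, ranks and names are illustrative assumptions, not the paper's setup.
I = (4, 8, 8)          # input  dimension 256, split into 3 modes
J = (4, 4, 4)          # hidden dimension  64, split into 3 modes
RANKS = (3, 3, 3)      # ring ranks r1, r2, r3 (the last rank wraps back to r1)

rng = np.random.default_rng(0)

# Core k has shape (r_k, I_k*J_k, r_{k+1}); the wrap-around closes the ring.
cores = [rng.standard_normal((RANKS[k], I[k] * J[k], RANKS[(k + 1) % 3]))
         for k in range(3)]

def tr_to_matrix(cores):
    # T(i, j, k) = trace( Z1[:, i, :] @ Z2[:, j, :] @ Z3[:, k, :] )
    t = np.einsum('aib,bjc,cka->ijk', *cores)            # one fused (I_k*J_k) mode per core
    t = t.reshape(I[0], J[0], I[1], J[1], I[2], J[2])
    t = t.transpose(0, 2, 4, 1, 3, 5)                    # input modes first, hidden modes last
    return t.reshape(np.prod(I), np.prod(J))

W = tr_to_matrix(cores)                                   # dense (256, 64) equivalent
x = rng.standard_normal(np.prod(I))
h = x @ W                                                 # the input-to-hidden transform

tr_params = sum(c.size for c in cores)
print('dense parameters:', np.prod(I) * np.prod(J))       # 16384
print('tensor-ring parameters:', tr_params)               # 720 with these toy shapes
```

The trace over the wrapped rank index is what distinguishes the tensor ring from a tensor-train factorisation, whose boundary ranks are fixed to one; in TR-LSTM the cores are trained directly and the dense matrix is never formed.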
Adversarial Noise Layer: Regularize Neural Network By Adding Noise
In this paper, we introduce a novel regularization method called Adversarial
Noise Layer (ANL) and its efficient version called Class Adversarial Noise
Layer (CANL), which significantly improve a CNN's generalization
ability by adding carefully crafted noise into the intermediate layer
activations. ANL and CANL can be easily implemented and integrated with most of
the mainstream CNN-based models. We compare the effects of different types of
noise and visually demonstrate that the proposed adversarial noise instructs
CNN models to extract cleaner feature maps, which further reduces the risk of
over-fitting. We also conclude that models trained with ANL or CANL are more
robust to adversarial examples generated by FGSM than models trained with
traditional adversarial training approaches.
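The abstract does not spell out how the noise is crafted, so the sketch below only illustrates the general flavour of the idea in PyTorch: perturb an intermediate activation in the direction of its loss gradient (FGSM-style) during training. The two-pass training step, the tiny model and the eps value are assumptions made for illustration, not the paper's ANL/CANL formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyCNN(nn.Module):
    """Toy CNN whose intermediate activation can be perturbed with crafted noise."""
    def __init__(self, num_classes=10, eps=0.1):
        super().__init__()
        self.eps = eps
        self.conv = nn.Conv2d(1, 8, kernel_size=3, padding=1)
        self.fc = nn.Linear(8 * 28 * 28, num_classes)

    def forward(self, x, noise=None):
        feat = F.relu(self.conv(x))
        if noise is not None:
            feat = feat + noise            # inject crafted noise into the activation
        return self.fc(feat.flatten(1)), feat

def train_step(model, x, y, optimizer):
    # Pass 1: obtain the gradient of the loss w.r.t. the intermediate feature map.
    logits, feat = model(x)
    feat.retain_grad()
    F.cross_entropy(logits, y).backward()
    noise = model.eps * feat.grad.sign()   # gradient-sign ("adversarial") noise

    # Pass 2: train on the noise-perturbed activations.
    optimizer.zero_grad()
    logits, _ = model(x, noise=noise.detach())
    loss = F.cross_entropy(logits, y)
    loss.backward()
    optimizer.step()
    return loss.item()

model = TinyCNN()
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
x, y = torch.randn(16, 1, 28, 28), torch.randint(0, 10, (16,))
print(train_step(model, x, y, opt))
```

Unlike classical adversarial training, which perturbs the input image, the perturbation here is applied to a hidden feature map, which is the layer-level placement the abstract emphasises.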
Superneurons: dynamic GPU memory management for training deep neural networks
Going deeper and wider in neural architectures improves their accuracy, while the limited GPU DRAM places an undesired restriction on the network design domain. Deep Learning (DL) practitioners either have to settle for less desirable network architectures or non-trivially dissect a network across multiple GPUs; both distract them from their original machine learning tasks. We present SuperNeurons: a dynamic GPU memory scheduling runtime that enables network training far beyond the GPU DRAM capacity. SuperNeurons features three memory optimizations, Liveness Analysis, Unified Tensor Pool, and Cost-Aware Recomputation; together they effectively reduce the network-wide peak memory usage down to the maximal memory usage among layers. We also address the performance issues in these memory-saving techniques. Given the limited GPU DRAM, SuperNeurons not only provisions the necessary memory for training, but also dynamically allocates memory for convolution workspaces to achieve high performance. Evaluations against Caffe, Torch, MXNet and TensorFlow demonstrate that SuperNeurons trains networks at least 3.2432 times deeper than current frameworks while delivering leading performance. In particular, SuperNeurons can train ResNet2500, which has 10^4 basic network layers, on a 12 GB K40c.
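SuperNeurons itself is a C++/CUDA runtime rather than a Python API, so its scheduler is not reproduced here; the short PyTorch sketch below only illustrates the compute-for-memory exchange that Cost-Aware Recomputation exploits, using the framework's stock gradient checkpointing as a stand-in. The layer sizes and the number of segments are arbitrary choices.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint_sequential

# A stack of 32 identical blocks stands in for a deep network.
layers = [nn.Sequential(nn.Linear(1024, 1024), nn.ReLU()) for _ in range(32)]
net = nn.Sequential(*layers)

x = torch.randn(64, 1024, requires_grad=True)

# Split the 32 blocks into 4 checkpointed segments: only the activations at
# segment boundaries stay resident; everything inside a segment is freed after
# the forward pass and recomputed on demand during backward.
out = checkpoint_sequential(net, 4, x, use_reentrant=False)
out.sum().backward()
print(x.grad.shape)          # gradients arrive as usual, at lower peak memory
```

This is the same trade-off the runtime automates at a much larger scale, with an added cost model that decides which layers are cheap enough to recompute and which should instead be managed through the Unified Tensor Pool.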
CEPC Conceptual Design Report: Volume 2 - Physics & Detector
The Circular Electron Positron Collider (CEPC) is a large international scientific facility proposed by the Chinese particle physics community to explore the Higgs boson and provide critical tests of the underlying fundamental physics principles of the Standard Model that might reveal new physics. The CEPC, to be hosted in China in a circular underground tunnel of approximately 100 km in circumference, is designed to operate as a Higgs factory producing electron-positron collisions with a center-of-mass energy of 240 GeV. The collider will also operate at around 91.2 GeV, as a Z factory, and at the WW production threshold (around 160 GeV). The CEPC will produce close to one trillion Z bosons, 100 million W bosons and over one million Higgs bosons. The vast number of bottom quarks, charm quarks and tau-leptons produced in the decays of the Z bosons also makes the CEPC an effective B-factory and tau-charm factory. The CEPC will have two interaction points where two large detectors will be located. This document is the second volume of the CEPC Conceptual Design Report (CDR). It presents the physics case for the CEPC, describes conceptual designs of possible detectors and their technological options, highlights the expected detector and physics performance, and discusses future plans for detector R&D and physics investigations. The final CEPC detectors will be proposed and built by international collaborations, but they are likely to be composed of the detector technologies included in the conceptual designs described in this document. A separate volume, Volume I, recently released, describes the design of the CEPC accelerator complex, its associated civil engineering, and strategic alternative scenarios.