Optimization on fixed low latency implementation of GBT protocol in FPGA
In the upgrade of the ATLAS experiment, the front-end electronics components are exposed to a large radiation background, and high-speed optical links are required for data transmission between the on-detector and off-detector electronics. The GBT architecture and the Versatile Link (VL) project were developed at CERN to support a bidirectional high-speed link at a 4.8 Gbps line rate, known as the GBT link. In the ATLAS upgrade, besides the link to the on-detector electronics, the GBT link is also used between different off-detector systems. The GBTX ASIC is designed for the on-detector front-end; correspondingly, for the off-detector electronics the GBT architecture is implemented in Field Programmable Gate Arrays (FPGAs). CERN launched the GBT-FPGA project to provide reference implementations for different types of FPGAs. In the ATLAS upgrade framework, the Front-End LInk eXchange (FELIX) system interfaces the front-end electronics of several ATLAS subsystems. The GBT link is used between them to transfer the detector data as well as the timing, trigger, control, and monitoring information. The trigger signal distributed on the down-link from FELIX to the front-end requires a fixed and low latency. In this paper, several optimizations of the GBT-FPGA IP core are introduced to achieve a lower fixed latency. For FELIX, a common firmware will be used to interface different front-ends, with support for both GBT modes: the forward error correction (FEC) mode and the wide mode. The modified GBT-FPGA core can switch between the GBT modes without FPGA reprogramming. The system clock distribution of the multi-channel FELIX firmware is also discussed in this paper.
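For orientation, here is a back-of-the-envelope check of the bandwidth arithmetic behind the two GBT modes named above. The frame layout comes from the public GBT specification (120-bit frames at the 40 MHz LHC clock), not from this abstract, so treat it as background rather than part of the paper's contribution:

```python
# GBT frame arithmetic (public GBT specification values, not from this paper).
# One 120-bit frame is transmitted per 40 MHz LHC clock cycle -> 4.8 Gbps.
FRAME_BITS = 120        # 4 header + 4 slow-control + 112 payload/parity bits
FRAME_RATE_HZ = 40e6    # one frame per 25 ns bunch-crossing clock tick

line_rate = FRAME_BITS * FRAME_RATE_HZ     # 4.80 Gbps line rate
fec_payload = 80 * FRAME_RATE_HZ           # FEC mode: 32 bits hold Reed-Solomon parity
wide_payload = 112 * FRAME_RATE_HZ         # wide mode: parity field reused for data

print(f"line rate     : {line_rate / 1e9:.2f} Gbps")     # 4.80
print(f"FEC-mode data : {fec_payload / 1e9:.2f} Gbps")   # 3.20
print(f"wide-mode data: {wide_payload / 1e9:.2f} Gbps")  # 4.48
```

The trade-off the common FELIX firmware must support is visible in the numbers: FEC mode sacrifices 32 bits per frame of user bandwidth for error correction, while wide mode reclaims them as payload.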
DropIT: Dropping Intermediate Tensors for Memory-Efficient DNN Training
A standard hardware bottleneck when training deep neural networks is GPU memory. The bulk of memory is occupied by caching intermediate tensors for gradient computation in the backward pass. We propose a novel method to reduce this footprint: Dropping Intermediate Tensors (DropIT). DropIT drops the min-k elements of the intermediate tensors and approximates gradients from the sparsified tensors in the backward pass. Theoretically, DropIT reduces noise on the estimated gradients and therefore has a higher rate of convergence than vanilla SGD. Experiments show that we can drop up to 90% of the intermediate tensor elements in fully-connected and convolutional layers while achieving higher test accuracy for Visual Transformers and Convolutional Neural Networks on various tasks (e.g., classification, object detection). Our code and models are available at https://github.com/chenjoya/dropit
Comment: 16 pages. DropIT can save memory and improve accuracy, offering dropping as a new perspective on activation-compressed training, distinct from quantization.
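Since the abstract describes the mechanism but not an implementation, here is a minimal PyTorch sketch of the idea for a linear layer (a hypothetical illustration, not the authors' code from the linked repository; the names DropITLinear and keep_ratio are invented for this example). It keeps only the top-k magnitude elements of the input activation for the backward pass and approximates the weight gradient from that sparsified cache:

```python
import torch

class DropITLinear(torch.autograd.Function):
    """Linear op that caches only the top-k magnitude input elements."""

    @staticmethod
    def forward(ctx, x, weight, keep_ratio):
        # Keep the k largest-magnitude elements of x; drop the "min-k" rest.
        flat = x.reshape(-1)
        k = max(1, int(keep_ratio * flat.numel()))
        _, idx = flat.abs().topk(k)
        ctx.save_for_backward(weight)
        ctx.idx, ctx.vals = idx, flat[idx]          # sparsified cache of x
        ctx.x_shape, ctx.numel = x.shape, flat.numel()
        return x @ weight.t()

    @staticmethod
    def backward(ctx, grad_out):
        (weight,) = ctx.saved_tensors
        # Rebuild a sparse approximation of the cached input x.
        flat = torch.zeros(ctx.numel, device=grad_out.device,
                           dtype=grad_out.dtype)
        flat[ctx.idx] = ctx.vals
        x_approx = flat.reshape(ctx.x_shape)
        grad_x = grad_out @ weight                  # exact: does not need x
        grad_w = grad_out.t() @ x_approx            # approximated from sparse x
        return grad_x, grad_w, None                 # no grad for keep_ratio

# Usage (assumes 2D activations for simplicity):
x = torch.randn(8, 64, requires_grad=True)
w = torch.randn(32, 64, requires_grad=True)
y = DropITLinear.apply(x, w, 0.1)                  # cache 10% of x for backward
y.sum().backward()
```

Note that this sketch still materializes the index and value buffers for clarity; an actual memory saving requires freeing the dense activation and storing the sparse cache compactly, as the real implementation presumably does.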
Influence of initial microcracks on the dynamic mechanical characteristics of sandstone
- …