Reveal flocking of birds flying in fog by machine learning
We study the first-order flocking transition of birds flying in
low-visibility conditions by employing three different representative types of
neural network (NN) based machine learning architectures that are trained via
either an unsupervised learning approach called "learning by confusion" or a
widely used supervised learning approach. We find that after the training via
either the unsupervised learning approach or the supervised learning one, all
of these three different representative types of NNs, namely, the
fully-connected NN, the convolutional NN, and the residual NN, are able to
successfully identify the first-order flocking transition point of this
nonequilibrium many-body system. This indicates that NN based machine learning
can be employed as a promising generic tool to investigate rich physics in
scenarios associated with first-order phase transitions and nonequilibrium
many-body systems. Comment: 7 pages, 3 figures
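The "learning by confusion" idea mentioned in this abstract can be illustrated with a toy sketch. This is not the authors' setup: the data are synthetic, the neural network is swapped for a one-threshold classifier, and the transition location (index 12), the scalar "order parameter" feature, and the name `confusion_curve` are all assumptions made for illustration. For each trial transition point the samples are relabeled, a classifier is fitted, and its accuracy is recorded; the interior peak of the accuracy curve marks the estimated transition.

```python
def confusion_curve(params, features, trial_points):
    """Best training accuracy of a one-threshold classifier for each trial labeling."""
    curve = []
    # Candidate thresholds: one below every feature, plus each distinct feature value.
    candidates = [min(features) - 1.0] + sorted(set(features))
    for trial in trial_points:
        # Relabel: samples above the trial point are assumed to be in phase 1.
        labels = [1 if p > trial else 0 for p in params]
        best = 0.0
        for t in candidates:
            preds = [1 if x > t else 0 for x in features]
            acc = sum(int(a == b) for a, b in zip(preds, labels)) / len(labels)
            best = max(best, acc)
        curve.append(best)
    return curve


# Synthetic data (assumption): control-parameter indices 0..19 and a feature
# that jumps from 0 to 1 at index 12, mimicking a first-order transition.
params = list(range(20))
features = [0.0 if p < 12 else 1.0 for p in params]

# Trial transition points between neighbouring indices, including both
# trivial end points where every sample receives the same label.
trial_points = [k - 0.5 for k in range(21)]
curve = confusion_curve(params, features, trial_points)

# The end points are trivially perfect, so look for the peak in the interior.
interior = curve[1:-1]
estimate = trial_points[1 + interior.index(max(interior))]
print(estimate)
```

In this toy case the accuracy is perfect at both trivial end points and at the trial point 11.5, which sits between the last sample of one phase and the first sample of the other, giving the characteristic W-shaped curve.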
Recursive Generalization Transformer for Image Super-Resolution
Transformer architectures have exhibited remarkable performance in image
super-resolution (SR). Because of the quadratic computational complexity of
self-attention (SA) in Transformers, existing methods tend to adopt SA in a
local region to reduce overhead. However, the local design restricts the
global context exploitation, which is crucial for accurate image
reconstruction. In this work, we propose the Recursive Generalization
Transformer (RGT) for image SR, which can capture global spatial information
and is suitable for high-resolution images. Specifically, we propose the
recursive-generalization self-attention (RG-SA). It recursively aggregates
input features into representative feature maps, and then utilizes
cross-attention to extract global information. Meanwhile, the channel
dimensions of attention matrices (query, key, and value) are further scaled to
mitigate the redundancy in the channel domain. Furthermore, we combine the
RG-SA with local self-attention to enhance the exploitation of the global
context, and propose the hybrid adaptive integration (HAI) for module
integration. The HAI allows the direct and effective fusion between features at
different levels (local or global). Extensive experiments demonstrate that our
RGT outperforms recent state-of-the-art methods quantitatively and
qualitatively. Code is released at https://github.com/zhengchen1999/RGT.
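The complexity argument in this abstract can be made concrete with a rough count of attention scores. The sketch below is not taken from the RGT code: the function name, the window size, and the assumption that the aggregated representative map is held at a fixed small side length are all hypothetical and serve only to show how the three designs scale with image size.

```python
def attention_score_counts(height, width, window, reduced_side):
    """Count query-key scores for three attention layouts on a height x width feature map."""
    tokens = height * width
    # Global self-attention: every token attends to every token (quadratic).
    global_sa = tokens * tokens
    # Local self-attention: every token attends only within its window.
    local_sa = tokens * (window * window)
    # Cross-attention: every token attends to a small aggregated map
    # (assumed here to have a fixed reduced_side x reduced_side size).
    cross = tokens * (reduced_side * reduced_side)
    return global_sa, local_sa, cross


small = attention_score_counts(64, 64, window=8, reduced_side=8)
large = attention_score_counts(128, 128, window=8, reduced_side=8)
print(small)
print(large)
```

Under these assumptions, doubling the side length multiplies the global count by 16 but the local and cross-attention counts by only 4, which is why attending to a compact summary can keep global context affordable on high-resolution inputs.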
Loss Scaling and Step Size in Deep Learning Optimization
Deep learning training consumes ever-increasing time and resources, due to the complexity of the model, the number of updates needed to reach good results, and both the amount and dimensionality of the data. In this dissertation, we focus on making training more efficient by focusing on the step size to reduce the number of computations for parameters in each update. We achieve this objective in two new ways: we use loss scaling as a proxy for the learning rate, and we use learnable layer-wise optimizers. Although our work is perhaps not the first to point to the equivalence of loss scaling and learning rate in deep learning optimization, ours is the first to leverage this relationship towards more efficient training. We not only use it in simple gradient descent, but also extend it to other adaptive algorithms. Finally, we use metalearning to shed light on relevant aspects, including learnable losses and optimizers. In this regard, we developed a novel learnable optimizer and effectively utilized it to acquire an adaptive rescaling factor and learning rate, resulting in a significant reduction in required memory during training.
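The equivalence between loss scaling and learning rate that this abstract refers to can be checked for plain gradient descent with a small sketch. This is not the dissertation's method: the quadratic loss, the function names, and the numeric values are assumptions chosen for illustration, and the sketch covers only simple gradient descent, not the adaptive algorithms the abstract also mentions. Multiplying the loss by a factor s multiplies its gradient by s, so a step with learning rate lr on the scaled loss equals a step with learning rate lr * s on the original loss.

```python
def grad_quadratic(w):
    """Gradient of the illustrative loss L(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)


def gradient_descent(grad_fn, w0, lr, loss_scale, steps):
    """Plain gradient descent on the loss multiplied by loss_scale."""
    w = w0
    for _ in range(steps):
        # Scaling the loss scales its gradient by the same factor.
        scaled_grad = loss_scale * grad_fn(w)
        w = w - lr * scaled_grad
    return w


# Run A: larger learning rate combined with a down-scaled loss.
w_scaled = gradient_descent(grad_quadratic, w0=0.0, lr=0.4, loss_scale=0.25, steps=50)
# Run B: unscaled loss with learning rate 0.4 * 0.25 = 0.1.
w_plain = gradient_descent(grad_quadratic, w0=0.0, lr=0.1, loss_scale=1.0, steps=50)

print(w_scaled, w_plain)
```

Both runs follow the same trajectory and approach the minimizer at w = 3, which shows in the simplest setting why a loss scale can stand in for a learning rate.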