2,890 research outputs found

    Reveal flocking of birds flying in fog by machine learning

    We study the first-order flocking transition of birds flying in low-visibility conditions by employing three representative types of neural-network (NN) based machine learning architectures, trained either via an unsupervised approach called "learning by confusion" or via a widely used supervised learning approach. We find that, after training with either approach, all three NN types, namely the fully-connected NN, the convolutional NN, and the residual NN, successfully identify the first-order flocking transition point of this nonequilibrium many-body system. This indicates that NN-based machine learning is a promising generic tool for investigating the rich physics of first-order phase transitions and nonequilibrium many-body systems. Comment: 7 pages, 3 figures
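    The "learning by confusion" scheme mentioned in this abstract can be illustrated with a short sketch: for each trial transition point, the data are relabeled by which side of that point their control parameter falls on, a small classifier is trained, and the accuracy curve over trial points peaks near the true transition. The code below is a minimal illustration with synthetic data and a hypothetical fully-connected network; it is not the authors' code, and all sizes and names are assumptions.

```python
# Minimal sketch of the "learning by confusion" scheme (hypothetical data,
# network sizes, and function names; not the paper's implementation).
import torch
import torch.nn as nn

def confusion_scan(configs, params, candidates, epochs=50):
    """configs: (N, D) flock snapshots; params: (N,) control parameter
    (e.g. a visibility/noise level); candidates: trial transition points."""
    accuracies = []
    for c in candidates:
        labels = (params > c).long()               # relabel data around trial point c
        net = nn.Sequential(nn.Linear(configs.shape[1], 64),
                            nn.ReLU(), nn.Linear(64, 2))
        opt = torch.optim.Adam(net.parameters(), lr=1e-3)
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(epochs):
            opt.zero_grad()
            loss = loss_fn(net(configs), labels)
            loss.backward()
            opt.step()
        with torch.no_grad():
            acc = (net(configs).argmax(1) == labels).float().mean().item()
        accuracies.append(acc)
    return accuracies       # central peak of the accuracy curve ~ transition point

# Toy usage with synthetic data: 500 snapshots of dimension 32.
configs = torch.randn(500, 32)
params = torch.rand(500)
accs = confusion_scan(configs, params, candidates=torch.linspace(0.1, 0.9, 9))
```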

    Recursive Generalization Transformer for Image Super-Resolution

    Transformer architectures have exhibited remarkable performance in image super-resolution (SR). Because of the quadratic computational complexity of self-attention (SA), existing methods tend to restrict SA to a local region to reduce overhead. However, the local design limits the exploitation of global context, which is crucial for accurate image reconstruction. In this work, we propose the Recursive Generalization Transformer (RGT) for image SR, which can capture global spatial information and is suitable for high-resolution images. Specifically, we propose recursive-generalization self-attention (RG-SA). It recursively aggregates input features into representative feature maps and then utilizes cross-attention to extract global information. Meanwhile, the channel dimensions of the attention matrices (query, key, and value) are scaled down to mitigate redundancy in the channel domain. Furthermore, we combine RG-SA with local self-attention to better exploit the global context, and propose hybrid adaptive integration (HAI) for module integration. HAI allows direct and effective fusion of features at different levels (local or global). Extensive experiments demonstrate that our RGT outperforms recent state-of-the-art methods quantitatively and qualitatively. Code is released at https://github.com/zhengchen1999/RGT. Comment: Code is available at https://github.com/zhengchen1999/RGT
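    As a rough illustration of the recursive-generalization self-attention idea described in this abstract (not the released implementation; see the linked repository for the authors' code), the sketch below recursively pools the token sequence into a small set of representative features, shrinks the key/value channel dimension, and lets full-resolution queries cross-attend to the compressed map. All layer sizes, the pooling operator, and the class name are assumptions.

```python
# Simplified sketch of recursive-generalization self-attention (assumed
# design; the authors' RG-SA differs in detail).
import torch
import torch.nn as nn

class RecursiveGeneralizationAttention(nn.Module):
    def __init__(self, dim, heads=4, channel_scale=0.5, pool_steps=2):
        super().__init__()
        kv_dim = int(dim * channel_scale)        # shrink K/V channels to cut redundancy
        self.pool_steps = pool_steps
        self.to_q = nn.Linear(dim, dim)
        self.to_kv = nn.Linear(dim, 2 * kv_dim)
        self.attn = nn.MultiheadAttention(dim, heads, kdim=kv_dim, vdim=kv_dim,
                                          batch_first=True)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):                         # x: (B, H*W, C) flattened feature map
        # Recursively aggregate tokens into a small set of representative features.
        rep = x
        for _ in range(self.pool_steps):
            rep = nn.functional.avg_pool1d(rep.transpose(1, 2), kernel_size=4,
                                           stride=4).transpose(1, 2)
        k, v = self.to_kv(rep).chunk(2, dim=-1)
        # Cross-attention: full-resolution queries attend to the compressed map,
        # giving every position access to global context at reduced cost.
        out, _ = self.attn(self.to_q(x), k, v)
        return self.proj(out)

x = torch.randn(1, 64 * 64, 96)                   # e.g. a 64x64 feature map, 96 channels
y = RecursiveGeneralizationAttention(96)(x)       # output has the same shape as x
```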

    Loss Scaling and Step Size in Deep Learning Optimization

    Deep learning training consumes ever-increasing time and resources, due to model complexity, the number of updates required to reach good results, and both the amount and dimensionality of the data. In this dissertation, we make training more efficient by focusing on the step size, so as to reduce the number of computations per parameter update. We achieve this in two new ways: we use loss scaling as a proxy for the learning rate, and we use learnable layer-wise optimizers. Although our work is perhaps not the first to point out the equivalence of loss scaling and learning rate in deep learning optimization, it is the first to leverage this relationship for more efficient training. We apply it not only to simple gradient descent but also extend it to other adaptive algorithms. Finally, we use metalearning to shed light on relevant aspects, including learnable losses and optimizers. In this regard, we develop a novel learnable optimizer and use it to acquire an adaptive rescaling factor and learning rate, resulting in a significant reduction in the memory required during training.
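    The equivalence of loss scaling and learning rate that this abstract relies on can be checked directly for vanilla SGD: scaling the loss by a factor s scales the gradient by s, so one step with rate lr on the scaled loss matches one step with rate s*lr on the unscaled loss. The snippet below is a minimal numerical check of that identity, not the dissertation's method; for adaptive optimizers the normalization changes the picture, which is part of what the work extends the idea to.

```python
# Numerical check: loss scaling acts as a proxy for the learning rate in
# plain SGD (illustrative toy example only).
import torch

def sgd_step(w0, loss_scale, lr):
    w = w0.clone().requires_grad_(True)
    loss = loss_scale * (w ** 2).sum()            # scaled quadratic loss
    loss.backward()
    with torch.no_grad():
        return w - lr * w.grad                    # one vanilla SGD update

w0 = torch.tensor([1.0, -2.0, 3.0])
a = sgd_step(w0, loss_scale=8.0, lr=0.01)         # scale the loss by 8
b = sgd_step(w0, loss_scale=1.0, lr=0.08)         # or scale the step size by 8
print(torch.allclose(a, b))                       # True: identical parameter updates
```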