
    Global Momentum Compression for Sparse Communication in Distributed SGD

    With the rapid growth of data, distributed stochastic gradient descent (DSGD) has been widely used for solving large-scale machine learning problems. Because of network latency and limited bandwidth, communication has become the bottleneck of DSGD when training large-scale models such as deep neural networks. Communication compression with sparsified gradients, abbreviated as sparse communication, has been widely used to reduce the communication cost of DSGD. A recent method, deep gradient compression (DGC), combines memory gradient and momentum SGD for sparse communication. DGC has achieved promising performance in practice, but a theory of its convergence is lacking. In this paper, we propose a novel method, called global momentum compression (GMC), for sparse communication in DSGD. GMC also combines memory gradient and momentum SGD, but unlike DGC, which adopts local momentum, GMC adopts global momentum. We theoretically prove the convergence rate of GMC for both convex and non-convex problems. To the best of our knowledge, this is the first work that proves the convergence of distributed momentum SGD (DMSGD) with sparse communication and memory gradient. Empirical results show that, compared with its DMSGD counterpart without sparse communication, GMC can reduce the communication cost by approximately 100-fold without loss of generalization accuracy. GMC can also achieve comparable (sometimes better) performance compared with DGC, with an extra theoretical guarantee.
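The abstract above refers to sparse communication with a memory gradient. As a rough illustration only, the following sketch shows one common form of that idea: each worker sends only its largest-magnitude gradient entries and keeps the unsent remainder in a local memory for later rounds. This is not the GMC or DGC algorithm from the paper; momentum is omitted entirely, and the names `top_k_sparsify` and `MemoryGradientWorker` are hypothetical, introduced here for illustration.

```python
def top_k_sparsify(vector, k):
    """Keep the k largest-magnitude entries of `vector`; zero out the rest."""
    if k <= 0:
        return [0.0] * len(vector)
    # Indices of the k entries with the largest absolute value.
    order = sorted(range(len(vector)), key=lambda i: abs(vector[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(vector)]


class MemoryGradientWorker:
    """Hypothetical worker that stores unsent gradient mass in a local memory."""

    def __init__(self, dim, k):
        self.memory = [0.0] * dim  # residual carried over from earlier rounds
        self.k = k                 # number of entries communicated per round

    def compress(self, gradient):
        # Add the residual left over from previous rounds to the new gradient.
        accumulated = [m + g for m, g in zip(self.memory, gradient)]
        # Only the top-k entries would be sent over the network.
        sparse = top_k_sparsify(accumulated, self.k)
        # Whatever was not sent stays in memory for a later round.
        self.memory = [a - s for a, s in zip(accumulated, sparse)]
        return sparse


worker = MemoryGradientWorker(dim=4, k=1)
first = worker.compress([0.5, -2.0, 0.25, 0.125])
second = worker.compress([0.25, 0.0, 0.0, 0.0])
print(first, second, worker.memory)
```

In this sketch the small entries are delayed rather than discarded: the 0.5 left unsent in the first round is added to the next gradient and transmitted in the second round.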

    The Effects of a Non-Ferroelectric Slab on the Polarization and the Susceptibility of the Ferroelectric Multilayer

    The polarization and the susceptibility of a ferroelectric multilayer with a non-ferroelectric slab are investigated within the framework of the transverse Ising model with a four-spin interaction term. The effects of the thickness and the position of the non-ferroelectric slab are investigated. We find that increasing the thickness of the non-ferroelectric slab decreases the polarization and the susceptibility of the film. If the position of the non-ferroelectric slab shifts from the center of the film to the surface, the number of peaks in the susceptibility changes, and a step-like polarization curve is found. Comment: 15 pages, 4 figures

    Stability studies of ZnO and AlN thin film acoustic wave devices in acid and alkali harsh environments

    Surface acoustic wave (SAW) devices based on piezoelectric thin films such as ZnO and AlN are widely used in sensing, microfluidics and lab-on-a-chip applications. However, in many of these applications the SAW devices will inevitably be used in harsh acid or alkali environments, which may cause early failure. In this work, we investigated the behavior and degradation mechanisms of thin-film-based SAW devices in harsh acid and alkali environments. Results show that under acid and alkali attack, chemical reaction and corrosion of ZnO devices are very fast (usually within 45 s). During corrosion, the crystalline orientation of the ZnO film does not change, but its grain defects increase significantly and its grain sizes decrease. The velocity of ZnO-based SAW devices decreases because of the porous structures formed by the chemical reactions. Although an AlN thin-film-based SAW device does not perform well in acid–alkali conditions, it might be able to maintain normal performance without obvious degradation for more than ten hours in acid or alkali solutions. This work could provide guidance for the application of both ZnO- and AlN-based SAW devices in harsh acid/alkali environments.

    Contrastive Attention for Automatic Chest X-ray Report Generation

    Recently, chest X-ray report generation, which aims to automatically generate descriptions of given chest X-ray images, has received growing research interest. The key challenge of chest X-ray report generation is to accurately capture and describe the abnormal regions. In most cases, the normal regions dominate the entire chest X-ray image, and the corresponding descriptions of these normal regions dominate the final report. Because of this data bias, learning-based models may fail to attend to abnormal regions. In this work, to effectively capture and describe abnormal regions, we propose the Contrastive Attention (CA) model. Instead of focusing solely on the current input image, the CA model compares the current input image with normal images to distill the contrastive information. The acquired contrastive information can better represent the visual features of abnormal regions. In experiments on the public IU-X-ray and MIMIC-CXR datasets, incorporating our CA into several existing models boosts their performance across most metrics. In addition, our analysis shows that the CA model can help existing models better attend to the abnormal regions and provide more accurate descriptions, which are crucial for an interpretable diagnosis. Specifically, we achieve state-of-the-art results on the two public datasets. Comment: Appears in Findings of ACL 2021 (The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021))