118 research outputs found

    Sparse-SignSGD with Majority Vote for Communication-Efficient Distributed Learning

    The training efficiency of complex deep learning models can be significantly improved through the use of distributed optimization. However, this process is often hindered by a large amount of communication cost between workers and a parameter server during iterations. To address this bottleneck, in this paper, we present a new communication-efficient algorithm that offers the synergistic benefits of both sparsification and sign quantization, called $\mathsf{S}^3$GD-MV. The workers in $\mathsf{S}^3$GD-MV select the top-$K$ magnitude components of their local gradient vector and only send the signs of these components to the server. The server then aggregates the signs and returns the results via a majority vote rule. Our analysis shows that, under certain mild conditions, $\mathsf{S}^3$GD-MV can converge at the same rate as signSGD while significantly reducing communication costs, if the sparsification parameter $K$ is properly chosen based on the number of workers and the size of the deep learning model. Experimental results using both independent and identically distributed (IID) and non-IID datasets demonstrate that $\mathsf{S}^3$GD-MV attains higher accuracy than signSGD, significantly reducing communication costs. These findings highlight the potential of $\mathsf{S}^3$GD-MV as a promising solution for communication-efficient distributed optimization in deep learning. Comment: 13 pages, 7 figures
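
The worker and server steps described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the learning rate `lr`, the plain sign-descent update, and the choice to return 0 on a tied vote are all assumptions, since the abstract does not specify them.

```python
from typing import List


def sparse_sign(grad: List[float], k: int) -> List[int]:
    """Worker step (assumed form): signs of the k largest-magnitude entries, 0 elsewhere."""
    # Indices sorted by descending magnitude; ties keep their original order.
    top = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)[:k]
    message = [0] * len(grad)
    for i in top:
        message[i] = 1 if grad[i] > 0 else -1 if grad[i] < 0 else 0
    return message


def majority_vote(messages: List[List[int]]) -> List[int]:
    """Server step: sum the sign votes per coordinate and return the sign of the sum."""
    dim = len(messages[0])
    totals = [sum(m[i] for m in messages) for i in range(dim)]
    # Assumption: a tied (or empty) vote yields 0, i.e. no update on that coordinate.
    return [1 if t > 0 else -1 if t < 0 else 0 for t in totals]


def s3gd_mv_step(params: List[float], worker_grads: List[List[float]],
                 k: int, lr: float) -> List[float]:
    """One hypothetical update: move each parameter against the voted sign."""
    votes = majority_vote([sparse_sign(g, k) for g in worker_grads])
    return [p - lr * v for p, v in zip(params, votes)]


# Three workers, four parameters, each worker sends only K = 2 signs.
grads = [
    [0.5, -2.0, 0.1, 3.0],
    [-0.4, -1.0, 0.2, 2.0],
    [1.5, 0.3, -0.2, -0.1],
]
new_params = s3gd_mv_step([0.0, 0.0, 0.0, 0.0], grads, k=2, lr=0.1)
```

In this toy run each worker transmits two signs plus their positions instead of four floating-point values, which is the source of the communication saving the abstract refers to.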

    Bi-directional Contrastive Learning for Domain Adaptive Semantic Segmentation

    We present a novel unsupervised domain adaptation method for semantic segmentation that generalizes a model trained with source images and corresponding ground-truth labels to a target domain. A key to domain adaptive semantic segmentation is to learn domain-invariant and discriminative features without target ground-truth labels. To this end, we propose a bi-directional pixel-prototype contrastive learning framework that minimizes intra-class variations of features for the same object class, while maximizing inter-class variations for different ones, regardless of domains. Specifically, our framework aligns pixel-level features in the target image with a prototype of the same object class in the source image (i.e., positive pairs), sets them apart for different classes (i.e., negative pairs), and performs the same alignment and separation in the other direction, with pixel-level features in the source image and a prototype in the target image. The cross-domain matching encourages domain-invariant feature representations, while the bidirectional pixel-prototype correspondences aggregate features for the same object class, providing discriminative features. To establish training pairs for contrastive learning, we propose to generate dynamic pseudo labels of target images using a non-parametric label transfer, that is, pixel-prototype correspondences across different domains. We also present a calibration method that gradually compensates for class-wise domain biases of prototypes during training. Comment: Accepted to ECCV 202

    FRED: Towards a Full Rotation-Equivariance in Aerial Image Object Detection

    Rotation-equivariance is an essential yet challenging property in oriented object detection. While general object detectors naturally gain robustness to spatial shifts from the translation-equivariance of conventional CNNs, achieving rotation-equivariance remains an elusive goal. Current detectors deploy various alignment techniques to derive rotation-invariant features, but still rely on high-capacity models and heavy data augmentation with all possible rotations. In this paper, we introduce a Fully Rotation-Equivariant Oriented Object Detector (FRED), whose entire process from the image to the bounding box prediction is strictly equivariant. Specifically, we decouple the invariant task (object classification) and the equivariant task (object localization) to achieve end-to-end equivariance. We represent the bounding box as a set of rotation-equivariant vectors to implement rotation-equivariant localization. Moreover, we use these rotation-equivariant vectors as offsets in the deformable convolution, thereby enhancing the existing advantages of spatial adaptation. Leveraging full rotation-equivariance, FRED demonstrates higher robustness to image-level rotation than existing methods. Furthermore, we show through our experiments that FRED is one step closer to non-axis-aligned learning. Compared to state-of-the-art methods, our proposed method delivers comparable performance on DOTA-v1.0 and outperforms them by 1.5 mAP on DOTA-v1.5, all while significantly reducing the model parameters to 16%. Comment: Accepted to the 38th Annual AAAI Conference on Artificial Intelligence (AAAI24), Vancouver, British Columbia, 202

    Strength can be controlled by edge dislocations in refractory high-entropy alloys

    Energy efficiency is motivating the search for new high-temperature (high-T) metals. Some new body-centered-cubic (BCC) random multicomponent “high-entropy alloys (HEAs)” based on refractory elements (Cr-Mo-Nb-Ta-V-W-Hf-Ti-Zr) possess exceptional strengths at high temperatures but the physical origins of this outstanding behavior are not known. Here we show, using integrated in-situ neutron-diffraction (ND), high-resolution transmission electron microscopy (HRTEM), and recent theory, that the high strength and strength retention of a NbTaTiV alloy and a high-strength/low-density CrMoNbV alloy are attributable to edge dislocations. This finding is surprising because plastic flows in BCC elemental metals and dilute alloys are generally controlled by screw dislocations. We use the insight and theory to perform a computationally-guided search over 10^7 BCC HEAs and identify over 10^6 possible ultra-strong high-T alloy compositions for future exploration.

    Beyond 5G URLLC Evolution: New Service Modes and Practical Considerations

    Ultra-reliable low latency communications (URLLC) arose to serve industrial IoT (IIoT) use cases within 5G. In its current form, it has inherent limitations in supporting future services. Based on state-of-the-art research and practical deployment experience, in this article, we introduce and advocate for three variants: broadband, scalable and extreme URLLC. We discuss use cases and key performance indicators and identify technology enablers for the new service modes. We bring practical considerations from the IIoT testbed and provide an outlook toward some new research directions. Comment: Submitted to IEEE Wireless Commun. Ma