112 research outputs found

    Sparse-SignSGD with Majority Vote for Communication-Efficient Distributed Learning

    The training efficiency of complex deep learning models can be significantly improved through the use of distributed optimization. However, this process is often hindered by a large amount of communication cost between workers and a parameter server during iterations. To address this bottleneck, in this paper, we present a new communication-efficient algorithm that offers the synergistic benefits of both sparsification and sign quantization, called ${\sf S}^3$GD-MV. The workers in ${\sf S}^3$GD-MV select the top-$K$ magnitude components of their local gradient vector and only send the signs of these components to the server. The server then aggregates the signs and returns the results via a majority vote rule. Our analysis shows that, under certain mild conditions, ${\sf S}^3$GD-MV can converge at the same rate as signSGD while significantly reducing communication costs, if the sparsification parameter $K$ is properly chosen based on the number of workers and the size of the deep learning model. Experimental results using both independent and identically distributed (IID) and non-IID datasets demonstrate that ${\sf S}^3$GD-MV attains higher accuracy than signSGD while significantly reducing communication costs. These findings highlight the potential of ${\sf S}^3$GD-MV as a promising solution for communication-efficient distributed optimization in deep learning. Comment: 13 pages, 7 figures
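The round described in this abstract (top-$K$ selection, sign-only messages, majority vote at the server) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `sparse_sign`, `majority_vote`, and `step`, the tie handling (a tied or unvoted coordinate yields 0), and the plain fixed-step update with learning rate `lr` are all hypothetical choices made for the example.

```python
from typing import List


def sparse_sign(grad: List[float], k: int) -> List[int]:
    """Worker side: keep only the signs of the k largest-magnitude entries."""
    # Indices ordered by descending magnitude; sorted() is stable, so ties
    # keep the lower index first (an arbitrary choice for this sketch).
    top = sorted(range(len(grad)), key=lambda i: -abs(grad[i]))[:k]
    msg = [0] * len(grad)
    for i in top:
        msg[i] = (grad[i] > 0) - (grad[i] < 0)  # +1, -1, or 0
    return msg


def majority_vote(messages: List[List[int]]) -> List[int]:
    """Server side: coordinate-wise sign of the summed votes (0 on a tie)."""
    totals = [sum(column) for column in zip(*messages)]
    return [(t > 0) - (t < 0) for t in totals]


def step(params: List[float], worker_grads: List[List[float]],
         k: int, lr: float) -> List[float]:
    """One round: workers sparsify and sign, server votes, parameters move."""
    vote = majority_vote([sparse_sign(g, k) for g in worker_grads])
    return [p - lr * v for p, v in zip(params, vote)]


if __name__ == "__main__":
    grads = [[0.9, -0.1, 0.5], [0.8, 0.2, -0.7], [-0.3, 0.05, 0.6]]
    print(step([0.0, 0.0, 0.0], grads, k=2, lr=0.1))
```

In this toy run each worker transmits only two signs instead of three floating-point values, which is the source of the communication saving the abstract refers to.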

    The role of scoring in formative assessment of second language writing

    This study examines how scoring with feedback in formative assessment affects learning in an English as a foreign language (EFL) writing classroom. Two EFL writing classes were compared: in one class, teacher feedback was given to students on initial drafts, and scores were given only at the end of the semester; in the second class, teacher feedback and scores were given to students on each draft throughout the semester. This study adopted a mixed-methods approach, combining a statistical analysis, to explore whether teacher feedback accompanied by scoring makes a difference in student writing, with observation and interviews of focal students, to examine how feedback with scores affects students’ perceptions and attitudes towards writing. The results reveal that the scoring class wrote more accurately than the non-scoring class and that the focal students in the scoring class were not only more aware of both their own and their classmates’ performances, but also made efforts to emulate the students they considered effective writers. This study implies that scoring can fortify the effects of feedback by motivating high-achieving students to do their best in their writing assignments.

    Evaluation of transport policy and energy demand in Seoul metropolitan region using LEAP model

    Thesis (Master) -- KDI School: Master of Development Policy, 2015. Author: Chanho Park

    Posterior Distillation Sampling

    We introduce Posterior Distillation Sampling (PDS), a novel optimization method for parametric image editing based on diffusion models. Existing optimization-based methods, which leverage the powerful 2D prior of diffusion models to handle various parametric images, have mainly focused on generation. Unlike generation, editing requires a balance between conforming to the target attribute and preserving the identity of the source content. Recent 2D image editing methods have achieved this balance by leveraging the stochastic latent encoded in the generative process of diffusion models. To extend the editing capabilities of diffusion models shown in pixel space to parameter space, we reformulate the 2D image editing method into an optimization form named PDS. PDS matches the stochastic latents of the source and the target, enabling the sampling of targets in diverse parameter spaces that align with a desired attribute while maintaining the source's identity. We demonstrate that this optimization resembles running a generative process with the target attribute, but aligning this process with the trajectory of the source's generative process. Extensive editing results in Neural Radiance Fields and Scalable Vector Graphics representations demonstrate that PDS is capable of sampling targets to fulfill the aforementioned balance across various parameter spaces. Comment: Project page: https://posterior-distillation-sampling.github.io

    Bi-directional Contrastive Learning for Domain Adaptive Semantic Segmentation

    We present a novel unsupervised domain adaptation method for semantic segmentation that generalizes a model trained with source images and corresponding ground-truth labels to a target domain. A key to domain adaptive semantic segmentation is to learn domain-invariant and discriminative features without target ground-truth labels. To this end, we propose a bi-directional pixel-prototype contrastive learning framework that minimizes intra-class variations of features for the same object class, while maximizing inter-class variations for different ones, regardless of domains. Specifically, our framework aligns pixel-level features and a prototype of the same object class in target and source images (i.e., positive pairs), respectively, sets them apart for different classes (i.e., negative pairs), and performs the alignment and separation processes toward the other direction with pixel-level features in the source image and a prototype in the target image. The cross-domain matching encourages domain-invariant feature representations, while the bidirectional pixel-prototype correspondences aggregate features for the same object class, providing discriminative features. To establish training pairs for contrastive learning, we propose to generate dynamic pseudo labels of target images using a non-parametric label transfer, that is, pixel-prototype correspondences across different domains. We also present a calibration method compensating class-wise domain biases of prototypes gradually during training. Comment: Accepted to ECCV 202

    Learning to Discriminate Information for Online Action Detection

    From a streaming video, online action detection aims to identify actions in the present. For this task, previous methods use recurrent networks to model the temporal sequence of current action frames. However, these methods overlook the fact that an input image sequence includes background and irrelevant actions as well as the action of interest. For online action detection, in this paper, we propose a novel recurrent unit to explicitly discriminate the information relevant to an ongoing action from others. Our unit, named Information Discrimination Unit (IDU), decides whether to accumulate input information based on its relevance to the current action. This enables our recurrent network with IDU to learn a more discriminative representation for identifying ongoing actions. In experiments on two benchmark datasets, TVSeries and THUMOS-14, the proposed method outperforms state-of-the-art methods by a significant margin. Moreover, we demonstrate the effectiveness of our recurrent unit by conducting comprehensive ablation studies. Comment: To appear in CVPR 202

    A broadband X-ray study of the Rabbit pulsar wind nebula powered by PSR J1418-6058

    We report on broadband X-ray properties of the Rabbit pulsar wind nebula (PWN) associated with the pulsar PSR J1418-6058 using archival Chandra and XMM-Newton data, and a new NuSTAR observation. NuSTAR data above 10 keV allowed us to detect the 110-ms spin period of the pulsar, characterize its hard X-ray pulse profile, and resolve hard X-ray emission from the PWN after removing contamination from the pulsar and other overlapping point sources. The extended PWN was detected up to $\sim$20 keV and is well described by a power-law model with a photon index $\Gamma\approx2$. The PWN shape does not vary significantly with energy, and its X-ray spectrum shows no clear evidence of softening away from the pulsar. We modeled the spatial profile of X-ray spectra and broadband spectral energy distribution in the radio to TeV band to infer the physical properties of the PWN. We found that a model with low magnetic field strength ($B\sim 10$ $\mu$G) and efficient diffusion ($D\sim 10^{27}$ cm$^2$ s$^{-1}$) fits the PWN data well. The extended hard X-ray and TeV emission, associated respectively with synchrotron radiation and inverse Compton scattering by relativistic electrons, suggests that particles are accelerated to very high energies ($\gtrsim500$ TeV), indicating that the Rabbit PWN is a Galactic PeVatron candidate. Comment: 21 pages, 10 figures. ApJ accepted

    Data Diversification Analysis on Data Preprocessing

    A statistical analysis examining the diversity distribution that results from two different augmentation approaches. The first, the standard approach, is a baseline in which a random augmentation is applied to each sample in each epoch independently. The second, the random batch approach, is a new approach in which a random augmentation is applied to each tiny-batch in each epoch independently, and the assignment of samples to tiny-batches is random and independent across all epochs.
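The difference between the two approaches compared in this abstract can be sketched as below. This is only an illustrative reading of the abstract, not the study's code: the function names `standard_epoch` and `random_batch_epoch`, the uniform choice among augmentations, and the representation of augmentations as plain Python callables are assumptions made for the example.

```python
import random
from typing import Any, Callable, List, Sequence

Augmentation = Callable[[Any], Any]


def standard_epoch(samples: Sequence[Any], augmentations: Sequence[Augmentation],
                   rng: random.Random) -> List[Any]:
    """Standard approach: draw one augmentation independently per sample."""
    return [rng.choice(augmentations)(x) for x in samples]


def random_batch_epoch(samples: Sequence[Any], augmentations: Sequence[Augmentation],
                       batch_size: int, rng: random.Random) -> List[Any]:
    """Random batch approach: reshuffle samples into tiny-batches each epoch
    and draw one augmentation per tiny-batch, shared by all its members."""
    order = list(range(len(samples)))
    rng.shuffle(order)  # batch membership is re-randomized on every call
    out: List[Any] = [None] * len(samples)
    for start in range(0, len(order), batch_size):
        aug = rng.choice(augmentations)  # a single draw for the whole batch
        for i in order[start:start + batch_size]:
            out[i] = aug(samples[i])
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy "augmentations" that just tag each sample with a label.
    augs = [lambda x: (x, "flip"), lambda x: (x, "crop"), lambda x: (x, "blur")]
    data = list(range(8))
    print(standard_epoch(data, augs, rng))
    print(random_batch_epoch(data, augs, batch_size=4, rng=rng))
```

Calling either function once per epoch with the same `rng` gives fresh draws each time; in the random batch variant the reshuffle also changes which samples share an augmentation.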

    X-ray studies of the pulsar PSR J1420-6048 and its TeV pulsar wind nebula in the Kookaburra region

    We present a detailed analysis of broadband X-ray observations of the pulsar PSR J1420-6048 and its wind nebula (PWN) in the Kookaburra region with Chandra, XMM-Newton, and NuSTAR. Using the archival XMM-Newton and new NuSTAR data, we detected 68 ms pulsations of the pulsar and characterized its X-ray pulse profile which exhibits a sharp spike and a broad bump separated by ~0.5 in phase. A high-resolution Chandra image revealed a complex morphology of the PWN: a torus-jet structure, a few knots around the torus, one long (~7') and two short tails extending in the northwest direction, and a bright diffuse emission region to the south. Spatially integrated Chandra and NuSTAR spectra of the PWN out to 2.5' are well described by a power law model with a photon index $\Gamma \approx 2$. A spatially resolved spectroscopic study, as well as NuSTAR radial profiles of the 3--7 keV and 7--20 keV brightness, showed a hint of spectral softening with increasing distance from the pulsar. A multi-wavelength spectral energy distribution (SED) of the source was then obtained by supplementing our X-ray measurements with published radio, Fermi-LAT, and H.E.S.S. data. The SED and radial variations of the X-ray spectrum were fit with a leptonic multi-zone emission model. Our detailed study of the PWN may be suggestive of (1) particle transport dominated by advection, (2) a low magnetic-field strength ($B \sim 5$ $\mu$G), and (3) electron acceleration to ~PeV energies. Comment: 18 pages and 8 figures. Accepted for publication in Ap