46 research outputs found

    Averaging Rate Scheduler for Decentralized Learning on Heterogeneous Data

    State-of-the-art decentralized learning algorithms typically require the data distribution to be Independent and Identically Distributed (IID). However, in practical scenarios, the data distribution across the agents can have significant heterogeneity. In this work, we propose averaging rate scheduling as a simple yet effective way to reduce the impact of heterogeneity in decentralized learning. Our experiments illustrate the superiority of the proposed method (~3% improvement in test accuracy) compared to the conventional approach of employing a constant averaging rate. Comment: 9 pages, 3 figures, 4 tables. arXiv admin note: text overlap with arXiv:2305.0479

    Neighborhood Gradient Clustering: An Efficient Decentralized Learning Method for Non-IID Data Distributions

    Decentralized learning algorithms enable the training of deep learning models over large distributed datasets generated at different devices and locations, without the need for a central server. In practical scenarios, the distributed datasets can have significantly different data distributions across the agents. The current state-of-the-art decentralized algorithms mostly assume the data distributions to be Independent and Identically Distributed (IID). This paper focuses on improving decentralized learning over non-IID data distributions with minimal compute and memory overheads. We propose Neighborhood Gradient Clustering (NGC), a novel decentralized learning algorithm that modifies the local gradients of each agent using self- and cross-gradient information. In particular, the proposed method replaces the local gradients of the model with the weighted mean of the self-gradients, model-variant cross-gradients (derivatives of the received neighbors' model parameters with respect to the local dataset), and data-variant cross-gradients (derivatives of the local model with respect to its neighbors' datasets). Further, we present CompNGC, a compressed version of NGC that reduces the communication overhead by 32× by compressing the cross-gradients. We demonstrate the empirical convergence and efficiency of the proposed technique over non-IID data distributions sampled from the CIFAR-10 dataset on various model architectures and graph topologies. Our experiments demonstrate that NGC and CompNGC outperform the existing state-of-the-art (SoTA) decentralized learning algorithm over non-IID data by 1-5% with significantly less compute and memory requirements. Further, we also show that the proposed NGC method outperforms the baseline by 5-40% with no additional communication. Comment: 15 pages, 5 figures, 7 tables

    Homogenizing Non-IID datasets via In-Distribution Knowledge Distillation for Decentralized Learning

    Decentralized learning enables serverless training of deep neural networks (DNNs) in a distributed manner on multiple nodes. This allows for the use of large datasets, as well as the ability to train with a wide variety of data sources. However, one of the key challenges with decentralized learning is heterogeneity in the data distribution across the nodes. In this paper, we propose In-Distribution Knowledge Distillation (IDKD) to address the challenge of heterogeneous data distribution. The goal of IDKD is to homogenize the data distribution across the nodes. While such data homogenization can be achieved by exchanging data among the nodes sacrificing privacy, IDKD achieves the same objective using a common public dataset across nodes without breaking the privacy constraint. This public dataset is different from the training dataset and is used to distill the knowledge from each node and communicate it to its neighbors through the generated labels. With traditional knowledge distillation, the generalization of the distilled model is reduced because all the public dataset samples are used irrespective of their similarity to the local dataset. Thus, we introduce an Out-of-Distribution (OoD) detector at each node to label a subset of the public dataset that maps close to the local training data distribution. Finally, only labels corresponding to these subsets are exchanged among the nodes and with appropriate label averaging each node is finetuned on these data subsets along with its local data. Our experiments on multiple image classification datasets and graph topologies show that the proposed IDKD scheme is more effective than traditional knowledge distillation and achieves state-of-the-art generalization performance on heterogeneously distributed data with minimal communication overhead

    CoDeC: Communication-Efficient Decentralized Continual Learning

    Training at the edge utilizes continuously evolving data generated at different locations. Privacy concerns prohibit the co-location of this spatially as well as temporally distributed data, deeming it crucial to design training algorithms that enable efficient continual learning over decentralized private data. Decentralized learning allows serverless training with spatially distributed data. A fundamental barrier in such distributed learning is the high bandwidth cost of communicating model updates between agents. Moreover, existing works under this training paradigm are not inherently suitable for learning a temporal sequence of tasks while retaining the previously acquired knowledge. In this work, we propose CoDeC, a novel communication-efficient decentralized continual learning algorithm which addresses these challenges. We mitigate catastrophic forgetting while learning a task sequence in a decentralized learning setup by combining orthogonal gradient projection with gossip averaging across decentralized agents. Further, CoDeC includes a novel lossless communication compression scheme based on the gradient subspaces. We express layer-wise gradients as a linear combination of the basis vectors of these gradient subspaces and communicate the associated coefficients. We theoretically analyze the convergence rate for our algorithm and demonstrate through an extensive set of experiments that CoDeC successfully learns distributed continual tasks with minimal forgetting. The proposed compression scheme results in up to 4.8x reduction in communication costs with iso-performance as the full communication baseline

    Computational Fluid Dynamic study on the effect of near gravity material on dense medium cyclone treating coal using Discrete Phase Model and Algebraic Slip mixture multiphase model

    In this paper, the effect of near gravity material at the desired separation density during coal washing is studied. It is believed that Dense Medium Separation of coal particles in the presence of a high percentage of near gravity material results in significant misplacement of coal particles to the wrong products. However, the performance of a dense medium cyclone depends not only on the total amount of near gravity material but also on its distribution and quality. This paper deals with the numerical simulation of magnetite medium segregation and coal partitioning in a 350 mm dense medium cyclone. The Volume of Fluid model coupled with the Reynolds Stress Model is used to resolve the two-phase air core and turbulence. The Algebraic Slip mixture multiphase model with the granular option is used to predict magnetite medium segregation. Medium segregation results are validated against Gamma Ray Tomography measurements. Further, the Discrete Phase Model is used to track the coal particles. Residence time distributions of coal particles of different sizes and densities are also estimated using the Discrete Phase Model. Additionally, the Algebraic Slip mixture model is used to simulate magnetite and coal particle segregation at different near gravity material proportions. Discrepancies in coal particle behaviour at different near gravity material contents are explained using the locus of zero vertical velocity, mixture density, and coal volume fractions

    Soil Biological Activity Contributing to Phosphorus Availability in Vertisols under Long-Term Organic and Conventional Agricultural Management

    Mobilization of unavailable phosphorus (P) to plant available P is a prerequisite to sustain crop productivity. Although most agricultural soils have sufficient amounts of phosphorus, low availability of native soil P remains a key limiting factor to increasing crop productivity. Solubilization and mineralization of applied and native P to plant available form is mediated through a number of biological and biochemical processes that are strongly influenced by soil carbon/organic matter, besides other biotic and abiotic factors. Soils rich in organic matter are expected to have higher P availability, potentially due to higher biological activity. In conventional agricultural systems mineral fertilizers are used to supply P for plant growth, whereas organic systems largely rely on inputs of organic origin. Soils under organic management are supposed to be biologically more active and thus possess a higher capability to mobilize native or applied P. In this study we compared biological activity in the soil of a long-term farming systems comparison field trial in vertisols under a subtropical (semi-arid) environment. Soil samples were collected from plots under 7 years of organic and conventional management at five different time points in a soybean (Glycine max)-wheat (Triticum aestivum) crop sequence, including the crop growth stages of reproductive significance. Upon analysis of various soil biological properties such as dehydrogenase, β-glucosidase, acid and alkaline phosphatase activities, microbial respiration, substrate induced respiration, and soil microbial biomass carbon, organically managed soils were found to be biologically more active, particularly at the R2 stage in soybean and the panicle initiation stage in wheat. We also determined the synergies between these biological parameters by using principal component analysis. At all sampling points, P availability in organic and conventional systems was comparable. Our findings clearly indicate that, owing to higher biological activity, organic systems are as capable of supplying P for crop growth as conventional systems with inputs of mineral P fertilizers

    MULTIDRUG RESISTANT TUBERCULOSIS IN CHILDREN IN THE DEMOCRATIC REPUBLIC OF CONGO: FIRST EXPERIENCE WITH A SHORT TREATMENT COURSE IN A UNIVERSITY HOSPITAL

    Background: A short treatment course for multidrug-resistant tuberculosis (MDR-TB) is not yet well codified in children in the Democratic Republic of Congo (DRC). The objective of this study was to evaluate a short MDR-TB treatment course in children. Methods: A prospective study was performed from April 2015 (corresponding to the inclusion) through April 2017 (the latest initiation time point was April 2016) in the University Clinics of Kinshasa. Enrolled children were aged 0 to 15 years. Treatment generally lasted 9 months, with 4 months of intensive phase treatment with Kanamycin, Levofloxacin, Isoniazid, Pyrazinamide, Prothionamide, Clofazimine and Ethambutol, and 5 months of continuation phase treatment with Levofloxacin, Pyrazinamide, Clofazimine and Ethambutol. Frequencies were reported for significant results. Results: A total of 21 children had MDR-TB (11 males and 10 females). Fifteen (71.43%) were bacteriologically confirmed cases (by Xpert/MTB), and 6 (28.57%) were clinically diagnosed (MDR-TB contact with suggestive radiologic lesions); 2 patients were coinfected with HIV, 15 cases had pulmonary TB, and 6 had extrapulmonary TB. The main radiologic findings included TB cavity (3 cases), pleural effusion (5 cases), alveolar syndrome (8 cases), adenopathy (7 cases), and interstitial infiltration, fibrosis and miliary (2 cases each). The Ziehl control was negative before 4 months of treatment in the majority of the cases. Overall, 11 patients were cured, 7 completed the treatment, 2 died and 1 was lost to follow-up. Conclusions: MDR-TB remains a challenge in children. A more comfortable, short treatment course is feasible in children in the DRC. It is necessary to verify this observation with a larger cohort of MDR-TB patients in pediatrics. Keywords: Multidrug-resistant tuberculosis; children; short treatment course; Africa; Kinshasa; treatment outcomes