Averaging Rate Scheduler for Decentralized Learning on Heterogeneous Data
State-of-the-art decentralized learning algorithms typically require the data
distribution to be Independent and Identically Distributed (IID). However, in
practical scenarios, the data distribution across the agents can have
significant heterogeneity. In this work, we propose averaging rate scheduling
as a simple yet effective way to reduce the impact of heterogeneity in
decentralized learning. Our experiments illustrate the superiority of the
proposed method (~3% improvement in test accuracy) compared to the conventional
approach of employing a constant averaging rate.
Comment: 9 pages, 3 figures, 4 tables. arXiv admin note: text overlap with
arXiv:2305.0479
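As a rough illustration of the idea, the sketch below runs gossip averaging in which each agent moves toward its neighborhood average by an averaging rate `gamma`, and `gamma` follows a schedule instead of staying constant. The linear ramp in `averaging_rate`, the uniform mixing weights, and all function names are assumptions made for illustration; the abstract does not say which schedule the authors use.

```python
def averaging_rate(step, total_steps, start=0.1, end=1.0):
    """Hypothetical schedule: linearly ramp the averaging rate from start to end."""
    frac = step / max(total_steps - 1, 1)
    return start + (end - start) * frac


def gossip_step(params, weights, gamma):
    """One gossip-averaging step on scalar parameters.

    params[i] is agent i's parameter; weights[i][j] is the mixing weight
    agent i assigns to agent j (each row sums to 1).
    """
    n = len(params)
    new_params = []
    for i in range(n):
        # Move toward the weighted neighborhood average by a fraction gamma.
        drift = sum(weights[i][j] * (params[j] - params[i]) for j in range(n))
        new_params.append(params[i] + gamma * drift)
    return new_params


# Three fully connected agents with uniform mixing weights (assumed topology).
W = [[1 / 3, 1 / 3, 1 / 3]] * 3
x = [0.0, 3.0, 6.0]
for t in range(5):
    x = gossip_step(x, W, averaging_rate(t, 5))
```

A constant averaging rate corresponds to passing the same `gamma` at every step; the schedule only changes how strongly agents pull toward consensus over the course of training.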
Neighborhood Gradient Clustering: An Efficient Decentralized Learning Method for Non-IID Data Distributions
Decentralized learning algorithms enable the training of deep learning models
over large distributed datasets generated at different devices and locations,
without the need for a central server. In practical scenarios, the distributed
datasets can have significantly different data distributions across the agents.
The current state-of-the-art decentralized algorithms mostly assume the data
distributions to be Independent and Identically Distributed (IID). This paper
focuses on improving decentralized learning over non-IID data distributions
with minimal compute and memory overheads. We propose Neighborhood Gradient
Clustering (NGC), a novel decentralized learning algorithm that modifies the
local gradients of each agent using self- and cross-gradient information. In
particular, the proposed method replaces the local gradients of the model with
the weighted mean of the self-gradients, model-variant cross-gradients
(derivatives of the received neighbors' model parameters with respect to the
local dataset), and data-variant cross-gradients (derivatives of the local
model with respect to its neighbors' datasets). Further, we present CompNGC, a
compressed version of NGC that reduces the communication overhead by
compressing the cross-gradients. We demonstrate the empirical
convergence and efficiency of the proposed technique over non-IID data
distributions sampled from the CIFAR-10 dataset on various model architectures
and graph topologies. Our experiments demonstrate that NGC and CompNGC
outperform the existing state-of-the-art (SoTA) decentralized learning
algorithm over non-IID data with significantly less compute and
memory requirements. Further, we also show that the proposed NGC method
outperforms the baseline with no additional communication.
Comment: 15 pages, 5 figures, 7 tables
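The gradient replacement described in this abstract can be sketched as a weighted mean over three groups of gradients. This is a minimal sketch, not the authors' implementation: the function name `ngc_style_gradient` is hypothetical, gradients are plain Python lists, and the uniform default weighting is an assumption, since the abstract does not state how the weights are chosen.

```python
def ngc_style_gradient(self_grad, model_variant, data_variant, weights=None):
    """Weighted mean of the self-gradient and both kinds of cross-gradients.

    self_grad:     gradient of the local model on the local data
    model_variant: gradients of neighbors' models on the local data
    data_variant:  gradients of the local model on neighbors' data
    """
    grads = [self_grad] + list(model_variant) + list(data_variant)
    if weights is None:
        weights = [1.0] * len(grads)  # assumed uniform weighting
    total = sum(weights)
    return [
        sum(w * g[k] for w, g in zip(weights, grads)) / total
        for k in range(len(self_grad))
    ]


# One neighbor, two-dimensional gradients (made-up numbers).
update = ngc_style_gradient([1.0, 0.0], [[0.0, 1.0]], [[2.0, 2.0]])
```

The local optimizer step would then use `update` in place of the raw local gradient.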
Homogenizing Non-IID datasets via In-Distribution Knowledge Distillation for Decentralized Learning
Decentralized learning enables serverless training of deep neural networks
(DNNs) in a distributed manner on multiple nodes. This allows for the use of
large datasets, as well as the ability to train with a wide variety of data
sources. However, one of the key challenges with decentralized learning is
heterogeneity in the data distribution across the nodes. In this paper, we
propose In-Distribution Knowledge Distillation (IDKD) to address the challenge
of heterogeneous data distribution. The goal of IDKD is to homogenize the data
distribution across the nodes. While such data homogenization can be achieved
by exchanging data among the nodes sacrificing privacy, IDKD achieves the same
objective using a common public dataset across nodes without breaking the
privacy constraint. This public dataset is different from the training dataset
and is used to distill the knowledge from each node and communicate it to its
neighbors through the generated labels. With traditional knowledge
distillation, the generalization of the distilled model is reduced because all
the public dataset samples are used irrespective of their similarity to the
local dataset. Thus, we introduce an Out-of-Distribution (OoD) detector at each
node to label a subset of the public dataset that maps close to the local
training data distribution. Finally, only labels corresponding to these subsets
are exchanged among the nodes and with appropriate label averaging each node is
finetuned on these data subsets along with its local data. Our experiments on
multiple image classification datasets and graph topologies show that the
proposed IDKD scheme is more effective than traditional knowledge distillation
and achieves state-of-the-art generalization performance on heterogeneously
distributed data with minimal communication overhead.
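The selection and label-averaging steps can be sketched as follows. Everything concrete here is an assumption for illustration: thresholding the maximum class probability stands in for the OoD detector, labels are averaged over the nodes that selected a given public sample, and the function names and numbers are hypothetical.

```python
def select_in_distribution(soft_labels, threshold=0.8):
    """Indices of public samples a node treats as in-distribution.

    Assumed stand-in for the OoD detector: keep a sample when its top
    class probability reaches the threshold.
    """
    return {i for i, probs in enumerate(soft_labels) if max(probs) >= threshold}


def average_labels(labels_per_node, selected_per_node):
    """Average soft labels per public sample over the nodes that selected it."""
    merged = {}
    for labels, selected in zip(labels_per_node, selected_per_node):
        for i in selected:
            merged.setdefault(i, []).append(labels[i])
    return {
        i: [sum(col) / len(col) for col in zip(*rows)]
        for i, rows in merged.items()
    }


# Soft labels from two nodes on three public samples (made-up numbers).
node_a = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]
node_b = [[0.7, 0.3], [0.6, 0.4], [0.1, 0.9]]
selected = [select_in_distribution(node_a), select_in_distribution(node_b)]
distilled = average_labels([node_a, node_b], selected)
```

Each node would then be fine-tuned on the samples in `distilled` together with its local data; only labels for the selected subsets are exchanged, which is what keeps the communication overhead low.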
CoDeC: Communication-Efficient Decentralized Continual Learning
Training at the edge utilizes continuously evolving data generated at
different locations. Privacy concerns prohibit the co-location of this
spatially as well as temporally distributed data, making it crucial to design
training algorithms that enable efficient continual learning over decentralized
private data. Decentralized learning allows serverless training with spatially
distributed data. A fundamental barrier in such distributed learning is the
high bandwidth cost of communicating model updates between agents. Moreover,
existing works under this training paradigm are not inherently suitable for
learning a temporal sequence of tasks while retaining the previously acquired
knowledge. In this work, we propose CoDeC, a novel communication-efficient
decentralized continual learning algorithm which addresses these challenges. We
mitigate catastrophic forgetting while learning a task sequence in a
decentralized learning setup by combining orthogonal gradient projection with
gossip averaging across decentralized agents. Further, CoDeC includes a novel
lossless communication compression scheme based on the gradient subspaces. We
express layer-wise gradients as a linear combination of the basis vectors of
these gradient subspaces and communicate the associated coefficients. We
theoretically analyze the convergence rate for our algorithm and demonstrate
through an extensive set of experiments that CoDeC successfully learns
distributed continual tasks with minimal forgetting. The proposed compression
scheme reduces communication costs by up to 4.8x while matching the
performance of the full-communication baseline.
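The coefficient-based compression can be illustrated with a small sketch, assuming an orthonormal basis for the gradient subspace and a gradient that already lies in its span (which is what makes the round trip lossless). The helper names and numbers are hypothetical and are not taken from the paper.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def encode(gradient, basis):
    """Coefficients of the gradient with respect to an orthonormal basis."""
    return [dot(gradient, b) for b in basis]


def decode(coeffs, basis):
    """Rebuild the gradient as a linear combination of the basis vectors."""
    dim = len(basis[0])
    return [sum(c * b[k] for c, b in zip(coeffs, basis)) for k in range(dim)]


# Orthonormal basis of a 2-D subspace of a 4-D gradient space (illustrative).
basis = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.6, 0.8, 0.0]]
gradient = [2.0, 1.5, 2.0, 0.0]     # equals 2.0 * basis[0] + 2.5 * basis[1]
coeffs = encode(gradient, basis)    # 2 numbers sent instead of 4
restored = decode(coeffs, basis)
```

In this toy case the sender transmits two coefficients instead of four gradient entries, and the receiver, holding the same basis, recovers the gradient up to floating-point rounding.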
Computational Fluid Dynamic study on the effect of near gravity material on dense medium cyclone treating coal using Discrete Phase Model and Algebraic Slip mixture multiphase model
This paper studies the effect of near gravity material at the desired separation density during coal washing. Dense medium separation of coal particles in the presence of a high percentage of near gravity material is believed to result in significant misplacement of coal particles to the wrong products. However, the performance of a dense medium cyclone depends not merely on the total amount of near gravity material but also on its distribution and quality. This paper deals with the numerical simulation of magnetite medium segregation and coal partitioning in a 350 mm dense medium cyclone. The Volume of Fluid model coupled with the Reynolds Stress Model is used to resolve the two-phase air core and turbulence. The Algebraic Slip mixture multiphase model with the granular options is used to predict magnetite medium segregation. Medium segregation results are validated against Gamma Ray Tomography measurements. Further, the Discrete Phase Model is used to track the coal particles and to estimate the Residence Time Distribution of coal particles of different sizes and densities. Additionally, the Algebraic Slip mixture model is used to simulate magnetite and coal particle segregation at different near gravity material proportions. Discrepancies in coal particle behaviour at different near gravity material contents are explained using the locus of zero vertical velocity, mixture density, and coal volume fractions.
Soil Biological Activity Contributing to Phosphorus Availability in Vertisols under Long-Term Organic and Conventional Agricultural Management
Mobilization of unavailable phosphorus (P) to plant-available P is a prerequisite to sustain crop productivity. Although most agricultural soils have sufficient amounts of phosphorus, low availability of native soil P remains a key limiting factor to increasing crop productivity. Solubilization and mineralization of applied and native P to plant-available form is mediated through a number of biological and biochemical processes that are strongly influenced by soil carbon/organic matter, besides other biotic and abiotic factors. Soils rich in organic matter are expected to have higher P availability, potentially due to higher biological activity. In conventional agricultural systems, mineral fertilizers are used to supply P for plant growth, whereas organic systems largely rely on inputs of organic origin. Soils under organic management are supposed to be biologically more active and thus possess a higher capability to mobilize native or applied P. In this study we compared biological activity in the soil of a long-term farming systems comparison field trial in vertisols under a subtropical (semi-arid) environment. Soil samples were collected from plots under 7 years of organic and conventional management at five different time points in a soybean (Glycine max)-wheat (Triticum aestivum) crop sequence, including the crop growth stages of reproductive significance. Upon analysis of various soil biological properties such as dehydrogenase, β-glucosidase, acid and alkaline phosphatase activities, microbial respiration, substrate induced respiration, and soil microbial biomass carbon, organically managed soils were found to be biologically more active, particularly at the R2 stage in soybean and the panicle initiation stage in wheat. We also determined the synergies between these biological parameters using principal component analysis. At all sampling points, P availability in organic and conventional systems was comparable. Our findings clearly indicate that, owing to higher biological activity, organic systems are as capable of supplying P for crop growth as conventional systems with inputs of mineral P fertilizers.
MULTIDRUG RESISTANT TUBERCULOSIS IN CHILDREN IN THE DEMOCRATIC REPUBLIC OF CONGO: FIRST EXPERIENCE WITH A SHORT TREATMENT COURSE IN A UNIVERSITY HOSPITAL
Background: A short treatment course for multidrug-resistant tuberculosis (MDR-TB) is not yet well codified in children in the Democratic Republic of Congo (DRC). The objective of this study was to evaluate a short MDR-TB treatment course in children. Methods: A prospective study was conducted at the University Clinics of Kinshasa from April 2015 (start of inclusion) through April 2017, with the last treatment initiation in April 2016. Enrolled children were aged 0 to 15 years. Treatment generally lasted 9 months, with 4 months of intensive phase treatment with Kanamycin, Levofloxacin, Isoniazid, Pyrazinamide, Prothionamide, Clofazimine and Ethambutol, and 5 months of continuation phase treatment with Levofloxacin, Pyrazinamide, Clofazimine and Ethambutol. Frequencies were reported for significant results. Results: A total of 21 children had MDR-TB (11 males and 10 females). Fifteen (71.43%) were bacteriologically confirmed cases (by Xpert/MTB), and 6 (28.57%) were clinically diagnosed (MDR-TB contact with suggestive radiologic lesions); 2 patients were coinfected with HIV, 15 cases had pulmonary TB, and 6 had extrapulmonary TB. The main radiologic findings included TB cavity (3 cases), pleural effusion (5 cases), alveolar syndrome (8 cases), adenopathy (7 cases), and interstitial infiltration, fibrosis and miliary (2 cases each). The Ziehl control was negative before 4 months of treatment in the majority of the cases. Overall, 11 patients were cured, 7 completed the treatment, 2 died and 1 was lost to follow-up. Conclusions: MDR-TB remains a challenge in children. A more comfortable, short treatment course is feasible in children in the DRC. It is necessary to verify this observation with a larger cohort of pediatric MDR-TB patients. Keywords: Multidrug-resistant tuberculosis; children; short treatment course; Africa; Kinshasa; treatment outcomes