266 research outputs found

    Byzantine-Resilient Federated PCA and Low Rank Matrix Recovery

    Full text link
    In this work we consider the problem of estimating the principal subspace (span of the top r singular vectors) of a symmetric matrix in a federated setting, when each node has access to estimates of this matrix. We study how to make this problem Byzantine resilient. We introduce a novel provably Byzantine-resilient, communication-efficient, and private algorithm, called Subspace-Median, to solve it. We also study the most natural solution for this problem, a geometric median based modification of the federated power method, and explain why it is not useful. We consider two special cases of the resilient subspace estimation meta-problem - federated principal components analysis (PCA) and the spectral initialization step of horizontally federated low rank column-wise sensing (LRCCS) in this work. For both these problems we show how Subspace Median provides a resilient solution that is also communication-efficient. Median of Means extensions are developed for both problems. Extensive simulation experiments are used to corroborate our theoretical guarantees. Our second contribution is a complete AltGDmin based algorithm for Byzantine-resilient horizontally federated LRCCS and guarantees for it. We do this by developing a geometric median of means estimator for aggregating the partial gradients computed at each node, and using Subspace Median for initialization

    Flag Aggregator: Scalable Distributed Training under Failures and Augmented Losses using Convex Optimization

    Full text link
    Modern ML applications increasingly rely on complex deep learning models and large datasets. There has been an exponential growth in the amount of computation needed to train the largest models. Therefore, to scale computation and data, these models are inevitably trained in a distributed manner in clusters of nodes, and their updates are aggregated before being applied to the model. However, a distributed setup is prone to Byzantine failures of individual nodes, components, and software. With data augmentation added to these settings, there is a critical need for robust and efficient aggregation systems. We define the quality of workers as reconstruction ratios ∈(0,1]\in (0,1], and formulate aggregation as a Maximum Likelihood Estimation procedure using Beta densities. We show that the Regularized form of log-likelihood wrt subspace can be approximately solved using iterative least squares solver, and provide convergence guarantees using recent Convex Optimization landscape results. Our empirical findings demonstrate that our approach significantly enhances the robustness of state-of-the-art Byzantine resilient aggregators. We evaluate our method in a distributed setup with a parameter server, and show simultaneous improvements in communication efficiency and accuracy across various tasks. The code is publicly available at https://github.com/hamidralmasi/FlagAggregato

    Federated Over-Air Subspace Tracking from Incomplete and Corrupted Data

    Get PDF
    Subspace tracking (ST) with missing data (ST-miss) or outliers (Robust ST) or both (Robust ST-miss) has been extensively studied in the last many years. This work provides a new simple algorithm and guarantee for both ST with missing data (ST-miss) and RST-miss. Unlike past work on this topic, the algorithm is much simpler (uses fewer parameters) and the guarantee does not make the artificial assumption of piecewise constant subspace change, although it still handles that setting. Secondly, we extend our approach and its analysis to provably solving these problems when the raw data is federated and when the over-air data communication modality is used for information exchange between the KK peer nodes and the center.Comment: New model, algorithm for centralized case; added algorithms to deal with sparse outliers; modified organization significantl
    • …
    corecore