4 research outputs found

    FedMGDA+: Federated Learning meets Multi-objective Optimization

    Full text link
    Federated learning has emerged as a promising, massively distributed way to train a joint deep model over a large number of edge devices while keeping private user data strictly on device. In this work, motivated by ensuring fairness among users and robustness against malicious adversaries, we formulate federated learning as multi-objective optimization and propose a new algorithm, FedMGDA+, that is guaranteed to converge to Pareto stationary solutions. FedMGDA+ is simple to implement, has fewer hyperparameters to tune, and refrains from sacrificing the performance of any participating user. We establish the convergence properties of FedMGDA+ and point out its connections to existing approaches. Extensive experiments on a variety of datasets confirm that FedMGDA+ compares favorably against the state of the art.
    Comment: 26 pages, 9 figures; initial draft, comments welcome
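
    The multi-objective step described above can be illustrated with an MGDA-style computation: find the minimum-norm convex combination of the per-user gradients and descend along it, so that, to first order, no user's loss increases. The sketch below is a minimal illustration under assumed details (NumPy, a Frank-Wolfe solver for the minimum-norm problem, and hypothetical names such as min_norm_weights); it is not the authors' implementation of FedMGDA+.

        import numpy as np

        def min_norm_weights(grads, iters=100):
            """Approximate argmin over the simplex of || sum_i lam_i g_i ||^2
            via Frank-Wolfe; grads is an (n_users, dim) array of per-user gradients."""
            n = grads.shape[0]
            lam = np.full(n, 1.0 / n)                 # start from uniform weights
            G = grads @ grads.T                       # Gram matrix of the gradients
            for t in range(iters):
                grad_lam = G @ lam                    # gradient of the quadratic in lam
                s = np.zeros(n)
                s[np.argmin(grad_lam)] = 1.0          # linear-minimization oracle on the simplex
                step = 2.0 / (t + 2.0)                # standard Frank-Wolfe step size
                lam = (1.0 - step) * lam + step * s
            return lam

        def common_descent_direction(grads):
            """Direction d = -sum_i lam_i g_i; to first order it decreases (or leaves
            unchanged) every user's loss, which is the fairness property MGDA exploits."""
            # Normalizing per-user gradients (an assumption in this sketch) keeps
            # users with large gradients from dominating the combination.
            g = grads / (np.linalg.norm(grads, axis=1, keepdims=True) + 1e-12)
            lam = min_norm_weights(g)
            return -(lam[:, None] * g).sum(axis=0)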

    Efficient inference and sampling methods for probabilistic models

    No full text
    Probabilistic modeling is the main practical approach for designing systems that learn from data, and has numerous applications in different fields of science and engineering. Given a probabilistic model of a system and its associated parameters, general queries of interest about the system can be answered by performing inference on the distribution. Considering the importance of inference methods for machine learning, our goal in this dissertation is to develop efficient approximate inference methods for some practical problems. For this purpose, the following two problems are considered.

    First, we consider the problem of non-intrusive home-appliance load monitoring, which can be modeled as a factorial hidden Markov model (FHMM). FHMMs have applications in several fields such as voice recognition, home-appliance monitoring, computer vision, and fault detection. By employing FHMMs, the load-disaggregation problem becomes an FHMM inference problem described by a non-convex quadratic integer optimization problem. In this thesis we present a computationally efficient method that finds an approximate solution to this problem by combining convex semidefinite relaxations, randomized rounding, and a scalable ADMM method. Simulation results on both synthetic and real-world datasets demonstrate the superiority of our method over the previous state-of-the-art solutions.

    Second, Markov chain Monte Carlo (MCMC) methods are widely used in machine learning. A major difficulty with MCMC is designing chains that mix fast over the whole state space, and in particular selecting the parameters of an MCMC algorithm. In this thesis we take a different approach: similarly to parallel MCMC methods, instead of trying to find a single chain that samples from the whole distribution, we combine samples from several chains run in parallel, each exploring only part of the state space. The chains are prioritized based on the kernel Stein discrepancy, and their samples are combined using a novel technique for estimating the probability of different regions of the sample space. Experimental results demonstrate that the proposed algorithm provides significant speed-ups on a range of sampling problems.
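
    The chain-prioritization step mentioned above can be sketched with a standard kernel Stein discrepancy (KSD) computation: given a chain's samples and the score function (gradient of the log-density) of the target, a small KSD indicates that the samples track the target well, so such chains receive higher priority. The snippet below is a generic illustration under an RBF kernel; the function ksd_rbf, the bandwidth h, and the toy Gaussian target are assumptions made for illustration, not the dissertation's actual procedure.

        import numpy as np

        def ksd_rbf(samples, score_fn, h=1.0):
            """Squared kernel Stein discrepancy of `samples` against a target whose
            score (gradient of the log-density) is `score_fn`, using an RBF kernel
            with bandwidth h. Smaller values indicate closer agreement with the target."""
            X = np.atleast_2d(np.asarray(samples, dtype=float))      # (n, d) samples
            n, d = X.shape
            S = np.stack([score_fn(x) for x in X])                   # (n, d) target scores
            diff = X[:, None, :] - X[None, :, :]                     # pairwise x_i - x_j
            sq = (diff ** 2).sum(-1)                                 # squared distances
            K = np.exp(-sq / (2.0 * h ** 2))                         # RBF kernel matrix
            # Stein kernel k_p(x_i, x_j): couples the base kernel with the target score.
            t1 = (S @ S.T) * K
            t2 = np.einsum('id,ijd->ij', S, diff) * K / h ** 2       # s(x_i)^T grad_y k
            t3 = -np.einsum('jd,ijd->ij', S, diff) * K / h ** 2      # s(x_j)^T grad_x k
            t4 = (d / h ** 2 - sq / h ** 4) * K                      # trace of grad_x grad_y k
            return (t1 + t2 + t3 + t4).mean()

        # Toy usage: rank two hypothetical chains targeting a standard normal
        # (whose score is -x); the chain with the smaller KSD gets higher priority.
        score = lambda x: -x
        chains = [np.random.randn(200, 2), np.random.randn(200, 2) + 1.5]
        order = sorted(range(len(chains)), key=lambda i: ksd_rbf(chains[i], score))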