
    Unsupervised Multi-task and Transfer Learning on Gaussian Mixture Models

    Unsupervised learning has been widely used in many real-world applications. One of the simplest and most important unsupervised learning models is the Gaussian mixture model (GMM). In this work, we study the multi-task learning problem on GMMs, which aims to leverage potentially similar GMM parameter structures among tasks to obtain improved learning performance compared to single-task learning. We propose a multi-task GMM learning procedure based on the EM algorithm that not only effectively utilizes unknown similarity between related tasks but is also robust against a fraction of outlier tasks from arbitrary sources. The proposed procedure is shown to achieve the minimax optimal rate of convergence for both the parameter estimation error and the excess mis-clustering error, in a wide range of regimes. Moreover, we generalize our approach to tackle the problem of transfer learning for GMMs, where similar theoretical results are derived. Finally, we demonstrate the effectiveness of our methods through simulations and a real data analysis. To the best of our knowledge, this is the first work studying multi-task and transfer learning on GMMs with theoretical guarantees.
    Comment: 149 pages, 7 figures, 2 tables
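    As background for the procedure the abstract describes, here is a minimal sketch of the classical single-task EM algorithm it builds on, for a two-component 1-D GMM with known unit variance and equal weights (a common setting in the GMM theory literature). This is illustrative only, not the paper's multi-task procedure; all names and settings are assumptions.

```python
import numpy as np

def em_gmm_1d(x, n_iter=50):
    """Single-task EM for a two-component 1-D Gaussian mixture with
    equal weights and known unit variance. Returns sorted mean estimates."""
    mu = np.array([x.min(), x.max()])  # well-separated initialization
    for _ in range(n_iter):
        # E-step: posterior responsibility of component 0 for each point
        log_p0 = -0.5 * (x - mu[0]) ** 2
        log_p1 = -0.5 * (x - mu[1]) ** 2
        r0 = 1.0 / (1.0 + np.exp(log_p1 - log_p0))
        # M-step: responsibility-weighted means
        mu = np.array([np.sum(r0 * x) / np.sum(r0),
                       np.sum((1 - r0) * x) / np.sum(1 - r0)])
    return np.sort(mu)

# Simulated data: equal-weight mixture of N(-2, 1) and N(+2, 1)
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-2, 1, 500), rng.normal(2, 1, 500)])
mu_hat = em_gmm_1d(x)
print(mu_hat)  # estimates close to [-2, 2]
```

    The multi-task procedure in the paper additionally pools information across tasks inside each EM iteration; the sketch above is the per-task building block.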

    Learning from Similar Linear Representations: Adaptivity, Minimaxity, and Robustness

    Representation multi-task learning (MTL) and transfer learning (TL) have achieved tremendous success in practice. However, the theoretical understanding of these methods is still lacking. Most existing theoretical works focus on cases where all tasks share the same representation, and claim that MTL and TL almost always improve performance. However, as the number of tasks grows, assuming all tasks share the same representation is unrealistic. Also, this does not always match empirical findings, which suggest that a shared representation may not necessarily improve single-task or target-only learning performance. In this paper, we aim to understand how to learn from tasks with similar but not exactly the same linear representations, while dealing with outlier tasks. We propose two algorithms that are adaptive to the similarity structure and robust to outlier tasks under both MTL and TL settings. Our algorithms outperform single-task or target-only learning when representations across tasks are sufficiently similar and the fraction of outlier tasks is small. Furthermore, they always perform no worse than single-task learning or target-only learning, even when the representations are dissimilar. We provide information-theoretic lower bounds to show that our algorithms are nearly minimax optimal in a large regime.
    Comment: 60 pages, 5 figures
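    To illustrate why pooling tasks with a shared linear representation helps, here is a hedged numerical sketch, not the paper's algorithm: tasks are generated with coefficients lying in a common low-dimensional subspace, and stacking single-task OLS estimates and taking their top singular vectors recovers that subspace. All dimensions and the SVD-pooling step are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, T, n = 20, 3, 10, 200           # ambient dim, rank, tasks, samples/task
A = np.linalg.qr(rng.normal(size=(d, r)))[0]  # shared representation (d x r)

ests = []
for t in range(T):
    beta = A @ rng.normal(size=r)     # task coefficients in the shared subspace
    X = rng.normal(size=(n, d))
    y = X @ beta + 0.5 * rng.normal(size=n)
    ests.append(np.linalg.lstsq(X, y, rcond=None)[0])  # single-task OLS

# Pool tasks: top-r left singular vectors of the stacked estimates
# approximate the shared subspace better than any single task alone.
U = np.linalg.svd(np.column_stack(ests))[0][:, :r]

# Subspace alignment with the true A (1.0 = perfect recovery)
align = np.linalg.norm(U.T @ A, ord='fro') ** 2 / r
print(round(align, 3))  # close to 1
```

    The paper's setting is harder: representations are only approximately shared and some tasks are outliers, which is why its algorithms must be adaptive and robust rather than naively pooling as above.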

    Unsupervised Federated Learning: A Federated Gradient EM Algorithm for Heterogeneous Mixture Models with Robustness against Adversarial Attacks

    While supervised federated learning approaches have enjoyed significant success, the domain of unsupervised federated learning remains relatively underexplored. In this paper, we introduce a novel federated gradient EM algorithm designed for the unsupervised learning of mixture models with heterogeneous mixture proportions across tasks. We begin with a comprehensive finite-sample theory that holds for general mixture models, then apply this general theory on Gaussian Mixture Models (GMMs) and Mixture of Regressions (MoRs) to characterize the explicit estimation error of model parameters and mixture proportions. Our proposed federated gradient EM algorithm demonstrates several key advantages: adaptability to unknown task similarity, resilience against adversarial attacks on a small fraction of data sources, protection of local data privacy, and computational and communication efficiency.Comment: 43 pages, 1 figur
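    The core idea of gradient EM in a federated setting can be sketched as follows: each client runs a local E-step on its own data and sends back only the gradient of its local Q-function; the server averages the gradients and takes one step. The sketch below uses a symmetric two-component 1-D GMM with equal weights, which is a deliberate simplification (the paper's setting allows heterogeneous mixture proportions and adversarial clients); all names and settings are illustrative.

```python
import numpy as np

def local_grad(x, mu):
    """One client's contribution: E-step on local data, then the gradient
    of the local Q-function for a symmetric mixture N(+mu,1) / N(-mu,1)
    with equal weights (a simplification of the paper's setting)."""
    r = 1.0 / (1.0 + np.exp(-2.0 * mu * x))  # responsibility toward +mu
    return np.mean(r * (x - mu) - (1 - r) * (x + mu))

def federated_gradient_em(clients, mu0=0.5, lr=0.5, n_rounds=50):
    """Each round: clients send gradients, server averages and steps.
    Raw data never leaves a client; only scalar gradients do."""
    mu = mu0
    for _ in range(n_rounds):
        mu += lr * np.mean([local_grad(x, mu) for x in clients])
    return mu

rng = np.random.default_rng(0)
# Five clients, each holding samples from the same +/-2 mixture
clients = [np.concatenate([rng.normal(2, 1, 100), rng.normal(-2, 1, 100)])
           for _ in range(5)]
mu_hat = federated_gradient_em(clients)
print(round(mu_hat, 2))  # close to the true mean magnitude 2
```

    Communicating gradients rather than raw data is what gives the approach its privacy and communication-efficiency properties; robustness in the paper additionally replaces the plain average with a robust aggregation step.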

    Effect of Samarium doping on the nucleation of fcc-Aluminum in undercooled liquids

    The effect of Sm doping on fcc-Al nucleation was investigated in Al-Sm liquids with low Sm concentrations (xSm) using molecular dynamics simulations. Nucleation in the moderately undercooled liquid is achieved by the recently developed persistent-embryo method. Systematically computing the nucleation rate at different xSm (xSm = 0%, 1%, 2%, 3%, 5%) at 700 K, we found that Sm doping reduces the nucleation rate by up to 25 orders of magnitude at only 5% doping concentration. This effect is mostly associated with the increase in the free-energy barrier, with a minor contribution from the suppression of attachment to the nucleus caused by Sm doping.
    Comment: 4 figures
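    The 25-orders-of-magnitude figure can be related to the barrier through the classical nucleation theory rate expression, J = A exp(-ΔG*/kBT). Assuming a fixed kinetic prefactor A (a simplification, since the abstract notes that suppressed attachment also contributes), the implied barrier increase works out as:

```python
import math

# Classical nucleation theory: J = A * exp(-dG / (kB * T)).
# A 25-orders-of-magnitude drop in J at a fixed prefactor A
# implies the barrier grew by 25 * ln(10) in units of kB*T.
delta_kT = 25 * math.log(10)

# Converted to eV at the simulated temperature of 700 K
kB_eV = 8.617e-5            # Boltzmann constant, eV/K
delta_eV = delta_kT * kB_eV * 700

print(round(delta_kT, 1), round(delta_eV, 2))  # 57.6 kB*T, 3.47 eV
```

    So, under the fixed-prefactor assumption, 5% Sm doping raises the effective nucleation barrier by roughly 57.6 kBT (about 3.5 eV at 700 K).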

    Fair or Not: Effects of Gamification Elements on Crowdsourcing Participation

    Fairness perceptions have been found to be a critical driving factor for solvers’ engagement in crowdsourcing. However, the literature offers little guidance on how to design crowdsourcing platforms that enhance solvers’ fairness perceptions. By integrating organizational justice theory with the gamification literature, we conceptualize solvers’ perceptions of two typical gamification elements: the point-rewarding perception and the feedback-giving perception. We develop a model to explain the effects of these gamification perceptions on both distributive and interpersonal justice perceptions, which are conducive to solvers’ participation. Based on a survey of 295 solvers, we apply the partial least squares-structural equation modeling (PLS-SEM) approach to test the research model. Results show that both the point-rewarding perception and the feedback-giving perception enhance the distributive and interpersonal justice perceptions which, in turn, foster solvers’ crowdsourcing participation. Theoretical contributions and practical implications are discussed.