Unsupervised Multi-task and Transfer Learning on Gaussian Mixture Models
Unsupervised learning has been widely used in many real-world applications.
One of the simplest and most important unsupervised learning models is the
Gaussian mixture model (GMM). In this work, we study the multi-task learning
problem on GMMs, which aims to leverage potentially similar GMM parameter
structures among tasks to obtain improved learning performance compared to
single-task learning. We propose a multi-task GMM learning procedure based on
the EM algorithm that can not only effectively utilize unknown similarity
between related tasks but also remain robust against a fraction of outlier tasks
from arbitrary sources. The proposed procedure is shown to achieve the minimax
optimal rate of convergence for both the parameter estimation error and the excess
mis-clustering error in a wide range of regimes. Moreover, we generalize our
approach to tackle the problem of transfer learning for GMMs, where similar
theoretical results are derived. Finally, we demonstrate the effectiveness of
our methods through simulations and a real data analysis. To the best of our
knowledge, this is the first work studying multi-task and transfer learning on
GMMs with theoretical guarantees.
Comment: 149 pages, 7 figures, 2 tables
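For context, the classic single-task EM iteration for a two-component 1-D GMM, which procedures like the one above build on, can be sketched as follows (a minimal illustrative sketch; the function name and interface are assumptions, and the paper's multi-task sharing step is not shown):

```python
import numpy as np

def em_gmm_1d(x, n_iter=100):
    """Fit a two-component 1-D Gaussian mixture with EM.

    Hypothetical minimal single-task sketch; the multi-task procedure
    additionally shares parameter information across tasks.
    """
    # Initialize means at the data extremes, variances at the overall variance.
    mu = np.array([x.min(), x.max()], dtype=float)
    sigma = np.array([x.std(), x.std()], dtype=float)
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point.
        dens = pi * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) \
               / (sigma * np.sqrt(2 * np.pi))
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted maximum-likelihood updates.
        nk = resp.sum(axis=0)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
        pi = nk / len(x)
    return mu, sigma, pi
```

On well-separated synthetic data this recovers the component means and weights; the multi-task version would regularize the per-task `mu` estimates toward each other.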
Learning from Similar Linear Representations: Adaptivity, Minimaxity, and Robustness
Representation multi-task learning (MTL) and transfer learning (TL) have
achieved tremendous success in practice. However, the theoretical understanding
of these methods is still lacking. Most existing theoretical works focus on
cases where all tasks share the same representation, and claim that MTL and TL
almost always improve performance. However, as the number of tasks grows,
assuming that all tasks share the same representation becomes unrealistic. This assumption also
does not always match empirical findings, which suggest that a shared
representation may not necessarily improve single-task or target-only learning
performance. In this paper, we aim to understand how to learn from tasks with
\textit{similar but not exactly the same} linear representations, while dealing
with outlier tasks. We propose two algorithms that are \textit{adaptive} to the
similarity structure and \textit{robust} to outlier tasks under both MTL and TL
settings. Our algorithms outperform single-task or target-only learning when
representations across tasks are sufficiently similar and the fraction of
outlier tasks is small. Furthermore, they always perform no worse than
single-task learning or target-only learning, even when the representations are
dissimilar. We provide information-theoretic lower bounds to show that our
algorithms are nearly \textit{minimax} optimal in a large regime.
Comment: 60 pages, 5 figures
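For intuition, a naive way to pool \textit{exactly} shared linear representations is to stack per-task coefficient estimates and take their top singular subspace; adaptive, outlier-robust algorithms like those described above refine this idea. A minimal sketch (the function name and setup are assumptions, not the authors' method):

```python
import numpy as np

def shared_subspace(betas, r):
    """Estimate a rank-r shared column space from per-task coefficient
    vectors `betas` (a list of d-dimensional arrays) via SVD.

    Hypothetical pooling baseline: it assumes all tasks share the
    representation exactly and is not robust to outlier tasks.
    """
    B = np.column_stack(betas)                  # d x T matrix of coefficients
    U, s, _ = np.linalg.svd(B, full_matrices=False)
    return U[:, :r]                             # top-r left singular vectors
```

When the per-task coefficients truly lie near a common r-dimensional subspace, the returned basis spans it well; with dissimilar or outlier tasks this baseline degrades, which is the regime the adaptive algorithms target.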
Unsupervised Federated Learning: A Federated Gradient EM Algorithm for Heterogeneous Mixture Models with Robustness against Adversarial Attacks
While supervised federated learning approaches have enjoyed significant
success, the domain of unsupervised federated learning remains relatively
underexplored. In this paper, we introduce a novel federated gradient EM
algorithm designed for the unsupervised learning of mixture models with
heterogeneous mixture proportions across tasks. We begin with a comprehensive
finite-sample theory that holds for general mixture models, then apply this
general theory to Gaussian Mixture Models (GMMs) and Mixtures of Regressions
(MoRs) to characterize the explicit estimation error of model parameters and
mixture proportions. Our proposed federated gradient EM algorithm demonstrates
several key advantages: adaptability to unknown task similarity, resilience
against adversarial attacks on a small fraction of data sources, protection of
local data privacy, and computational and communication efficiency.
Comment: 43 pages, 1 figure
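The division of labor in such a scheme, where shared component means are updated from averaged local gradient-EM steps while client-specific mixture proportions stay local, can be sketched as follows (a hypothetical minimal version assuming two components with unit variances; not the paper's exact algorithm):

```python
import numpy as np

def local_grad_em_step(x, mu, pi_local, lr=0.5):
    """One local gradient-EM step on a client's data.

    `mu` holds the two shared component means; `pi_local` holds this
    client's mixing weights. Unit variances are assumed, so the Gaussian
    normalizing constant cancels in the responsibilities.
    """
    dens = pi_local * np.exp(-0.5 * (x[:, None] - mu) ** 2)
    resp = dens / dens.sum(axis=1, keepdims=True)    # E-step
    grad = (resp * (x[:, None] - mu)).mean(axis=0)   # gradient w.r.t. mu
    pi_new = resp.mean(axis=0)                       # local proportions update
    return mu + lr * grad, pi_new

def federated_round(datasets, mu, pis):
    """Server averages the clients' proposed means; proportions stay local."""
    updates = [local_grad_em_step(x, mu, p) for x, p in zip(datasets, pis)]
    mu_new = np.mean([u[0] for u in updates], axis=0)
    return mu_new, [u[1] for u in updates]
```

Only the proposed means cross the network, which illustrates the privacy and communication-efficiency points; a robust variant would replace the plain server average with, e.g., a trimmed mean to resist adversarial clients.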
Effect of Samarium doping on the nucleation of fcc-Aluminum in undercooled liquids
The effect of Sm doping on fcc-Al nucleation was investigated in Al-Sm
liquids with low Sm concentrations (xSm) using molecular dynamics simulations.
Nucleation in the moderately undercooled liquid is achieved by the recently
developed persistent-embryo method. By systematically computing the nucleation
rate at different xSm (xSm = 0%, 1%, 2%, 3%, 5%) at 700 K, we found that the Sm dopant
reduces the nucleation rate by up to 25 orders of magnitude with only a 5%
doping concentration. This effect is mostly associated with the increase in the
free energy barrier with a minor contribution from suppression of the
attachment to the nucleus caused by Sm doping.
Comment: 4 figures
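The reported link between the rate drop and the barrier increase follows from classical nucleation theory, where the steady-state nucleation rate depends exponentially on the free-energy barrier (a standard textbook relation, not reproduced from the paper):

```latex
% Classical nucleation theory: J_0 is a kinetic prefactor,
% \Delta G^* the free-energy barrier, k_B T the thermal energy.
J = J_0 \exp\!\left(-\frac{\Delta G^*}{k_B T}\right),
\qquad
\log_{10}\frac{J(0)}{J(x_{\mathrm{Sm}})}
  = \frac{\Delta G^*(x_{\mathrm{Sm}}) - \Delta G^*(0)}{k_B T \ln 10}
```

The second expression holds when $J_0$ is unchanged; the minor attachment-suppression effect mentioned above enters through $J_0$ instead.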
Fair or Not: Effects of Gamification Elements on Crowdsourcing Participation
Fairness perceptions have been found to be a critical driver of solvers’ engagement in crowdsourcing. However, the literature offers little guidance on how to design crowdsourcing platforms that enhance solvers’ fairness perceptions. By integrating organizational justice theory with the gamification literature, we conceptualize solvers’ perceptions of two typical gamification elements: the point-rewarding perception and the feedback-giving perception. We develop a model to explain the effects of these gamification perceptions on both distributive and interpersonal justice perceptions, which are conducive to solvers’ participation. Based on a survey of 295 solvers, we apply the partial least squares-structural equation modeling (PLS-SEM) approach to test the research model. Results show that both the point-rewarding perception and the feedback-giving perception enhance distributive and interpersonal justice perceptions, which, in turn, foster solvers’ crowdsourcing participation. Theoretical contributions and practical implications are discussed.