In medical research, individual-level patient data provide invaluable
information, but patients' right to confidentiality remains paramount. This
poses a major challenge when estimating statistical models such as linear
mixed models, an extension of linear regression models that
can account for potential heterogeneity whenever data come from different data
providers. Federated learning algorithms tackle this hurdle by estimating
parameters without retrieving individual-level data. Instead, iterative
communication of parameter-estimate updates between the data providers and
the analyst is required. In this paper, we propose an alternative framework to
federated learning algorithms for fitting linear mixed models. Specifically,
our approach requires the mean, covariance, and sample size of multiple
covariates from each data provider only once. Using the principle of
statistical sufficiency within the likelihood framework as theoretical
support, the proposed framework yields estimates identical to those derived
from actual individual-level data. We demonstrate this approach through real
data on 15 068 patient records from 70 clinics at the Children's Hospital of
Philadelphia (CHOP). Assuming that each clinic shares summary statistics only
once, we model the COVID-19 PCR test cycle threshold as a function of patient
information. Simplicity, communication efficiency, and a wider scope of
implementation across statistical software distinguish our approach from
existing strategies in the literature.
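
The sufficiency idea can be illustrated with a minimal sketch. This is not the proposed linear mixed model procedure; it is a simplified fixed-effects (ordinary least squares) analogue on simulated data, and the function names `summarize` and `gram_from_summary` are hypothetical. It assumes each provider shares the sample size, the column means, and the sample covariance (with denominator n - 1) of its variables exactly once, and checks that the pooled estimates reconstructed from these summaries match those from the stacked individual-level data.

```python
import numpy as np

def summarize(Z):
    """Summaries a provider would share once: n, column means, sample covariance."""
    return Z.shape[0], Z.mean(axis=0), np.cov(Z, rowvar=False, ddof=1)

def gram_from_summary(n, mean, cov):
    """Rebuild the cross-product matrix of [1, Z] from (n, mean, cov)."""
    p = mean.shape[0]
    G = np.empty((p + 1, p + 1))
    G[0, 0] = n                                   # 1'1
    G[0, 1:] = G[1:, 0] = n * mean                # 1'Z
    G[1:, 1:] = (n - 1) * cov + n * np.outer(mean, mean)  # Z'Z
    return G

# Simulated data for three hypothetical providers; columns are [x1, x2, y].
rng = np.random.default_rng(0)
providers = []
for n in (40, 55, 70):
    X = rng.normal(size=(n, 2))
    y = 1.0 + X @ np.array([0.5, -2.0]) + rng.normal(scale=0.3, size=n)
    providers.append(np.column_stack([X, y]))

# Analyst side: only the summaries are used to build the pooled cross-products.
G = sum(gram_from_summary(*summarize(Z)) for Z in providers)
# Rows/columns of G are ordered [1, x1, x2, y]; solve the normal equations.
beta_summary = np.linalg.solve(G[:3, :3], G[:3, 3])

# Reference fit on the stacked individual-level data, for comparison only.
Z_all = np.vstack(providers)
D = np.column_stack([np.ones(len(Z_all)), Z_all[:, :2]])
beta_raw, *_ = np.linalg.lstsq(D, Z_all[:, 2], rcond=None)

print(np.allclose(beta_summary, beta_raw))
```

The two coefficient vectors agree up to floating-point error because the cross-products entering the Gaussian likelihood are exact functions of the shared summaries.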