Providing Accurate Models across Private Partitioned Data: Secure Maximum Likelihood Estimation
This paper focuses on the privacy paradigm of providing access to researchers
to remotely carry out analyses on sensitive data stored behind firewalls. We
address the situation where the analysis demands data from multiple physically
separate databases which cannot be combined. Motivating this problem are
analyses using multiple data sources that currently are only possible through
extensive work creating a trusted user network. We develop and demonstrate a
method for accurate calculation of the multivariate normal likelihood equation
for a set of parameters given the partitioned data, which can then be maximized
to obtain estimates. These estimates are achieved without sharing any data or
any true intermediate statistics of the data across firewalls. We show that
under a certain set of assumptions our method for estimation across these
partitions achieves results identical to estimation with the full data. Privacy
is maintained by adding noise at each partition. This ensures each party
receives noisy statistics, such that the noise cannot be removed until the last
step to obtain a single value, the true total log-likelihood. Potential
applications include all methods utilizing parameter estimation through
maximizing the multivariate normal likelihood equation. We give detailed
algorithms, along with available software, and both a real data example and
simulations estimating structural equation models (SEMs) with partitioned data.
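
The idea of noise that is removed only at the final step can be illustrated with a minimal sketch. This is not the paper's algorithm. It assumes, purely for illustration, that the rows are split across sites and are independent, so the total log-likelihood is the sum of the per-site values, and that the noise terms are masks constructed to sum to zero. The abstract specifies neither the partitioning scheme nor the noise mechanism, and all function names and data below are hypothetical.

```python
import math
import random


def bivariate_normal_loglik(rows, mean, cov):
    """Sum of bivariate normal log-densities over rows (closed-form 2x2 inverse)."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    total = 0.0
    for x, y in rows:
        dx, dy = x - mean[0], y - mean[1]
        # Quadratic form with inverse covariance [[d, -b], [-c, a]] / det
        quad = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
        total += -math.log(2 * math.pi) - 0.5 * math.log(det) - 0.5 * quad
    return total


def zero_sum_masks(n_parties, rng, scale=1e3):
    """Illustrative noise: random masks that cancel when all are summed.

    Assumption: a real protocol would derive these without any single
    party knowing all of them; here they are generated in one place.
    """
    masks = [rng.uniform(-scale, scale) for _ in range(n_parties - 1)]
    masks.append(-sum(masks))
    return masks


def masked_total_loglik(partitions, mean, cov, rng):
    """Each site shares only local log-likelihood + mask; the sum is exact."""
    local = [bivariate_normal_loglik(rows, mean, cov) for rows in partitions]
    masks = zero_sum_masks(len(partitions), rng)
    shared = [value + mask for value, mask in zip(local, masks)]
    return sum(shared), shared


# Hypothetical toy data held at three separate sites.
rng = random.Random(0)
partitions = [
    [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(5)] for _ in range(3)
]
mean = (0.0, 0.0)
cov = ((1.0, 0.2), (0.2, 1.0))

total, shared = masked_total_loglik(partitions, mean, cov, rng)
pooled = bivariate_normal_loglik(
    [row for rows in partitions for row in rows], mean, cov
)
print(abs(total - pooled) < 1e-6)
```

In this sketch each shared value is dominated by its mask, while the sum of the shared values matches the pooled log-likelihood up to floating-point error. An optimizer could call `masked_total_loglik` repeatedly with candidate parameters to maximize the likelihood.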