We present a rigorous methodology for auditing differentially private machine
learning algorithms by adding multiple carefully designed examples called
canaries. We take a first-principles approach based on three key components.
First, we introduce Lifted Differential Privacy (LiDP) that expands the
definition of differential privacy to handle randomized datasets. This gives us
the freedom to design randomized canaries. Second, we audit LiDP by trying to
distinguish between the model trained with K canaries versus K−1 canaries
in the dataset, leaving one canary out. By drawing the canaries i.i.d., LiDP
can leverage the symmetry in the design and reuse each privately trained model
to run multiple statistical tests, one for each canary. Third, we introduce
novel confidence intervals that take advantage of the multiple test statistics
by adapting to the empirical higher-order correlations. Together, this new
recipe demonstrates significant improvements in sample complexity, both
theoretically and empirically, using synthetic and real data. Further, recent
advances in designing stronger canaries can be readily incorporated into the
new framework.
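
To make the leave-one-out test with i.i.d. canaries concrete, the following is a minimal sketch under strong simplifying assumptions, not the procedure evaluated in this work. A Gaussian-noised sum of unit vectors stands in for private training, canaries are drawn uniformly from the unit sphere, and each test statistic is the inner product between a canary and the released vector. All function names and parameters (`noisy_sum`, `canary_hit_rates`, `plug_in_epsilon`, `threshold`) are hypothetical. The final step is a plain plug-in estimate based on the standard inequality P(A(D1) in R) <= e^eps * P(A(D0) in R) + delta; it deliberately omits the confidence intervals, so it is not a valid high-confidence lower bound.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw one canary uniformly from the unit sphere (assumed canary design)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def noisy_sum(vectors, dim, sigma, rng):
    """Toy stand-in for private training: Gaussian mechanism on a sum of vectors."""
    out = [rng.gauss(0.0, sigma) for _ in range(dim)]
    for v in vectors:
        for i, x in enumerate(v):
            out[i] += x
    return out


def canary_hit_rates(num_trials, K, dim, sigma, threshold, seed=0):
    """Return the fraction of test statistics above `threshold` with and without the canary."""
    rng = random.Random(seed)
    hits_in, hits_out = 0, 0
    for _ in range(num_trials):
        canaries = [random_unit_vector(dim, rng) for _ in range(K)]
        # Output computed with all K canaries: reuse it for K tests, one per canary.
        theta_in = noisy_sum(canaries, dim, sigma, rng)
        hits_in += sum(dot(c, theta_in) > threshold for c in canaries)
        # Output computed with K-1 canaries: test it against fresh i.i.d. canaries
        # that were never inserted.
        theta_out = noisy_sum(canaries[:-1], dim, sigma, rng)
        fresh = [random_unit_vector(dim, rng) for _ in range(K)]
        hits_out += sum(dot(c, theta_out) > threshold for c in fresh)
    total = num_trials * K
    return hits_in / total, hits_out / total


def plug_in_epsilon(rate_in, rate_out, delta=0.0):
    """Plug-in estimate of log((rate_in - delta) / rate_out), clipped at zero."""
    if rate_in <= delta:
        return 0.0
    if rate_out == 0.0:
        return math.inf
    return max(0.0, math.log((rate_in - delta) / rate_out))


rate_in, rate_out = canary_hit_rates(
    num_trials=200, K=4, dim=50, sigma=0.5, threshold=0.5
)
eps_estimate = plug_in_epsilon(rate_in, rate_out, delta=1e-5)
```

Each trial yields K test statistics per released vector instead of one, which is where the reuse of a trained model comes from. Those K statistics share the same noise draw and are therefore correlated, so treating them as independent samples when forming a confidence interval would be unjustified; this sketch sidesteps the issue by reporting only a point estimate.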